llama.cpp

Commit Graph

Author	SHA1	Message	Date
Xuan Son Nguyen	cbb37dd4cd	improve function args handling	2025-12-31 11:29:40 +01:00
Xuan Son Nguyen	1b213ae5e7	add placeholder for tojson	2025-12-30 21:52:47 +01:00
Xuan Son Nguyen	4479c382ce	demo: type inference	2025-12-30 17:26:23 +01:00
Xuan Son Nguyen	9c0fa6f810	rm workarounds	2025-12-30 16:07:23 +01:00
Xuan Son Nguyen	9e9a70f72f	more fixes	2025-12-29 15:07:18 +01:00
Xuan Son Nguyen	026730e8e3	more fix, more tests	2025-12-29 12:53:31 +01:00
Xuan Son Nguyen	1cf25734a9	more tests	2025-12-29 10:53:32 +01:00
Xuan Son Nguyen	2a31c9a30c	a lot of fixes	2025-12-29 00:38:29 +01:00
Xuan Son Nguyen	1784a57e7b	impl global_from_json	2025-12-28 23:15:48 +01:00
Xuan Son Nguyen	55fe96a9df	add jinja-value.cpp	2025-12-28 22:49:31 +01:00
Xuan Son Nguyen	c7f246e7a5	allow func to access ctx	2025-12-28 22:15:10 +01:00
Xuan Son Nguyen	adad34f64d	add filter_statement	2025-12-28 22:02:22 +01:00
Xuan Son Nguyen	9a8a45ff3b	mostly works	2025-12-28 21:32:55 +01:00
Xuan Son Nguyen	45df0c91e7	testing more templates	2025-12-28 19:50:09 +01:00
Xuan Son Nguyen	db09a7468d	fix negate test	2025-12-28 19:07:01 +01:00
Xuan Son Nguyen	acb0effa25	allow print source on exception	2025-12-28 18:45:41 +01:00
Xuan Son Nguyen	64e29a5848	add mk_stmt	2025-12-28 17:48:14 +01:00
Xuan Son Nguyen	7f17608ea4	use shared_ptr for values	2025-12-28 17:46:25 +01:00
Xuan Son Nguyen	4331e9c8e9	keyword arguments and slicing array	2025-12-28 17:23:29 +01:00
Xuan Son Nguyen	45c194622e	support binded functions	2025-12-28 15:33:14 +01:00
Xuan Son Nguyen	4ca114b095	track input string even after transformations	2025-12-28 12:48:35 +01:00
Xuan Son Nguyen	81310d29c1	render gemma tmpl ok	2025-12-28 12:04:23 +01:00
Xuan Son Nguyen	10835f2720	eval with is_user_input	2025-12-27 23:25:20 +01:00
Xuan Son Nguyen	c08f4ddf01	use mk_val	2025-12-27 22:28:54 +01:00
Xuan Son Nguyen	da7bbe5813	wip	2025-12-27 22:25:19 +01:00
Xuan Son Nguyen	7ed11f78f9	add more builtins	2025-12-27 22:10:45 +01:00
Xuan Son Nguyen	15b3dbab05	add string builtins	2025-12-27 21:52:50 +01:00
Xuan Son Nguyen	5a041e65b8	fix map object	2025-12-27 20:38:06 +01:00
Xuan Son Nguyen	d8ef00e610	bin ops works!	2025-12-27 20:16:46 +01:00
Xuan Son Nguyen	8d1e9a0d12	shadow naming	2025-12-27 16:06:23 +01:00
Xuan Son Nguyen	7ad6eb39ca	binary_expression::execute	2025-12-27 16:00:07 +01:00
Xuan Son Nguyen	8cea1ed6b0	parser ok	2025-12-27 12:55:01 +01:00
Xuan Son Nguyen	7ac8e98b28	clean up	2025-12-27 12:35:19 +01:00
Xuan Son Nguyen	a6e0ae7a85	demo	2025-12-27 12:22:34 +01:00
Xuan Son Nguyen	a35fcb00b5	add vm types	2025-12-27 12:12:07 +01:00
Xuan Son Nguyen	15b7c50e95	lexer	2025-12-25 21:08:51 +01:00
Xuan Son Nguyen	8d8030142e	jinja vm	2025-12-25 00:19:23 +01:00
Xuan-Son Nguyen	4cbafad4f0	model: support MiMo-V2-Flash (#18328) * mimov2: convert ok * rename mimov2 --> mimo2 * fix conversion * runnable not incorrect * use sink * add_sliding_window_pattern * add swa and per-layer n_head_kv * correct params * somewhat working * correct gating func * nits * mimo2: wire RMS eps + MoE bias + converter guards * add co-author Co-authored-by: Aaryan-Kapoor <Aaryan-Kapoor@users.noreply.github.com> * use add_rope_freq_base_swa --------- Co-authored-by: Aaryan Kapoor <aaryankapoor2006@gmail.com> Co-authored-by: Aaryan-Kapoor <Aaryan-Kapoor@users.noreply.github.com>	2025-12-24 23:07:08 +01:00
Aadeshveer Singh	c184284230	fit-params : fix race condition in fit-params output (#18276)	2025-12-24 15:57:38 +01:00
Aman Gupta	c8a2417d7b	CUDA: experimental native mxfp4 support for blackwell (#17906) * CUDA: experimental native mxfp4 support for blackwell * optimize load_tiles * optimize quantize_mxfp4 * cleanup * first pass review: formatting * use interleaved layout for mma * mmq: add assert for size * use __nv_fp4x4_e2m1 * use iter_k as 512, cleanup * Use 1200 as blackwell instead of 1000 * address review comments * mmq: fix stride * quantize.cu: use reference impl of e8m0 scale * address review comments * add 120f-virtual + minor fixes --------- Co-authored-by: Aman Gupta <aman>	2025-12-24 22:28:26 +08:00
Saba Fallah	54132f1b1f	model : support for LlamaBidirectionalModel architecture (#18220) * model: llama-embed-nemotron * minor: python lint * changed arch-name * templated llm_build_llama to be used for both llama and llama-embed arch	2025-12-24 14:02:36 +01:00
Jeff Bolz	2a9ea2020c	vulkan: fix command buffer corruption in ggml_backend_vk_event_wait (#18302)	2025-12-24 12:36:34 +01:00
Wang Weixuan	ce7a6dc0fc	CANN : refactor ACL graph cache (#17752) Move the graph property checking code into methods of LRU cache. Signed-off-by: Wang Weixuan <wangweixvan@gmail.com>	2025-12-24 17:50:24 +08:00
Jesse Ikonen	1ce0126b18	docs: Fix typos in SYCL documentation (#18269)	2025-12-24 17:19:47 +08:00
Ruben Ortlam	7f459c98e7	vulkan: use fewer FA rows for small cache runs (#18280)	2025-12-24 08:59:14 +01:00
TianHao324	cf2ffc02bc	CANN: Uses yarn_ramp cache in ROPE (#17725)	2025-12-24 14:55:33 +08:00
ddh0	10355dc7d0	common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267)	2025-12-24 14:19:12 +08:00
Xuan-Son Nguyen	5ee4e43f26	server: return_progress to also report 0% processing state (#18305)	2025-12-23 21:49:05 +01:00
Pascal	5b6c9bc0f3	webui: apply webui_settings on first load (#18223) * webui: apply webui_settings on first load The webui_settings from /props were not applied on initial load when default_generation_settings.params was null Now syncs whenever serverProps is available, regardless of params, works for both single-model and router modes * chore: update webui build output	2025-12-23 15:48:03 +01:00
Xuan-Son Nguyen	849d021104	server: fix crash with model not having BOS/EOS (#18321)	2025-12-23 14:39:36 +01:00

7571 Commits

All Branches