llama.cpp

Commit Graph

Author	SHA1	Message	Date
Xuan Son Nguyen	adad34f64d	add filter_statement	2025-12-28 22:02:22 +01:00
Xuan Son Nguyen	9a8a45ff3b	mostly works	2025-12-28 21:32:55 +01:00
Xuan Son Nguyen	45df0c91e7	testing more templates	2025-12-28 19:50:09 +01:00
Xuan Son Nguyen	db09a7468d	fix negate test	2025-12-28 19:07:01 +01:00
Xuan Son Nguyen	acb0effa25	allow print source on exception	2025-12-28 18:45:41 +01:00
Xuan Son Nguyen	64e29a5848	add mk_stmt	2025-12-28 17:48:14 +01:00
Xuan Son Nguyen	7f17608ea4	use shared_ptr for values	2025-12-28 17:46:25 +01:00
Xuan Son Nguyen	4331e9c8e9	keyword arguments and slicing array	2025-12-28 17:23:29 +01:00
Xuan Son Nguyen	45c194622e	support binded functions	2025-12-28 15:33:14 +01:00
Xuan Son Nguyen	4ca114b095	track input string even after transformations	2025-12-28 12:48:35 +01:00
Xuan Son Nguyen	81310d29c1	render gemma tmpl ok	2025-12-28 12:04:23 +01:00
Xuan Son Nguyen	10835f2720	eval with is_user_input	2025-12-27 23:25:20 +01:00
Xuan Son Nguyen	c08f4ddf01	use mk_val	2025-12-27 22:28:54 +01:00
Xuan Son Nguyen	da7bbe5813	wip	2025-12-27 22:25:19 +01:00
Xuan Son Nguyen	7ed11f78f9	add more builtins	2025-12-27 22:10:45 +01:00
Xuan Son Nguyen	15b3dbab05	add string builtins	2025-12-27 21:52:50 +01:00
Xuan Son Nguyen	5a041e65b8	fix map object	2025-12-27 20:38:06 +01:00
Xuan Son Nguyen	d8ef00e610	bin ops works!	2025-12-27 20:16:46 +01:00
Xuan Son Nguyen	8d1e9a0d12	shadow naming	2025-12-27 16:06:23 +01:00
Xuan Son Nguyen	7ad6eb39ca	binary_expression::execute	2025-12-27 16:00:07 +01:00
Xuan Son Nguyen	8cea1ed6b0	parser ok	2025-12-27 12:55:01 +01:00
Xuan Son Nguyen	7ac8e98b28	clean up	2025-12-27 12:35:19 +01:00
Xuan Son Nguyen	a6e0ae7a85	demo	2025-12-27 12:22:34 +01:00
Xuan Son Nguyen	a35fcb00b5	add vm types	2025-12-27 12:12:07 +01:00
Xuan Son Nguyen	15b7c50e95	lexer	2025-12-25 21:08:51 +01:00
Xuan Son Nguyen	8d8030142e	jinja vm	2025-12-25 00:19:23 +01:00
ddh0	10355dc7d0	common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267)	2025-12-24 14:19:12 +08:00
Johannes Gäßler	147a521636	tool/ex/tests: consistently free ctx, then model (#18168)	2025-12-22 11:00:37 +01:00
Aldehir Rojas	9496bbb808	common : reorganize includes to prioritize vendored deps (#18222)	2025-12-20 21:43:21 -06:00
Xuan-Son Nguyen	ddcb75dd8a	server: add auto-sleep after N seconds of idle (#18228) * implement sleeping at queue level * implement server-context suspend * add test * add docs * optimization: add fast path * make sure to free llama_init * nits * fix use-after-free * allow /models to be accessed during sleeping, fix use-after-free * don't allow accessing /models during sleep, it is not thread-safe * fix data race on accessing props and model_meta * small clean up * trailing whitespace * rm outdated comments	2025-12-21 02:24:42 +01:00
Xuan-Son Nguyen	9e39a1e6a9	server: support load model on startup, support preset-only options (#18206) * server: support autoload model, support preset-only options * add docs * load-on-startup * fix * Update common/arg.cpp Co-authored-by: Pascal <admin@serveurperso.com> --------- Co-authored-by: Pascal <admin@serveurperso.com>	2025-12-20 09:25:27 +01:00
Pascal	14931a826e	arg: fix order to use short form before long form (#18196) * arg: fix order to use short form before long form * arg: update doc * arg: update test-arg-parser * arg: address review feedback from ngxson simplified to check first.length() <= last.length() only fixed: --sampler-seq, --rerank, --draft ordering note: middle positions in 3+ arg sets are not verified * arg: update doc	2025-12-19 18:01:56 +01:00
Xuan-Son Nguyen	98c1c7a7bf	presets: refactor, allow cascade presets from different sources, add global section (#18169) * presets: refactor, allow cascade presets from different sources * update docs * fix neg arg handling * fix empty mmproj * also filter out server-controlled args before to_ini() * skip loading custom_models if not specified * fix unset_reserved_args * fix crash on windows	2025-12-19 12:08:20 +01:00
Xuan-Son Nguyen	8ea958d4d9	model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106) * ASR with LFM2-Audio-1.5B * Set rope_theta * Fix comment * Remove rope_theta setting * Address PR feedback * rename functions to conformer * remove some redundant ggml_cont * fix missing tensor * add prefix "a." for conv tensors * remove redundant reshape * clean up * add test model --------- Co-authored-by: Tarek Dakhran <tarek@liquid.ai>	2025-12-19 00:18:01 +01:00
Xuan-Son Nguyen	4d1316c440	arg: fix ASAN error on sampler_type_names empty (#18167)	2025-12-18 14:30:32 +01:00
Pascal	6ce3d85796	server: (webui) add --webui-config (#18028) * server/webui: add server-side WebUI config support Add CLI arguments --webui-config (inline JSON) and --webui-config-file (file path) to configure WebUI default settings from server side. Backend changes: - Parse JSON once in server_context::load_model() for performance - Cache parsed config in webui_settings member (zero overhead on /props) - Add proper error handling in router mode with try/catch - Expose webui_settings in /props endpoint for both router and child modes Frontend changes: - Add 14 configurable WebUI settings via parameter sync - Add tests for webui settings extraction - Fix subpath support with base path in API calls Addresses feedback from @ngxson and @ggerganov * server: address review feedback from ngxson * server: regenerate README with llama-gen-docs	2025-12-17 21:45:45 +01:00
Georgi Gerganov	4301e27319	common : restore grammar-based rejection sampling (#18137) * common : restart grammar-based rejection sampling * sampling : allow null samplers	2025-12-17 19:46:00 +02:00
Johannes Gäßler	a2c199e479	common: clarify instructions for bug reports (#18134)	2025-12-17 18:44:13 +01:00
Pascal	487674fbb3	common: fix --override-kv to support comma-separated values (#18056) * common: fix --override-kv to support comma-separated values * Update common/arg.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * common: deprecate repeated arguments, suggest comma-separated values * common: add comma escape support for --override-kv * common: optimize duplicate detection with insert().second Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com> * common: migrate all repeated args to comma-separated syntax --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com>	2025-12-17 11:36:23 +02:00
TrevorS	4b2a4778f8	arg: allow -kvu flag for llama-perplexity (#18117) The -kvu (--kv-unified) flag is required for hellaswag and winogrande benchmarks which use coupled sequences. Without unified KV cache, these benchmarks fail with: split_equal: sequential split is not supported when there are coupled sequences in the input batch (you may need to use the -kvu flag) This change adds LLAMA_EXAMPLE_PERPLEXITY to the allowed examples for the -kvu argument, enabling its use with llama-perplexity.	2025-12-17 08:33:02 +02:00
Xuan-Son Nguyen	7b1db3d3b7	arg: clarify auto kvu/np being set on server (#17997) * arg: clarify auto kvu/np being set on server * improve docs * use invalid_argument	2025-12-16 12:01:27 +01:00
Aldehir Rojas	c05aa69f32	common : add nemotron 3 parsing (#18077) * common : expose json-schema functionality to extract type info * common : fix peg parser negation during needs_more_input * common : add some defensive measures in constructed peg parser * common : add nemotron nano 3 support * common : add nemotron nano 3 tests * remove debug line	2025-12-16 04:05:23 -06:00
Johannes Gäßler	b1f3a6e5db	llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) * llama: automatically fit args to free memory llama-fit-params tool * fix CI * hints for bug reports, ensure no reallocation * fix segfault with Vulkan * add llama-fit-params to CI * fix CI * fix CI * fix CI * minor adjustments * fix assignment of 1 dense layer * fix logger not being reset on model load failure * remove --n-gpu-layer hint on model load failure * fix llama-fit-params verbosity * fix edge case * fix typo [no ci]	2025-12-15 09:24:59 +01:00
Xuan-Son Nguyen	52392291b2	preset: handle negated arg, reverse the meaning if needed (#18041)	2025-12-14 22:08:10 +01:00
Georgi Gerganov	254098a279	common : refactor common_sampler + grammar logic changes (#17937) * common : refactor common_sampler + grammar logic changes * tests : increase max_tokens to get needed response * batched : fix uninitialized samplers	2025-12-14 10:11:13 +02:00
Xuan-Son Nguyen	4d5ae24c0a	arg: fix common_params_parse not accepting negated arg (#17991)	2025-12-13 12:53:37 +01:00
Sigbjørn Skjæret	8e4d678528	common : skip model validation when --completion-bash is requested (#17975)	2025-12-13 08:40:50 +01:00
Sigbjørn Skjæret	2bc94e7928	add llama-completion to completion-bash executables (#17976)	2025-12-13 08:35:50 +01:00
Xuan-Son Nguyen	380b4c984e	common: support negated args (#17919) * args: support negated args * update docs * fix typo * add more neg options * Apply suggestions from code review Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com> * rm duplicated arg * fix LLAMA_ARG_NO_HOST * add test --------- Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>	2025-12-12 23:58:53 +01:00
Xuan-Son Nguyen	54a0fee4b7	arg: add -mm and -mmu as short form of --mmproj and --mmproj-url (#17958) * arg: add -mm and -mmu as short form of --mmproj and --mmproj-url * correct order * update docs	2025-12-12 14:06:06 +01:00

688 Commits