llama.cpp

Commit Graph

Author	SHA1	Message	Date
ddh0	660a3b275f	Merge branch 'ggml-org:master' into power-law-sampler	2026-01-02 17:03:45 -06:00
HelloKS	f4f5019254	model: add Solar Open model (#18511) * model: add Solar-Open model * vocab: add solar-open to end eog blacklist * model: add proper llm type * chat: basic template for solar open * typo: fix comment about vocab * convert: sugested changes * convert: suggested changes * chat: change reasoning end tag for solar-open * llama-chat: add solar-open template	2026-01-01 18:01:43 +01:00
Anri Lombard	4cd162a123	chat: make tool description and parameters optional per OpenAI spec (#18478) * chat: make tool description and parameters optional per OpenAI spec Per the OpenAI API specification, both 'description' and 'parameters' fields in tool function definitions are optional. Previously, the parser would throw an exception if these fields were missing. Attempts to fix #17667 * refactor: use value() for cleaner optional field access	2025-12-31 17:21:37 -06:00
ddh0	eb854e73d5	minor style fixes cont.	2025-12-30 15:54:23 -06:00
ddh0	2d67b1c008	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-30 13:44:42 -06:00
Aldehir Rojas	0f89d2ecf1	common : default content to an empty string (#18485) * common : default content to an empty string * common : fix tests that break when content != null	2025-12-30 12:00:57 -06:00
Xuan-Son Nguyen	cd78e57c3a	lora: count lora nodes in graph_max_nodes (#18469) * lora: count lora nodes in graph_max_nodes * 3 nodes per weight * 4 nodes * keep track n_lora_nodes from llama_model * fix assert * rm redundant header * common: load adapters before context creation * use 6 nodes	2025-12-30 15:53:12 +01:00
ddh0	05d7dc9e9a	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-29 20:14:39 -06:00
o7si	daa242dfc8	common: fix return value check for setpriority (#18412) * common: fix return value check for setpriority * tools: add logging for process priority setting	2025-12-29 11:07:49 +02:00
ddh0	f0d3f13124	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-28 20:12:16 -06:00
o7si	60f17f56da	rpc: fix segfault on invalid endpoint format (#18387) * rpc: fix segfault on invalid endpoint format * rpc: add error log for failed endpoint connection	2025-12-28 12:34:41 +02:00
Johannes Gäßler	026d2ad472	llama: fix magic number of 999 for GPU layers (#18266) * llama: fix magic number of 999 for GPU layers * use strings for -ngl, -ngld * enacapsulate n_gpu_layers, split_mode	2025-12-27 20:18:35 +01:00
ddh0	b95b0884dd	update `power-law` -> `adaptive-p`	2025-12-27 02:10:20 -06:00
ddh0	ed2890e691	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-25 17:32:29 -06:00
Xuan-Son Nguyen	f5acfb2ffa	server: (router) add stop-timeout option (#18350) * server: (router) add stop-timeout option * also allow stop while loading * add docs * unload_lru: also wait for unload to complete	2025-12-24 23:47:49 +01:00
ddh0	10355dc7d0	common: add `LLAMA_ARG_OVERRIDE_TENSOR` env var for `-ot` arg (#18267)	2025-12-24 14:19:12 +08:00
ddh0	6bad4aef77	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-22 14:45:08 -06:00
Johannes Gäßler	147a521636	tool/ex/tests: consistently free ctx, then model (#18168)	2025-12-22 11:00:37 +01:00
ddh0	89ebdf00c2	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-20 22:36:35 -06:00
Aldehir Rojas	9496bbb808	common : reorganize includes to prioritize vendored deps (#18222)	2025-12-20 21:43:21 -06:00
Xuan-Son Nguyen	ddcb75dd8a	server: add auto-sleep after N seconds of idle (#18228) * implement sleeping at queue level * implement server-context suspend * add test * add docs * optimization: add fast path * make sure to free llama_init * nits * fix use-after-free * allow /models to be accessed during sleeping, fix use-after-free * don't allow accessing /models during sleep, it is not thread-safe * fix data race on accessing props and model_meta * small clean up * trailing whitespace * rm outdated comments	2025-12-21 02:24:42 +01:00
Xuan-Son Nguyen	9e39a1e6a9	server: support load model on startup, support preset-only options (#18206) * server: support autoload model, support preset-only options * add docs * load-on-startup * fix * Update common/arg.cpp Co-authored-by: Pascal <admin@serveurperso.com> --------- Co-authored-by: Pascal <admin@serveurperso.com>	2025-12-20 09:25:27 +01:00
ddh0	f4703d422c	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-19 17:53:19 -06:00
Pascal	14931a826e	arg: fix order to use short form before long form (#18196) * arg: fix order to use short form before long form * arg: update doc * arg: update test-arg-parser * arg: address review feedback from ngxson simplified to check first.length() <= last.length() only fixed: --sampler-seq, --rerank, --draft ordering note: middle positions in 3+ arg sets are not verified * arg: update doc	2025-12-19 18:01:56 +01:00
Xuan-Son Nguyen	98c1c7a7bf	presets: refactor, allow cascade presets from different sources, add global section (#18169) * presets: refactor, allow cascade presets from different sources * update docs * fix neg arg handling * fix empty mmproj * also filter out server-controlled args before to_ini() * skip loading custom_models if not specified * fix unset_reserved_args * fix crash on windows	2025-12-19 12:08:20 +01:00
Xuan-Son Nguyen	8ea958d4d9	model : add ASR support for LFM2-Audio-1.5B (conformer) (#18106) * ASR with LFM2-Audio-1.5B * Set rope_theta * Fix comment * Remove rope_theta setting * Address PR feedback * rename functions to conformer * remove some redundant ggml_cont * fix missing tensor * add prefix "a." for conv tensors * remove redundant reshape * clean up * add test model --------- Co-authored-by: Tarek Dakhran <tarek@liquid.ai>	2025-12-19 00:18:01 +01:00
ddh0	dedbe36735	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-18 14:59:16 -06:00
Xuan-Son Nguyen	4d1316c440	arg: fix ASAN error on sampler_type_names empty (#18167)	2025-12-18 14:30:32 +01:00
ddh0	60235724cf	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-17 22:07:22 -06:00
ddh0	775299892e	add `use_power_law` flag + logic, minor cleanup	2025-12-17 15:06:05 -06:00
Pascal	6ce3d85796	server: (webui) add --webui-config (#18028) * server/webui: add server-side WebUI config support Add CLI arguments --webui-config (inline JSON) and --webui-config-file (file path) to configure WebUI default settings from server side. Backend changes: - Parse JSON once in server_context::load_model() for performance - Cache parsed config in webui_settings member (zero overhead on /props) - Add proper error handling in router mode with try/catch - Expose webui_settings in /props endpoint for both router and child modes Frontend changes: - Add 14 configurable WebUI settings via parameter sync - Add tests for webui settings extraction - Fix subpath support with base path in API calls Addresses feedback from @ngxson and @ggerganov * server: address review feedback from ngxson * server: regenerate README with llama-gen-docs	2025-12-17 21:45:45 +01:00
Georgi Gerganov	4301e27319	common : restore grammar-based rejection sampling (#18137) * common : restart grammar-based rejection sampling * sampling : allow null samplers	2025-12-17 19:46:00 +02:00
Johannes Gäßler	a2c199e479	common: clarify instructions for bug reports (#18134)	2025-12-17 18:44:13 +01:00
Pascal	487674fbb3	common: fix --override-kv to support comma-separated values (#18056) * common: fix --override-kv to support comma-separated values * Update common/arg.cpp Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> * common: deprecate repeated arguments, suggest comma-separated values * common: add comma escape support for --override-kv * common: optimize duplicate detection with insert().second Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com> * common: migrate all repeated args to comma-separated syntax --------- Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com> Co-authored-by: personalmountains <46615898+personalmountains@users.noreply.github.com>	2025-12-17 11:36:23 +02:00
TrevorS	4b2a4778f8	arg: allow -kvu flag for llama-perplexity (#18117) The -kvu (--kv-unified) flag is required for hellaswag and winogrande benchmarks which use coupled sequences. Without unified KV cache, these benchmarks fail with: split_equal: sequential split is not supported when there are coupled sequences in the input batch (you may need to use the -kvu flag) This change adds LLAMA_EXAMPLE_PERPLEXITY to the allowed examples for the -kvu argument, enabling its use with llama-perplexity.	2025-12-17 08:33:02 +02:00
ddh0	58aa1c6f5a	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-16 13:33:03 -06:00
Xuan-Son Nguyen	7b1db3d3b7	arg: clarify auto kvu/np being set on server (#17997) * arg: clarify auto kvu/np being set on server * improve docs * use invalid_argument	2025-12-16 12:01:27 +01:00
Aldehir Rojas	c05aa69f32	common : add nemotron 3 parsing (#18077) * common : expose json-schema functionality to extract type info * common : fix peg parser negation during needs_more_input * common : add some defensive measures in constructed peg parser * common : add nemotron nano 3 support * common : add nemotron nano 3 tests * remove debug line	2025-12-16 04:05:23 -06:00
ddh0	6e66095e1f	Merge branch 'ggml-org:master' into power-law-sampler	2025-12-15 09:07:13 -06:00
Johannes Gäßler	b1f3a6e5db	llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) * llama: automatically fit args to free memory llama-fit-params tool * fix CI * hints for bug reports, ensure no reallocation * fix segfault with Vulkan * add llama-fit-params to CI * fix CI * fix CI * fix CI * minor adjustments * fix assignment of 1 dense layer * fix logger not being reset on model load failure * remove --n-gpu-layer hint on model load failure * fix llama-fit-params verbosity * fix edge case * fix typo [no ci]	2025-12-15 09:24:59 +01:00
ddh0	1c58e9a96a	add power law to the new `samplers` vector	2025-12-14 22:32:27 -06:00
ddh0	68543257e9	update default decay to 0.9	2025-12-14 22:03:17 -06:00
ddh0	f5d08724e7	fix bad merge my git skills are lacking	2025-12-14 21:51:59 -06:00
ddh0	36b526d768	Merge branch 'master' into power-law-sampler	2025-12-14 15:43:49 -06:00
Xuan-Son Nguyen	52392291b2	preset: handle negated arg, reverse the meaning if needed (#18041)	2025-12-14 22:08:10 +01:00
ddh0	667b70fdac	update default decay	2025-12-14 03:41:28 -06:00
ddh0	ec54fe5f14	no, but does this?	2025-12-14 02:54:14 -06:00
Georgi Gerganov	254098a279	common : refactor common_sampler + grammar logic changes (#17937) * common : refactor common_sampler + grammar logic changes * tests : increase max_tokens to get needed response * batched : fix uninitialized samplers	2025-12-14 10:11:13 +02:00
ddh0	d1e5c60442	add missing values to `common_params_sampling::print()`	2025-12-13 23:26:03 -06:00
ddh0	965bcc9dc4	fix leftover `window_size`	2025-12-13 22:19:15 -06:00

706 Commits