llama.cpp/tools
Daniel Bevenius 79f7d4351d
completion : add replaying of session state
This commit updates the session handling in the completion tool to
account for the fact that logits are no longer stored in the session
file. Instead, we need to replay the last token to get the logits for
sampling.
2026-02-16 08:43:29 +01:00
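The replay step described above can be sketched with the public llama.cpp API. This is a minimal illustration, not the actual tool code: `restore_session_and_replay` is a hypothetical helper, and it assumes a recent API where `llama_memory_seq_rm` evicts the last cached position so that decoding the final token again repopulates the logits.

```cpp
#include <cstddef>
#include <vector>

#include "llama.h"

// Hypothetical helper: restore a saved session, then re-decode the last
// token so llama_get_logits(ctx) is valid for the next sampling call.
static bool restore_session_and_replay(llama_context * ctx,
                                       const char * path_session,
                                       std::vector<llama_token> & tokens) {
    tokens.resize(llama_n_ctx(ctx));
    size_t n_loaded = 0;
    if (!llama_state_load_file(ctx, path_session,
                               tokens.data(), tokens.size(), &n_loaded)) {
        return false; // no session file or incompatible state
    }
    tokens.resize(n_loaded);
    if (n_loaded == 0) {
        return true; // empty session, nothing to replay
    }

    // Logits are not part of the saved state, so replay the last token:
    // drop it from the KV cache, then decode it once more.
    llama_memory_seq_rm(llama_get_memory(ctx), 0, (llama_pos) n_loaded - 1, -1);
    llama_batch batch = llama_batch_get_one(&tokens[n_loaded - 1], 1);
    if (llama_decode(ctx, batch) != 0) {
        return false;
    }
    // llama_get_logits(ctx) can now feed the sampler.
    return true;
}
```

Replaying a single token is cheap compared to re-evaluating the whole prompt, which is the point of keeping session files small by omitting the logits.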
directory          | last commit                                                                | date
batched-bench      | tool/ex/tests: consistently free ctx, then model (#18168)                  | 2025-12-22 11:00:37 +01:00
cli                | support --verbose-prompt (#19576)                                          | 2026-02-13 12:49:10 +01:00
completion         | completion : add replaying of session state                                | 2026-02-16 08:43:29 +01:00
cvector-generator  | docs : Minor cleanups (#19252)                                              | 2026-02-02 08:38:55 +02:00
export-lora        | docs : Minor cleanups (#19252)                                              | 2026-02-02 08:38:55 +02:00
fit-params         | llama-fit-params: keep explicit --ctx-size 0 (#19070)                      | 2026-01-24 22:13:08 +01:00
gguf-split         | cli: new CLI experience (#17824)                                            | 2025-12-10 15:28:59 +01:00
imatrix            | common : refactor common_sampler + grammar logic changes (#17937)          | 2025-12-14 10:11:13 +02:00
llama-bench        | Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) | 2026-01-16 09:46:51 +01:00
mtmd               | mtmd : Add Nemotron Nano 12B v2 VL support (#19547)                        | 2026-02-14 14:07:00 +01:00
perplexity         | docs : Minor cleanups (#19252)                                              | 2026-02-02 08:38:55 +02:00
quantize           | llama-quantize : cleanup `--help` output (#19317)                          | 2026-02-08 09:22:38 +02:00
rpc                | NetBSD build support (#19589)                                              | 2026-02-14 09:47:01 +01:00
server             | build : remove LLAMA_HTTPLIB option (#19623)                               | 2026-02-15 15:38:50 +01:00
tokenize           | cmake : Do not install tools on iOS targets (#15903)                       | 2025-09-16 09:54:44 +07:00
tts                | model : fix wavtokenizer embedding notions (#19479)                        | 2026-02-11 07:52:20 +02:00
CMakeLists.txt     | cmake: only build cli when server is enabled (#18670)                      | 2026-01-09 16:43:26 +01:00