llama.cpp/tools
Daniel Bevenius 79f7d4351d
completion : add replaying of session state
This commit updates the session handling in the completion tool to
account for the fact that logits are no longer stored in the session
file. Instead, we need to replay the last token to get the logits for
sampling.
2026-02-16 08:43:29 +01:00
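The replay step described above can be sketched with the public llama.cpp API. This is a minimal illustration, not the actual tool code: `restore_session_and_replay` is a hypothetical helper, and it assumes a recent API where `llama_memory_seq_rm` evicts the last cached position so that decoding the final token again repopulates the logits.

```cpp
#include <cstddef>
#include <vector>

#include "llama.h"

// Hypothetical helper: restore a saved session, then re-decode the last
// token so llama_get_logits(ctx) is valid for the next sampling call.
static bool restore_session_and_replay(llama_context * ctx,
                                       const char * path_session,
                                       std::vector<llama_token> & tokens) {
    tokens.resize(llama_n_ctx(ctx));
    size_t n_loaded = 0;
    if (!llama_state_load_file(ctx, path_session,
                               tokens.data(), tokens.size(), &n_loaded)) {
        return false; // no session file or incompatible state
    }
    tokens.resize(n_loaded);
    if (n_loaded == 0) {
        return true; // empty session, nothing to replay
    }

    // Logits are not part of the saved state, so replay the last token:
    // drop it from the KV cache, then decode it once more.
    llama_memory_seq_rm(llama_get_memory(ctx), 0, (llama_pos) n_loaded - 1, -1);
    llama_batch batch = llama_batch_get_one(&tokens[n_loaded - 1], 1);
    if (llama_decode(ctx, batch) != 0) {
        return false;
    }
    // llama_get_logits(ctx) can now feed the sampler.
    return true;
}
```

Replaying a single token is cheap compared to re-evaluating the whole prompt, which is the point of keeping session files small by omitting the logits.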
directory          | last commit                                                                | date
batched-bench      | tool/ex/tests: consistently free ctx, then model (#18168)                  | 2025-12-22 11:00:37 +01:00
cli                | support --verbose-prompt (#19576)                                          | 2026-02-13 12:49:10 +01:00
completion         | completion : add replaying of session state                                | 2026-02-16 08:43:29 +01:00
cvector-generator  | docs : Minor cleanups (#19252)                                              | 2026-02-02 08:38:55 +02:00
export-lora        | docs : Minor cleanups (#19252)                                              | 2026-02-02 08:38:55 +02:00
fit-params         | llama-fit-params: keep explicit --ctx-size 0 (#19070)                      | 2026-01-24 22:13:08 +01:00
gguf-split         | cli: new CLI experience (#17824)                                            | 2025-12-10 15:28:59 +01:00
imatrix            | common : refactor common_sampler + grammar logic changes (#17937)          | 2025-12-14 10:11:13 +02:00
llama-bench        | Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) | 2026-01-16 09:46:51 +01:00
mtmd               | mtmd : Add Nemotron Nano 12B v2 VL support (#19547)                        | 2026-02-14 14:07:00 +01:00
perplexity         | docs : Minor cleanups (#19252)                                              | 2026-02-02 08:38:55 +02:00
quantize           | llama-quantize : cleanup `--help` output (#19317)                          | 2026-02-08 09:22:38 +02:00
rpc                | NetBSD build support (#19589)                                              | 2026-02-14 09:47:01 +01:00
server             | build : remove LLAMA_HTTPLIB option (#19623)                               | 2026-02-15 15:38:50 +01:00
tokenize           | cmake : Do not install tools on iOS targets (#15903)                       | 2025-09-16 09:54:44 +07:00
tts                | model : fix wavtokenizer embedding notions (#19479)                        | 2026-02-11 07:52:20 +02:00
CMakeLists.txt     | cmake: only build cli when server is enabled (#18670)                      | 2026-01-09 16:43:26 +01:00