llama.cpp/tools
Daniel Bevenius 44bddc0a89
completion : add replaying of session state
This commit updates the session handling in the completion tool to handle
the fact that logits are no longer stored in the session file. Instead, we
need to replay the last token to get the logits for sampling.
2026-02-06 12:46:55 +01:00
..
batched-bench tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
cli common : use two decimal places for float arg help messages (#19048) 2026-01-25 07:31:42 +01:00
completion completion : add replaying of session state 2026-02-06 12:46:55 +01:00
cvector-generator docs : Minor cleanups (#19252) 2026-02-02 08:38:55 +02:00
export-lora docs : Minor cleanups (#19252) 2026-02-02 08:38:55 +02:00
fit-params llama-fit-params: keep explicit --ctx-size 0 (#19070) 2026-01-24 22:13:08 +01:00
gguf-split cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
imatrix common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
llama-bench Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) 2026-01-16 09:46:51 +01:00
mtmd mtmd: add min/max pixels gguf metadata (#19273) 2026-02-02 20:59:06 +01:00
perplexity docs : Minor cleanups (#19252) 2026-02-02 08:38:55 +02:00
quantize quantize: add option --tensor-type-file to llama-quantize (#18572) 2026-01-31 11:39:21 +08:00
rpc Install rpc-server when GGML_RPC is ON. (#17149) 2025-11-11 10:53:59 +00:00
server server: print actual model name in 'model not found" error (#19117) 2026-02-02 16:55:27 +01:00
tokenize cmake : Do not install tools on iOS targets (#15903) 2025-09-16 09:54:44 +07:00
tts refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
CMakeLists.txt cmake: only build cli when server is enabled (#18670) 2026-01-09 16:43:26 +01:00