llama.cpp/tools
Latest commit: d9a23126bf by Daniel Bevenius, 2026-02-16 08:43:29 +01:00

common : extract replay_last_token to common.h

This commit extracts the replay_last_token function from save-load-state.cpp
to common.h. The motivation is both to allow reuse of the function and to
clarify the intent of the code that replays the last token after loading the
session state.
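For context, the sketch below illustrates what "replaying the last token after loading the session state" generally means in llama.cpp: re-decoding the final restored prompt token so the context holds fresh logits for sampling to continue from. This is a rough illustration only, assuming the current llama_memory_* / llama_batch_get_one API; the function name replay_last_token_sketch is hypothetical, and the actual replay_last_token added to common.h is not shown on this page and may have a different signature.

```cpp
#include "llama.h"

#include <vector>

// Hypothetical helper, not the one from common.h: after a session has been
// restored (e.g. via llama_state_load_file), decode the last prompt token
// again so that its logits are recomputed and sampling can continue.
static bool replay_last_token_sketch(llama_context * ctx, std::vector<llama_token> & tokens) {
    if (tokens.empty()) {
        return false;
    }

    const llama_pos last_pos = (llama_pos) tokens.size() - 1;

    // drop the last position from the restored KV cache so the token is
    // re-decoded at the same position instead of being appended again
    llama_memory_seq_rm(llama_get_memory(ctx), 0, last_pos, -1);

    // decode just that one token; its logits become available to the sampler
    llama_batch batch = llama_batch_get_one(&tokens.back(), 1);

    return llama_decode(ctx, batch) == 0;
}
```

A caller would typically invoke something like this right after llama_state_load_file() has refilled the token list from the session file, before entering the generation loop.
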
batched-bench      tool/ex/tests: consistently free ctx, then model (#18168)  2025-12-22 11:00:37 +01:00
cli                support --verbose-prompt (#19576)  2026-02-13 12:49:10 +01:00
completion         common : extract replay_last_token to common.h  2026-02-16 08:43:29 +01:00
cvector-generator  docs : Minor cleanups (#19252)  2026-02-02 08:38:55 +02:00
export-lora        docs : Minor cleanups (#19252)  2026-02-02 08:38:55 +02:00
fit-params         llama-fit-params: keep explicit --ctx-size 0 (#19070)  2026-01-24 22:13:08 +01:00
gguf-split         cli: new CLI experience (#17824)  2025-12-10 15:28:59 +01:00
imatrix            common : refactor common_sampler + grammar logic changes (#17937)  2025-12-14 10:11:13 +02:00
llama-bench        Setting mmap and direct_io to false as default in llama-bench.cpp (#18841)  2026-01-16 09:46:51 +01:00
mtmd               mtmd : Add Nemotron Nano 12B v2 VL support (#19547)  2026-02-14 14:07:00 +01:00
perplexity         docs : Minor cleanups (#19252)  2026-02-02 08:38:55 +02:00
quantize           llama-quantize : cleanup `--help` output (#19317)  2026-02-08 09:22:38 +02:00
rpc                NetBSD build support (#19589)  2026-02-14 09:47:01 +01:00
server             build : remove LLAMA_HTTPLIB option (#19623)  2026-02-15 15:38:50 +01:00
tokenize           cmake : Do not install tools on iOS targets (#15903)  2025-09-16 09:54:44 +07:00
tts                model : fix wavtokenizer embedding notions (#19479)  2026-02-11 07:52:20 +02:00
CMakeLists.txt     cmake: only build cli when server is enabled (#18670)  2026-01-09 16:43:26 +01:00