llama.cpp

Pascal e7140051b7 webui: incremental MDAST transform caching for streaming performance

Replace full AST re-transformation with per-block caching strategy. Previously, each streaming chunk triggered processor.run() on the entire document (12 rehype/remark plugins including KaTeX and highlight.js). Now transforms individual MDAST nodes and caches results by position hash. In append-only streaming mode, stable blocks are reused directly from cache, only the unstable trailing block is re-transformed.

- Add SvelteMap FIFO cache (5000 blocks, evicts oldest 1000 on overflow)
- Add getMdastNodeHash() for MDAST node fingerprinting by position
- Add isAppendMode() to detect streaming append patterns
- Add transformMdastNode() for single-node transformation with cache lookup
- Remove stringifyProcessedNode() (dead code after refactor)

Reduces streaming complexity from O(N × transforms) to O(1) for stable blocks. Targets 200K token contexts without UI degradation on mobile devices.

2026-02-13 14:00:06 +01:00
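The caching strategy described in this commit message can be illustrated with a minimal sketch. It is not the code from the commit: the function names are taken from the message, but every signature and body below is an assumption, a plain Map stands in for SvelteMap, the node type is a simplified stand-in for real MDAST nodes, and expensiveTransform() and renderBlocks() are hypothetical helpers standing in for the remark/rehype pipeline.

```typescript
// Illustrative sketch only. Names follow the commit message; signatures and
// bodies are assumptions, not the actual webui implementation.

interface MdastNode {
  type: string;
  position: { start: { offset: number }; end: { offset: number } };
  value: string; // simplified: real MDAST nodes usually carry children
}

const MAX_CACHE = 5000;   // capacity stated in the commit message
const EVICT_COUNT = 1000; // oldest entries dropped on overflow

// A plain Map stands in for SvelteMap; Map iterates in insertion order,
// which is what makes FIFO eviction possible.
const cache = new Map<string, string>();

// Hypothetical fingerprint built from node type and source offsets.
function getMdastNodeHash(node: MdastNode): string {
  return `${node.type}:${node.position.start.offset}-${node.position.end.offset}`;
}

// Insert into the cache, evicting the oldest EVICT_COUNT entries when full.
function cacheSet(key: string, html: string): void {
  if (cache.size >= MAX_CACHE) {
    let removed = 0;
    for (const oldest of cache.keys()) {
      if (removed++ >= EVICT_COUNT) break;
      cache.delete(oldest);
    }
  }
  cache.set(key, html);
}

// Streaming append: the new text starts with the previous text.
function isAppendMode(prev: string, next: string): boolean {
  return next.length >= prev.length && next.startsWith(prev);
}

// Hypothetical stand-in for the expensive plugin pipeline.
let transformCalls = 0;
function expensiveTransform(node: MdastNode): string {
  transformCalls++;
  return `<p>${node.value}</p>`;
}

// Transform one node; stable nodes are looked up in the cache first.
function transformMdastNode(node: MdastNode, stable: boolean): string {
  if (!stable) return expensiveTransform(node);
  const key = getMdastNodeHash(node);
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const out = expensiveTransform(node);
  cacheSet(key, out);
  return out;
}

// Render all top-level blocks. Every block except the trailing one is
// treated as stable. Clearing the cache outside append mode is an
// assumption of this sketch: positions no longer identify content then.
function renderBlocks(nodes: MdastNode[], appendMode: boolean): string[] {
  if (!appendMode) cache.clear();
  return nodes.map((n, i) => transformMdastNode(n, i < nodes.length - 1));
}
```

In this sketch, a second call to renderBlocks() with the same leading blocks and a longer trailing block re-runs the transform only for the trailing block. That is the sense in which the per-chunk cost for stable blocks no longer depends on document length.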
batched-bench	tool/ex/tests: consistently free ctx, then model (#18168)	2025-12-22 11:00:37 +01:00
cli	support --verbose-prompt (#19576)	2026-02-13 12:49:10 +01:00
completion	completion : simplify batch (embd) processing (#19286)	2026-02-04 05:43:28 +01:00
cvector-generator	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
export-lora	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
fit-params	llama-fit-params: keep explicit --ctx-size 0 (#19070)	2026-01-24 22:13:08 +01:00
gguf-split	cli: new CLI experience (#17824)	2025-12-10 15:28:59 +01:00
imatrix	common : refactor common_sampler + grammar logic changes (#17937)	2025-12-14 10:11:13 +02:00
llama-bench	Setting mmap and direct_io to false as default in llama-bench.cpp (#18841)	2026-01-16 09:46:51 +01:00
mtmd	model: Add Kimi-K2.5 support (#19170)	2026-02-11 16:47:30 +01:00
perplexity	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
quantize	llama-quantize : cleanup `--help` output (#19317)	2026-02-08 09:22:38 +02:00
rpc	rpc : update from common.cpp (#19400)	2026-02-08 09:06:45 +01:00
server	webui: incremental MDAST transform caching for streaming performance	2026-02-13 14:00:06 +01:00
tokenize	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
tts	model : fix wavtokenizer embedding notions (#19479)	2026-02-11 07:52:20 +02:00
CMakeLists.txt	cmake: only build cli when server is enabled (#18670)	2026-01-09 16:43:26 +01:00