llama.cpp/tools
Pascal 0dbaeaf6c7 webui: incremental MDAST transform caching for streaming performance
Replace full AST re-transformation with per-block caching strategy.
Previously, each streaming chunk triggered processor.run() on the entire
document (12 rehype/remark plugins including KaTeX and highlight.js).

Now transforms individual MDAST nodes and caches results by position hash.
In append-only streaming mode, stable blocks are reused directly from cache,
only the unstable trailing block is re-transformed.

- Add SvelteMap FIFO cache (5000 blocks, evicts oldest 1000 on overflow)
- Add getMdastNodeHash() for MDAST node fingerprinting by position
- Add isAppendMode() to detect streaming append patterns
- Add transformMdastNode() for single-node transformation with cache lookup
- Remove stringifyProcessedNode() (dead code after refactor)

Reduces streaming complexity from O(N × transforms) to O(1) for stable blocks.
Targets 200K token contexts without UI degradation on mobile devices.
2026-02-01 19:44:16 +01:00
..
batched-bench tool/ex/tests: consistently free ctx, then model (#18168) 2025-12-22 11:00:37 +01:00
cli common : use two decimal places for float arg help messages (#19048) 2026-01-25 07:31:42 +01:00
completion completion : fix prompt cache for recurrent models (#19045) 2026-01-25 09:12:50 +02:00
cvector-generator common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
export-lora cmake : Do not install tools on iOS targets (#15903) 2025-09-16 09:54:44 +07:00
fit-params llama-fit-params: keep explicit --ctx-size 0 (#19070) 2026-01-24 22:13:08 +01:00
gguf-split cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
imatrix common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
llama-bench Setting mmap and direct_io to false as default in llama-bench.cpp (#18841) 2026-01-16 09:46:51 +01:00
mtmd mtmd : update docs to use llama_model_n_embd_inp (#18999) 2026-01-22 14:36:32 +01:00
perplexity common : refactor common_sampler + grammar logic changes (#17937) 2025-12-14 10:11:13 +02:00
quantize quantize: prevent input/output file collision (#18451) 2025-12-31 23:29:03 +08:00
rpc Install rpc-server when GGML_RPC is ON. (#17149) 2025-11-11 10:53:59 +00:00
server webui: incremental MDAST transform caching for streaming performance 2026-02-01 19:44:16 +01:00
tokenize cmake : Do not install tools on iOS targets (#15903) 2025-09-16 09:54:44 +07:00
tts refactor : remove libcurl, use OpenSSL when available (#18828) 2026-01-14 18:02:47 +01:00
CMakeLists.txt cmake: only build cli when server is enabled (#18670) 2026-01-09 16:43:26 +01:00