llama.cpp


Pascal 1c9085c9a6 webui: fix UI freeze at high token rates with RAF yield	2026-02-13 14:00:06 +01:00

The markdown coalescing loop was processing chunks back-to-back without yielding to the browser's paint cycle. At high token rates (250+ tok/s), this caused complete UI freeze as the main thread was perpetually busy.

Add a requestAnimationFrame yield between processing batches. This allows the browser to paint at screen FPS regardless of token throughput. Chunks arriving during the yield are coalesced and processed together, so we skip intermediate states and jump straight to the latest content.

Before: Chunk->process->Chunk->process->... (browser never paints = freeze)
After: Chunk->process->[RAF]->coalesced chunks->process->[RAF]->... (screen FPS)

Tested with 250 tok/s streams on 50K+ token contexts: smooth scrolling and responsive UI throughout.
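The yield-and-coalesce pattern described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the webui's actual code: `CoalescingRenderer`, `FrameScheduler`, `push`, and `processBatch` are hypothetical names, and the frame scheduler is injected so the sketch can also run outside a browser.

```typescript
// Hypothetical sketch; none of these names come from the llama.cpp webui source.

// Schedules a callback for the next paint opportunity.
// In a browser this would wrap requestAnimationFrame.
type FrameScheduler = (cb: () => void) => void;

class CoalescingRenderer {
  private pending: string[] = [];   // chunks that arrived since the last batch
  private waitingForFrame = false;  // true while a frame yield is outstanding
  private content = "";             // everything rendered so far

  constructor(
    private readonly render: (content: string) => void,
    private readonly schedule: FrameScheduler,
  ) {}

  // Called once per streamed chunk.
  push(chunk: string): void {
    this.pending.push(chunk);
    // If a yield is outstanding, the chunk just waits and is coalesced later.
    if (!this.waitingForFrame) {
      this.processBatch();
    }
  }

  private processBatch(): void {
    if (this.pending.length === 0) {
      // Nothing arrived during the yield: go idle until the next push.
      this.waitingForFrame = false;
      return;
    }
    // Coalesce every queued chunk into one update, skipping intermediate states.
    this.content += this.pending.join("");
    this.pending = [];
    this.render(this.content);

    // Yield so the browser can paint before the next batch is processed.
    this.waitingForFrame = true;
    this.schedule(() => this.processBatch());
  }
}
```

In a browser, the scheduler argument would presumably be `(cb) => requestAnimationFrame(cb)`, which caps rendering at one batch per frame however fast chunks arrive. Injecting the scheduler is a choice made for this sketch only, so the flow can be driven manually in a test.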
batched-bench	tool/ex/tests: consistently free ctx, then model (#18168)	2025-12-22 11:00:37 +01:00
cli	support --verbose-prompt (#19576)	2026-02-13 12:49:10 +01:00
completion	completion : simplify batch (embd) processing (#19286)	2026-02-04 05:43:28 +01:00
cvector-generator	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
export-lora	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
fit-params	llama-fit-params: keep explicit --ctx-size 0 (#19070)	2026-01-24 22:13:08 +01:00
gguf-split	cli: new CLI experience (#17824)	2025-12-10 15:28:59 +01:00
imatrix	common : refactor common_sampler + grammar logic changes (#17937)	2025-12-14 10:11:13 +02:00
llama-bench	Setting mmap and direct_io to false as default in llama-bench.cpp (#18841)	2026-01-16 09:46:51 +01:00
mtmd	model: Add Kimi-K2.5 support (#19170)	2026-02-11 16:47:30 +01:00
perplexity	docs : Minor cleanups (#19252)	2026-02-02 08:38:55 +02:00
quantize	llama-quantize : cleanup `--help` output (#19317)	2026-02-08 09:22:38 +02:00
rpc	rpc : update from common.cpp (#19400)	2026-02-08 09:06:45 +01:00
server	webui: fix UI freeze at high token rates with RAF yield	2026-02-13 14:00:06 +01:00
tokenize	cmake : Do not install tools on iOS targets (#15903)	2025-09-16 09:54:44 +07:00
tts	model : fix wavtokenizer embedding notions (#19479)	2026-02-11 07:52:20 +02:00
CMakeLists.txt	cmake: only build cli when server is enabled (#18670)	2026-01-09 16:43:26 +01:00