The markdown coalescing loop was processing chunks back-to-back without
yielding to the browser's paint cycle. At high token rates (250+ tok/s),
this caused a complete UI freeze because the main thread was perpetually busy.
Add a requestAnimationFrame yield between processing batches. This allows
the browser to paint at screen FPS regardless of token throughput. Chunks
arriving during the yield are coalesced and processed together, so we
skip intermediate states and jump straight to the latest content.
Before: Chunk->process->Chunk->process->... (browser never paints = freeze)
After: Chunk->process->[RAF]->coalesced chunks->process->[RAF]->... (screen FPS)
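A minimal TypeScript sketch of this yielding loop, for illustration only: `enqueueChunk`, `drain`, `rafYield`, and `processChunks` are made-up names, not the identifiers used in the actual code.

```ts
// `processChunks` stands in for the real markdown processing step.
declare function processChunks(batch: string): void;

/** Resolve on the next animation frame so the browser can paint. */
function rafYield(): Promise<number> {
  return new Promise((resolve) => requestAnimationFrame(resolve));
}

const pending: string[] = [];
let running = false;

/** Called for every incoming streaming chunk. */
export function enqueueChunk(chunk: string): void {
  pending.push(chunk);
  if (!running) void drain();
}

async function drain(): Promise<void> {
  running = true;
  while (pending.length > 0) {
    // Coalesce everything that arrived since the last frame and process
    // it as a single batch, skipping intermediate states.
    const batch = pending.splice(0, pending.length).join('');
    processChunks(batch);

    // Yield to the paint cycle; chunks arriving meanwhile queue up in
    // `pending` and are coalesced on the next iteration.
    await rafYield();
  }
  running = false;
}
```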
Tested with 250 tok/s streams on 50K+ token contexts: smooth scrolling
and responsive UI throughout.
Replace full AST re-transformation with a per-block caching strategy.
Previously, each streaming chunk triggered processor.run() on the entire
document (12 rehype/remark plugins including KaTeX and highlight.js).
Now each MDAST node is transformed individually and the result is cached by
a position hash. In append-only streaming mode, stable blocks are reused
directly from the cache; only the unstable trailing block is re-transformed.
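A hedged sketch of the cache lookup path: the helper names `getMdastNodeHash` and `transformMdastNode` come from the change itself, but their signatures, the `blockCache` map, and the `runPlugins` callback are assumptions for illustration (the `isAppendMode()` check that gates this path is omitted here).

```ts
import type { RootContent } from 'mdast';

// Illustrative cache keyed by a position-based fingerprint of each node.
const blockCache = new Map<string, unknown>();

/** Fingerprint an MDAST node by its type and source offsets. */
function getMdastNodeHash(node: RootContent): string {
  const start = node.position?.start.offset ?? -1;
  const end = node.position?.end.offset ?? -1;
  return `${node.type}:${start}:${end}`;
}

/**
 * Transform a single node, reusing the cached result when available.
 * Stable blocks keep their offsets while streaming appends text, so they
 * hit the cache; the growing trailing block gets a new hash and misses.
 */
function transformMdastNode(
  node: RootContent,
  runPlugins: (node: RootContent) => unknown
): unknown {
  const key = getMdastNodeHash(node);
  const cached = blockCache.get(key);
  if (cached !== undefined) return cached;

  const result = runPlugins(node);
  blockCache.set(key, result);
  return result;
}
```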
- Add SvelteMap FIFO cache (5000 blocks, evicts oldest 1000 on overflow; see the eviction sketch after this list)
- Add getMdastNodeHash() for MDAST node fingerprinting by position
- Add isAppendMode() to detect streaming append patterns
- Add transformMdastNode() for single-node transformation with cache lookup
- Remove stringifyProcessedNode() (dead code after refactor)
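A sketch of the FIFO eviction policy, assuming Svelte 5's SvelteMap from svelte/reactivity (which preserves insertion order like a plain Map); the limits are the ones stated above, while `cacheSet` and the cache instance are illustrative names.

```ts
import { SvelteMap } from 'svelte/reactivity';

// Insertion order doubles as age order, so evicting "oldest first" is
// just deleting the first N keys.
const MAX_BLOCKS = 5000;
const EVICT_COUNT = 1000;

const cache = new SvelteMap<string, unknown>();

function cacheSet(key: string, value: unknown): void {
  cache.set(key, value);
  if (cache.size > MAX_BLOCKS) {
    // Evict the 1000 oldest entries in one pass on overflow.
    let removed = 0;
    for (const oldestKey of cache.keys()) {
      cache.delete(oldestKey);
      if (++removed >= EVICT_COUNT) break;
    }
  }
}
```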
Reduces per-chunk streaming cost from O(N) block transforms to O(1) for stable blocks.
Targets 200K token contexts without UI degradation on mobile devices.