Aleksander Grygier
3120a9fc94
refactor: Cleanup
2026-02-03 14:48:45 +01:00
Aleksander Grygier
16f333e4ec
refactor: Cleanup
2026-02-03 14:42:42 +01:00
Aleksander Grygier
70efc41eb1
refactor: Cleanup
2026-02-03 14:27:39 +01:00
Aleksander Grygier
bb4253ae20
refactor: Cleanup
2026-02-03 14:27:39 +01:00
Aleksander Grygier
4383644951
refactor: Cleanup
2026-02-03 14:01:28 +01:00
Pascal
796fd1a62e
chore: update webui build output
2026-02-03 14:01:09 +01:00
Pascal
ec604a03e1
Update tools/server/webui/src/lib/components/app/chat/ChatAttachments/ChatAttachmentsList.svelte
...
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions/ChatFormActionAttachmentsDropdown.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions/ChatFormActions.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormActions/ChatFormActions.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormPromptPicker/ChatFormPromptPicker.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormPromptPicker/ChatFormPromptPicker.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatForm/ChatFormPromptPicker/ChatFormPromptPickerArgumentForm.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessages.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageStatistics.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/index.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/index.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/chat/index.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/content/CollapsibleContentBlock.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/content/CollapsibleContentBlock.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/content/MarkdownContent.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/content/MarkdownContent.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/content/MarkdownContent.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/dialogs/DialogMcpResources.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServerCard/McpServerCardDeleteDialog.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpCapabilitiesBadges.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpConnectionLogs.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpResourcePreview.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpResourcePreview.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpResourcePreview.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServerForm.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServerSelector.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServersSettings.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServersSettings.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServersSettings.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/mcp/McpServersSettings.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/misc/index.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/misc/TruncatedText.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/misc/TruncatedText.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/components/app/misc/TruncatedText.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
Update tools/server/webui/src/lib/services/mcp.service.ts
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2026-02-03 14:01:05 +01:00
Pascal
65e8bb6df4
chore: update webui build output
2026-02-02 12:02:14 +01:00
Pascal
4fca9bfe16
webui: add early exit for unchanged content in markdown processing
...
Skip redundant processing when coalesced chunks result in identical
content. During rapid streaming, multiple chunks may arrive and coalesce
into pendingMarkdown while processing is ongoing. When the final
coalesced content equals what was just processed, we can skip entirely.
Also clarify the RAF yield comment: the key insight is that chunks
arriving during the yield naturally coalesce, so we always render
the latest state without explicitly tracking what to skip.
2026-02-02 12:00:19 +01:00
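The early exit described in 4fca9bfe16 above can be sketched as a small guard around the processing step. This is an illustration only: `skipUnchanged` and its callback are hypothetical names, not identifiers from the webui source.

```typescript
// Hypothetical helper: wraps an expensive processing function so that a call
// with exactly the same content as the last processed one is skipped.
function skipUnchanged(
  process: (content: string) => void
): (content: string) => boolean {
  let lastProcessed: string | null = null;

  return (content: string): boolean => {
    // Early exit: the coalesced content equals what was just processed.
    if (content === lastProcessed) return false;

    process(content);
    lastProcessed = content;
    return true; // the content was actually processed
  };
}
```

The returned function reports whether work was done, so a caller can tell a skipped update from a processed one.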
Pascal
1d2ff059da
chore: update webui build output
2026-02-02 08:39:14 +01:00
Pascal
4642664c1a
webui: remove artificial cache limit, let GC handle cleanup on conversation change
2026-02-02 08:37:38 +01:00
Pascal
965655fafb
chore: update webui build output
2026-02-01 20:35:35 +01:00
Pascal
7953c18967
webui: fix UI freeze at high token rates with RAF yield
...
The markdown coalescing loop was processing chunks back-to-back without
yielding to the browser's paint cycle. At high token rates (250+ tok/s),
this caused complete UI freeze as the main thread was perpetually busy.
Add a requestAnimationFrame yield between processing batches. This allows
the browser to paint at screen FPS regardless of token throughput. Chunks
arriving during the yield are coalesced and processed together, so we
skip intermediate states and jump straight to the latest content.
Before: Chunk->process->Chunk->process->... (browser never paints = freeze)
After: Chunk->process->[RAF]->coalesced chunks->process->[RAF]->... (screen FPS)
Tested with 250 tok/s streams on 50K+ token contexts: smooth scrolling
and responsive UI throughout.
2026-02-01 20:34:08 +01:00
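A minimal sketch of the yield-and-coalesce loop described in 7953c18967 above, under stated assumptions: `CoalescingRenderer` and `FrameScheduler` are hypothetical names, and the scheduler is injected so the sketch does not depend on a browser. In a browser the scheduler would presumably be `(cb) => requestAnimationFrame(cb)`.

```typescript
// Hypothetical scheduler type: runs the callback at the next opportunity to paint.
type FrameScheduler = (callback: () => void) => void;

class CoalescingRenderer {
  private pending: string | null = null; // latest content not yet rendered
  private frameQueued = false;           // true while a scheduled step is outstanding
  private render: (content: string) => void;
  private schedule: FrameScheduler;

  constructor(render: (content: string) => void, schedule: FrameScheduler) {
    this.render = render;
    this.schedule = schedule;
  }

  // Called for every streamed chunk with the full content so far.
  push(content: string): void {
    this.pending = content; // newer content overwrites older, i.e. coalesces
    if (this.frameQueued) return; // the queued step will pick up the latest content
    this.step();
  }

  private step(): void {
    if (this.pending === null) {
      this.frameQueued = false; // nothing arrived during the yield; go idle
      return;
    }
    const latest = this.pending;
    this.pending = null;
    this.render(latest); // stand-in for the expensive markdown pass
    this.frameQueued = true;
    this.schedule(() => this.step()); // yield before processing the next batch
  }
}
```

Because chunks that arrive during the yield only overwrite `pending`, intermediate states are never rendered; each step renders whatever is latest at that moment.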
Pascal
2884ef46b3
chore: update webui build output
2026-02-01 19:45:54 +01:00
Pascal
0dbaeaf6c7
webui: incremental MDAST transform caching for streaming performance
...
Replace full AST re-transformation with per-block caching strategy.
Previously, each streaming chunk triggered processor.run() on the entire
document (12 rehype/remark plugins including KaTeX and highlight.js).
Now transforms individual MDAST nodes and caches results by position hash.
In append-only streaming mode, stable blocks are reused directly from cache;
only the unstable trailing block is re-transformed.
- Add SvelteMap FIFO cache (5000 blocks, evicts oldest 1000 on overflow)
- Add getMdastNodeHash() for MDAST node fingerprinting by position
- Add isAppendMode() to detect streaming append patterns
- Add transformMdastNode() for single-node transformation with cache lookup
- Remove stringifyProcessedNode() (dead code after refactor)
Reduces streaming complexity from O(N × transforms) to O(1) for stable blocks.
Targets 200K token contexts without UI degradation on mobile devices.
2026-02-01 19:44:16 +01:00
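The per-block caching idea in 0dbaeaf6c7 above can be sketched roughly as follows. This is not the webui code: `FifoCache`, `MdBlock`, `blockKey` and `transformBlocks` are hypothetical stand-ins (the commit names `getMdastNodeHash()` and `transformMdastNode()`), and keying on node type plus start and end offsets is an assumption about what "position hash" means. Note that commit 4642664c1a, listed earlier in this log, later removed the size limit.

```typescript
// FIFO cache: when full, drops the oldest `evictCount` entries in one batch.
class FifoCache<V> {
  private entries = new Map<string, V>(); // Map preserves insertion order
  private maxSize: number;
  private evictCount: number;

  constructor(maxSize = 5000, evictCount = 1000) {
    this.maxSize = maxSize;
    this.evictCount = evictCount;
  }

  get(key: string): V | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: V): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxSize) {
      const oldest = Array.from(this.entries.keys()).slice(0, this.evictCount);
      oldest.forEach((k) => this.entries.delete(k));
    }
    this.entries.set(key, value);
  }

  get size(): number {
    return this.entries.size;
  }
}

// Simplified stand-in for a top-level MDAST node with source offsets.
interface MdBlock {
  type: string;
  start: number;
  end: number;
  text: string;
}

// Assumed fingerprint: node type plus position in the source.
function blockKey(block: MdBlock): string {
  return block.type + ":" + block.start + "-" + block.end;
}

// Transform each block, reusing cached output for blocks whose key is unchanged.
function transformBlocks(
  blocks: MdBlock[],
  cache: FifoCache<string>,
  transform: (block: MdBlock) => string
): string[] {
  return blocks.map((block) => {
    const key = blockKey(block);
    const cached = cache.get(key);
    if (cached !== undefined) return cached;
    const result = transform(block);
    cache.set(key, result);
    return result;
  });
}
```

In an append-only stream the trailing block's end offset grows with each chunk, so its key changes and only that block misses the cache; earlier blocks keep their keys and are served from it.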
Pascal
1ab2e45684
chore: update webui build output
2026-02-01 12:10:06 +01:00
Pascal
82f6094aa2
feat: render images inline below attachment markers in tool results
...
Parse tool results line-by-line to display images immediately after their
[Attachment saved: xxx.png] markers. Fixes previous commit where all images
from all tool calls were shown in every section. Each tool call now displays
only its own images.
Uses Svelte derived for memoization to avoid re-parsing on every streaming
chunk. Parsing only occurs when section.toolResult or message.extra changes.
2026-02-01 12:06:25 +01:00
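A rough sketch of the line-by-line parsing described in 82f6094aa2 above. The marker format `[Attachment saved: xxx.png]` is taken from the commit message; everything else (`SavedAttachment`, `ToolResultPart`, `parseToolResult`, matching attachments by file name) is an assumption made for illustration and may differ from the actual webui code.

```typescript
// Hypothetical shape of an attachment stored alongside the message.
interface SavedAttachment {
  name: string;
  url: string;
}

type ToolResultPart =
  | { kind: "text"; text: string }
  | { kind: "image"; name: string; url: string };

// Matches a whole line such as "[Attachment saved: plot.png]".
const ATTACHMENT_MARKER = /^\[Attachment saved: (.+)\]$/;

function parseToolResult(
  toolResult: string,
  attachments: SavedAttachment[]
): ToolResultPart[] {
  const byName = new Map<string, SavedAttachment>();
  attachments.forEach((a) => byName.set(a.name, a));

  const parts: ToolResultPart[] = [];
  toolResult.split("\n").forEach((line) => {
    parts.push({ kind: "text", text: line });

    // If the line is a marker, emit the matching image right after it.
    const match = ATTACHMENT_MARKER.exec(line.trim());
    const name = match ? match[1] : undefined;
    const attachment = name !== undefined ? byName.get(name) : undefined;
    if (attachment) {
      parts.push({ kind: "image", name: attachment.name, url: attachment.url });
    }
  });
  return parts;
}
```

Since images are emitted only for markers that appear in the given tool result, each tool call section shows its own images even when the attachment list holds images from several calls.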
Pascal
be96423ae9
feat: render images below attachment markers in tool results
2026-02-01 04:56:21 +01:00
Pascal
5a4e4f4189
chore: update webui build output
2026-02-01 04:13:48 +01:00
Pascal
42244c0162
fix: also skip image attachments in message history for non-vision backends
2026-02-01 04:13:37 +01:00
Pascal
6b7e6f18a6
chore: update webui build output
2026-02-01 03:22:09 +01:00
Pascal
893dbb058a
fix: skip sending image attachments to non-vision backends
2026-02-01 03:20:36 +01:00
Pascal
556029eee6
chore: update webui build output
2026-01-31 08:27:11 +01:00
Pascal
1384352484
fix: responsive MCP server cards, prioritize server name over version
2026-01-31 08:22:41 +01:00
Pascal
1615b1c58c
fix: responsive MCP server cards for mobile viewports
2026-01-31 07:58:47 +01:00
Pascal
cd8e5741f2
chore: update webui build output
2026-01-30 20:23:45 +01:00
Pascal
b872838329
webui: adaptive model selector dropdown width
...
Make model selector dropdown responsive:
- Mobile: full width (w-full max-w-[100vw])
- Desktop: adapts to longest model name (sm:w-max)
- Replace TruncatedText with responsive span (truncate on mobile, full text on desktop via sm:overflow-visible sm:whitespace-nowrap)
- Center status icons in fixed 24px wrapper to prevent layout shifts
- Add sm:pr-2 padding between text and icon zone on desktop
Fixes dropdown cutting off long model names on desktop while maintaining full-width display on mobile with proper text truncation.
2026-01-30 20:21:05 +01:00
Aleksander Grygier
120ada3616
chore: update webui build output
2026-01-29 16:31:07 +01:00
Aleksander Grygier
e41f70bb47
refactor: Use CORS Proxy for favicons calls
2026-01-29 16:30:10 +01:00
Aleksander Grygier
46c5bca942
refactor: Proxy utility
2026-01-29 16:29:04 +01:00
Aleksander Grygier
944765138e
chore: update webui build output
2026-01-29 15:03:00 +01:00
Aleksander Grygier
536c6866e3
feat: Integrate with `llama-server` proxy + improve MCP Server Edit Form
2026-01-29 14:59:28 +01:00
Aleksander Grygier
406cb1dd99
Merge remote-tracking branch 'ngxson/xsn/cors_proxy_demo' into allozaur/mcp-mvp
2026-01-29 13:34:20 +01:00
Aleksander Grygier
9d6e210a5e
Merge remote-tracking branch 'ggml-org/master' into allozaur/mcp-mvp
2026-01-29 13:21:44 +01:00
Aleksander Grygier
7b00b46a6a
chore: update webui build output
2026-01-29 12:55:45 +01:00
Aleksander Grygier
6793c7daac
fix: Checking for capabilities from store
2026-01-29 12:45:10 +01:00
Aleksander Grygier
2aa704b821
refactor: Cleanup
2026-01-29 11:44:08 +01:00
yulo
f3dd7b8e68
HIP: add mmf for CDNA (#18896)
...
* refactor mmf rows_per_block
* speed up compile
* pass cdna compile
* fix cuda error
* clean up mmf
* f32 mmf
* clean float mma
* fix mmf error
* faster mmf
* extend tile k
* fix compile error
* Revert "extend tile k"
This reverts commit 4d2ef3d483.
* fix smem overflow
* speed up compiling mmf
* speed up compile for hip
* 512 block for cdna
* config pad size
* fix as comment
* update select logic
* move some code to cuh
* fix as comment
* correct cdna3 config
---------
Co-authored-by: zhang hui <you@example.com>
2026-01-29 11:10:53 +01:00
Georgi Gerganov
eed25bc6b0
arg : add -kvu to llama-batched-bench (#19172)
2026-01-29 08:50:47 +02:00
Vishal Singh
b33df266d0
ggml-zendnn : resolve ZenDNN backend cross-module symbol dependency (#19159)
2026-01-29 12:28:57 +08:00
Aman Gupta
3bcc990997
CUDA: refactor topk-moe to enable more models (GLM 4.7, Nemotron etc.) (#19126)
2026-01-29 10:31:28 +08:00
Neo Zhang
d4964a7c66
sycl: fix norm kernels: l2_norm, group_norm, rms_norm by remove assert to support more cases (#19154)
...
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2026-01-29 09:20:22 +08:00
Sigbjørn Skjæret
50e8962f79
ci : find latest release with asset for winget (#19161)
2026-01-28 22:05:39 +01:00
Aleksander Grygier
c7b7fc6c15
chore: update webui build output
2026-01-28 19:57:18 +01:00
Aleksander Grygier
d9e82b7c29
fix: Linter errors
2026-01-28 19:55:44 +01:00
Ruben Ortlam
f6b533d898
Vulkan Flash Attention Coopmat1 Refactor (#19075)
...
* vulkan: use coopmat for flash attention p*v matrix multiplication
* fix P loading issue
* fix barrier position
* remove reduction that is no longer needed
* move max thread reduction into loop
* remove osh padding
* add bounds checks and padding
* remove unused code
* fix shmem sizes, loop duration and accesses
* don't overwrite Qf, add new shared psh buffer instead
* add missing bounds checks
* use subgroup reductions
* optimize
* move bounds check, reduce barriers
* support other Bc values and other subgroup sizes
* remove D_split
* replace Of register array with shared memory Ofsh array
* parallelize HSV across the rowgroups
* go back to Of in registers, not shmem
* vectorize sfsh
* don't store entire K tile in shmem
* fixes
* load large k tiles to shmem on Nvidia
* adapt shared memory host check function to shader changes
* remove Bc 32 case
* remove unused variable
* fix missing mask reduction tmspsh barrier
* fix mask bounds check
* fix rowmax f16 under/overflow to inf
* fix flash_attn_cm2 BLOCK_SIZE preprocessor directives
2026-01-28 18:52:45 +01:00
Sascha Rogmann
72d3b1898a
spec : add self-speculative decoding (no draft model required) + refactor (#18471)
...
* server: introduce self-speculative decoding
* server: moved self-call into speculative.cpp
* can_speculate() includes self-speculation
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* server: can_speculate() tests self-spec
* server: replace can_speculate() with slot.can_speculate()
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* common: use %zu format specifier for size_t in logging
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* server: can_speculate() requires a task instance
* common: ngram map, config self-speculative decoding
* common: add enum common_speculative_type
* common: add vector of speculative states
* common: add option --spec-draftless
* server: cleanup (remove slot.batch_spec, rename)
* common: moved self-spec impl to ngram-map
* common: cleanup (use common_speculative_state_draft)
* spec : refactor
* cont : naming
* spec: remove --spec-config
* doc: (draftless) speculative decoding
* common: print performance in spec decoding
* minor : cleanup
* common : better names
* minor : cleanup + fix build
* minor: comments
* CODEOWNERS: add common/ngram-map.* (#18471)
* common : rename speculative.draftless_type -> speculative.type
* ngram-map : fix uninitialized values
* ngram-map : take into account the input can become shorter
* ngram-map : revert len check for now
* arg : change `--spec-draftless` -> `--spec-type`
* spec : add common_speculative_state::accept()
* spec : refactor + add common_speculative_begin()
* spec : fix begin() call with mtmd
* spec : additional refactor + remove common_speculative_params
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-28 19:42:42 +02:00
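The commit 72d3b1898a above mentions an n-gram map and decoding without a draft model but does not spell out the algorithm. One general way to draft tokens from the sequence itself is n-gram lookup: find an earlier occurrence of the most recent n tokens and propose whatever followed it. The sketch below illustrates that general idea only; `draftFromHistory` is a hypothetical name, and the linear backward scan is an assumption, not the llama.cpp implementation in common/ngram-map.

```typescript
// Propose up to `maxDraft` tokens by looking up the trailing n-gram in the
// tokens generated so far. Returns an empty array when there is no match.
function draftFromHistory(tokens: number[], n: number, maxDraft: number): number[] {
  if (n < 1 || tokens.length <= n) return [];

  const tailStart = tokens.length - n; // index where the trailing n-gram begins

  // Scan backwards so the most recent earlier occurrence wins.
  for (let i = tailStart - 1; i >= 0; i--) {
    let matches = true;
    for (let j = 0; j < n; j++) {
      if (tokens[i + j] !== tokens[tailStart + j]) {
        matches = false;
        break;
      }
    }
    if (matches) {
      // Draft the tokens that followed the earlier occurrence.
      return tokens.slice(i + n, i + n + maxDraft);
    }
  }
  return [];
}
```

For example, with tokens `[1, 2, 3, 4, 1, 2]`, `n = 2` and `maxDraft = 3`, the trailing pair `1, 2` also occurs at the start, so the draft is `[3, 4, 1]`. The drafted tokens would still have to be verified by the model, as in any speculative decoding scheme.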
Aleksander Grygier
7c9be63a74
refactor: Refine Chat Message Processing State Display
2026-01-28 18:31:37 +01:00
Aleksander Grygier
5a176d1893
feat: Chat logic improvements
2026-01-28 18:31:37 +01:00
Aleksander Grygier
aa7089d598
feat: Integrate Resource Attachments into Chat Form UI
2026-01-28 18:31:37 +01:00