Commit Graph

454 Commits

Author SHA1 Message Date
Imad Saddik 0cd953eea2 chore: update webui build output 2026-03-15 17:13:14 +00:00
Imad Saddik 6074619ba4 style: restore class for checkbox labels 2026-03-15 17:11:58 +00:00
Imad Saddik 297abf8450 chore: update webui build output 2026-03-15 17:09:59 +00:00
Imad Saddik d4034eff07 fix: update chatWidthClasses to use autoChatWidth configuration 2026-03-15 17:08:43 +00:00
Imad Saddik b73209694d chore: update webui build output 2026-03-15 17:03:58 +00:00
Imad Saddik 2630c27754 refactor: simplify chatWidthClasses getter logic and remove widthClasses.class 2026-03-15 17:02:41 +00:00
Imad Saddik 1a6f21f25c chore: revert package-lock.json to match master 2026-03-15 16:56:41 +00:00
Imad Saddik 95be04617e chore: update webui build output 2026-03-15 16:56:08 +00:00
Imad Saddik 2836834801 refactor: remove anything related to the custom chat width setting 2026-03-15 16:54:43 +00:00
Imad Saddik 20a8227933 chore: update webui build output 2026-03-15 16:44:35 +00:00
Imad Saddik 89647d5daf chore: downgrade @lucide/svelte version and remove custom chat width component 2026-03-15 16:43:18 +00:00
Imad Saddik c55533a706 chore: update webui build output 2026-03-14 09:00:23 +00:00
Imad Saddik 29ede762c4 refactor: don't reset custom chat width when the auto width is checked 2026-03-14 08:59:02 +00:00
Imad Saddik e8eccf9b35 feat: update chatWidthClasses to prioritize auto chat width 2026-03-14 08:57:25 +00:00
Imad Saddik 23758f3ba8 feat: add syncable parameters for auto and custom chat width 2026-03-14 08:55:56 +00:00
Imad Saddik b7851305df chore: update webui build output 2026-03-14 08:29:24 +00:00
Imad Saddik e2a6be14e7 fix: pass style to ChatMessageUser 2026-03-14 08:09:33 +00:00
Imad Saddik bcc95c98cb refactor: remove chatWidthClasses from ChatForm 2026-03-14 08:05:20 +00:00
Imad Saddik 16fcb29197 fix: use widthClasses in ChatScreenForm 2026-03-14 08:03:48 +00:00
Imad Saddik 5a721b5678 chore: update webui build output 2026-03-14 06:57:42 +00:00
Imad Saddik ee944af476 style: fix indentation and formatting in ChatForm.svelte 2026-03-14 06:54:26 +00:00
Imad Saddik 8f9571a5c2 refactor: use derived on chatWidthClasses for consistency 2026-03-14 06:53:13 +00:00
Imad Saddik 0306577300 chore: update webui build output 2026-03-14 06:48:49 +00:00
Imad Saddik c3cb3fcfcd refactor: remove unused chat width parameters from syncable parameters 2026-03-14 06:47:28 +00:00
Imad Saddik 19986697e3 chore: update webui build output 2026-03-14 06:46:26 +00:00
Imad Saddik 8cef196854 style: fix formatting 2026-03-14 06:44:43 +00:00
Imad Saddik 4561f25021 refactor: call chatWidthClasses once and reuse it everywhere 2026-03-14 06:43:19 +00:00
Imad Saddik 165234e722 fix: correct typo in disabled message for automatic width 2026-03-14 06:37:06 +00:00
Imad Saddik b9545a1021 chore: update webui build output 2026-03-14 06:33:38 +00:00
Imad Saddik 5dd9b7d888 Merge branch 'master' into feat/change_chat_screen_width 2026-03-14 06:32:15 +00:00
ZeroV0LT f17b3be63f
llama : fix pooling assertion crash in chunked GDN detection path (#20468)
* llama : fix pooling assertion crash in chunked GDN detection path

The chunked fused Gated Delta Net detection in sched_reserve() calls
graph_reserve(16*n_seqs, n_seqs, n_outputs, ...) where n_outputs = n_seqs.
This creates a dimension mismatch in build_pooling() for embedding models
with mean/rank pooling: build_inp_mean() creates a tensor with shape
[n_tokens=16*n_seqs, ...] while t_embd is reduced to [n_outputs=n_seqs, ...]
via out_ids, causing ggml_mul_mat to assert on ggml_can_mul_mat(a, b).

Fix: pass n_tokens as n_outputs in the chunked GDN graph reservation,
matching the pattern used by the pp/tg worst-case reservations.

Regression introduced by #20340 (d28961d).
Same class of bug as #12517, fixed by #12545.

* server : add mean pooling tests to embedding test suite

Add test_embedding_pooling_mean and test_embedding_pooling_mean_multiple
to cover the --pooling mean codepath, which was previously untested.

These tests would have caught the regression introduced by #20340 where
build_pooling() crashes with a ggml_mul_mat assertion due to mismatched
dimensions in the chunked GDN detection path.

---------

Co-authored-by: Domenico Crupi <domenico@zerovolt.it>
2026-03-13 20:53:42 +02:00
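The dimension mismatch described in the commit above can be illustrated with a small sketch. It is a simplified model, not the llama.cpp code: `reserve_shapes` and `can_pool` are hypothetical helpers, `n_embd=8` is an arbitrary value, and the shape check is reduced to comparing token counts only.

```python
def reserve_shapes(n_tokens, n_seqs, n_outputs, n_embd=8):
    """Return the two (hypothetical) shapes that meet in mean pooling."""
    inp_mean = (n_tokens, n_seqs)   # mean-pooling input, built from the full token count
    t_embd = (n_embd, n_outputs)    # embeddings left after selecting rows via out_ids
    return inp_mean, t_embd


def can_pool(inp_mean, t_embd):
    # Simplified stand-in for the matmul shape assertion:
    # both operands must agree on the number of tokens.
    return inp_mean[0] == t_embd[1]


n_seqs = 4
n_tokens = 16 * n_seqs

# Before the fix: n_outputs = n_seqs, so the token counts disagree.
before = can_pool(*reserve_shapes(n_tokens, n_seqs, n_outputs=n_seqs))

# After the fix: n_tokens is passed as n_outputs, so the shapes line up.
after = can_pool(*reserve_shapes(n_tokens, n_seqs, n_outputs=n_tokens))

print(before, after)
```

Under these assumptions the reservation only passes the check once `n_outputs` equals `n_tokens`, which matches the fix the commit message describes.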
SoftwareRenderer d7ba99c485
server: reset counter related to kill-switch on client error (#20513)
* server: reset kill-switch on client error

This avoids triggering a server kill switch.

If the client sends a request that exceeds the configured context size, an appropriate HTTP 400 response is provided and no tokens are generated.

However, since no tokens are generated, update_slots() increments n_empty_consecutive. If the client sends 3 such messages in a row, the server terminates.

* moved counter reset as per recommendation

* cont : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-03-13 19:58:09 +02:00
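The counter behaviour described in the commit above can be sketched as follows. This is an illustrative model only: the `KillSwitch` class, its method names, and the `simulate` helper are hypothetical, and the threshold of 3 is taken from the commit message rather than from the server source.

```python
MAX_EMPTY = 3  # assumed threshold, per "3 such messages in a row"


class KillSwitch:
    def __init__(self, limit=MAX_EMPTY):
        self.limit = limit
        self.n_empty_consecutive = 0

    def on_iteration(self, n_tokens_generated):
        """Update the counter; return True when the server should terminate."""
        if n_tokens_generated > 0:
            self.n_empty_consecutive = 0
        else:
            self.n_empty_consecutive += 1
        return self.n_empty_consecutive >= self.limit

    def on_client_error(self):
        # The fix: a rejected request (HTTP 400) is not a sign of a stuck server.
        self.n_empty_consecutive = 0


def simulate(rejected_requests, reset_on_error):
    """Feed a run of rejected requests; return True if the kill switch fires."""
    ks = KillSwitch()
    for _ in range(rejected_requests):
        if reset_on_error:
            ks.on_client_error()
        if ks.on_iteration(0):  # no tokens were generated for this request
            return True
    return False


print(simulate(3, reset_on_error=False), simulate(3, reset_on_error=True))
```

Without the reset, three oversized requests in a row are enough to trip the switch in this model; with the reset, the counter never accumulates across client errors.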
Piotr Wilkin (ilintar) 0e810413bb
tests : use `reasoning` instead of `reasoning_budget` in server tests (#20432) 2026-03-12 13:41:01 +01:00
Pascal de190154c8
New conversations now auto-select the first loaded model (#20403)
* webui: auto-select first loaded model for new conversations in router mode

* chore: update webui build output
2026-03-12 09:07:05 +01:00
Piotr Wilkin (ilintar) acb7c79069
common/parser: handle reasoning budget (#20297)
* v1

* Finished!

* Handle cli

* Reasoning sampler

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Less explosive terminology :)

* Add utf-8 case and tests

* common : migrate reasoning budget sampler to common

* cont : clean up

* cont : expose state and allow passing as initial state

* cont : remove unused imports

* cont : update state machine doc string

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Alde Rojas <hello@alde.dev>
2026-03-11 10:26:12 +01:00
Pascal 00de615345
Fix agentic mcp image single model (#20339)
* webui: fix MCP image attachments dropped during the agentic loop in single-model mode

* chore: update webui build output
2026-03-11 05:31:33 +01:00
Georgi Gerganov a7b3dee7a5
server : make 2 checkpoints near the end of the prompt (#20288)
* server : make 2 checkpoints near the end of the prompt

* cont : adjust checkpoints
2026-03-10 14:28:23 +02:00
Evan Huus 23fbfcb1ad
server: Parse port numbers from MCP server URLs in CORS proxy (#20208)
* Parse port numbers from MCP server URLs

* Pass scheme to http proxy for determining whether to use SSL

* Fix download on non-standard port and re-add port to logging

* add test

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-03-09 17:47:54 +01:00
Georgi Gerganov 96cfc4992c
server : fix checkpoints n_tokens calculation (#20287) 2026-03-09 16:47:06 +02:00
Georgi Gerganov 344ee2a38a
server : warn swa-full is not supported for non-SWA models (#20291) 2026-03-09 16:44:25 +02:00
Georgi Gerganov d6e1556499
server : fix off-by-1 in server_tokens::size_up_to_pos() (#20279)
* server : fix off-by-1 in server_tokens::size_up_to_pos()

* cont : fix typo [no ci]
2026-03-09 16:43:38 +02:00
Georgi Gerganov 107d599952
server : add kill switch when server is stuck (#20277) 2026-03-09 10:33:12 +02:00
Georgi Gerganov d417bc43dd
server : do not create checkpoints right after mtmd chunks (#20232) 2026-03-08 22:16:46 +02:00
decahedron1 ff52ee964d
server : correct index on finish in OAI completion streams (#20226) 2026-03-08 10:08:57 +01:00
Piotr Wilkin (ilintar) 566059a26b
Autoparser - complete refactoring of parser architecture (#18675)
* Autoparser - full single commit squish

* Final pre-merge changes: minor fixes, Kimi 2.5 model parser
2026-03-06 21:01:00 +01:00
Tom Vaucourt e68f2fb894
server : preserve anthropic thinking blocks in conversion (#20120)
* server : preserve anthropic thinking blocks in conversion (#20090)

* server : add tests for anthropic thinking block conversion

---------

Co-authored-by: root <root@llamacpp.home>
2026-03-06 17:41:12 +01:00
Piotr Wilkin (ilintar) f5ddcd1696
Checkpoint every n tokens: squash (#20087) 2026-03-06 11:39:26 +01:00
Aleksander Grygier f6235a41ef
webui: Agentic Loop + MCP Client with support for Tools, Resources and Prompts (#18655) 2026-03-06 10:00:39 +01:00
Aleksander Grygier 5e335ba113
webui: Improvements for Models Selector UI (#20066) 2026-03-05 08:52:22 +01:00
Marcel Petrick 92f7da00b4
chore : correct typos [no ci] (#20041)
* fix(docs): correct typos found during code review

Non-functional changes only:
- Fixed minor spelling mistakes in comments
- Corrected typos in user-facing strings
- No variables, logic, or functional code was modified.

Signed-off-by: Marcel Petrick <mail@marcelpetrick.it>

* Update docs/backend/CANN.md

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* Revert "Auxiliary commit to revert individual files from 846d1c301281178efbc6ce6060ad34c1ebe45af8"

This reverts commit 02fcf0c7db661d5ff3eff96b2b2db9fdb7213256.

* Update tests/test-backend-ops.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update tests/test-backend-ops.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Signed-off-by: Marcel Petrick <mail@marcelpetrick.it>
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-05 08:50:21 +01:00