Commit Graph

563 Commits

Author SHA1 Message Date
Aleksander Grygier 72e5d9ae2a chore: update webui build output 2026-02-13 13:35:58 +01:00
Aleksander Grygier dd1fe96e18 feat: Improve formatting performance time 2026-02-13 13:35:58 +01:00
Aleksander Grygier eed0c5ae48 fix: System prompt sorting 2026-02-13 13:35:58 +01:00
Aleksander Grygier 16aa6fae0a fix: Save draft message in Chat Form when adding System Prompt from new chat view 2026-02-13 13:33:06 +01:00
Aleksander Grygier 0fe25847ff fix: Chat Form submission 2026-02-13 13:33:06 +01:00
Aleksander Grygier ed70cb577d chore: update webui build output 2026-02-13 13:33:05 +01:00
Aleksander Grygier 141540ccbb feat: MCP Prompts WIP 2026-02-13 13:33:05 +01:00
Aleksander Grygier 46ced87178 chore: update webui build output 2026-02-13 13:32:47 +01:00
Aleksander Grygier 43da6b8676 feat: UI improvements 2026-02-13 13:32:47 +01:00
Aleksander Grygier 17b326b32a chore: update webui build output 2026-02-13 13:30:16 +01:00
Aleksander Grygier aaeea933b7 feat: Architectural improvements 2026-02-13 13:30:16 +01:00
Aleksander Grygier da252e3425 feat: Per-conversation agentic loop state 2026-02-13 13:28:24 +01:00
Aleksander Grygier 1565cda1ff chore: update webui build output 2026-02-13 13:28:24 +01:00
Aleksander Grygier a8c2e66e92 feat: Improve MCP Server selection UI + lazy load health checks 2026-02-13 13:28:24 +01:00
Aleksander Grygier f8d6d16df1 feat: UI improvements 2026-02-13 13:21:35 +01:00
Aleksander Grygier 690dd09b5f feat: Simplify MCP server enabling logic per chat
Refactors MCP server enabling logic to remove the dependency on global settings.

This simplifies the logic by directly checking the per-chat override status, and removes the need to pass the global enabled state as a parameter.

Additionally:
- Only shows MCP servers that are enabled in settings in the selector.
- Sorts the servers by whether they are enabled for the current chat.
2026-02-13 13:21:35 +01:00
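The selector behaviour described in the commit above can be sketched roughly as follows. This is a minimal TypeScript illustration, not the webui's actual code: the `McpServer` shape, the `PerChatOverrides` map, and both function names are assumptions made for the example.

```typescript
// Hypothetical shapes; the real webui types are not shown in this log.
interface McpServer {
  id: string;
  name: string;
  enabledInSettings: boolean;
}

// Assumed per-chat override map: server id -> enabled for this chat.
type PerChatOverrides = Record<string, boolean>;

// Enabled state is read from the per-chat override only;
// no global "enabled" flag is passed in.
function isEnabledForChat(server: McpServer, overrides: PerChatOverrides): boolean {
  return overrides[server.id] === true;
}

// Keep only servers enabled in settings, then list the ones
// enabled for the current chat first.
function serversForSelector(servers: McpServer[], overrides: PerChatOverrides): McpServer[] {
  return servers
    .filter((s) => s.enabledInSettings)
    .sort(
      (a, b) =>
        Number(isEnabledForChat(b, overrides)) - Number(isEnabledForChat(a, overrides))
    );
}
```

Because `filter` returns a new array, the sort does not mutate the caller's list.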
Aleksander Grygier a12304cdea chore: update webui build output 2026-02-13 13:21:35 +01:00
Aleksander Grygier 52f21b4ca4 fix: Missing onModelChange callback running assistant message re-generation 2026-02-13 13:21:35 +01:00
Pascal 20e5e70c61 chore: update webui build output 2026-02-13 13:21:35 +01:00
Pascal a2cce59d69 fix: accurate tool_response display 2026-02-13 13:21:35 +01:00
Pascal fdd67f45e6 fix: unify MCP server label logic with simplified fallback 2026-02-13 13:21:35 +01:00
Pascal bdd9bcfb75 chore: update webui build output 2026-02-13 13:21:35 +01:00
Pascal a515179730 refactor: remove multimodal validation from model selector
Remove all frontend validation logic that prevented users from selecting
models based on multimodal capabilities. This refactoring removes
restrictive UI code while maintaining full functionality

- Vision models can describe images as text
- That text remains useful for non-vision models
- Chaining vision -> non-vision is a valid workflow
- Users know their use case better than the UI
- Users can return to vision models when needed
2026-02-13 13:21:35 +01:00
Pascal c7e76c65d1 chore: update webui build output 2026-02-13 13:21:35 +01:00
Pascal 37c084873c fix: ignore assistant attachments (MCP) for modality detection 2026-02-13 13:21:35 +01:00
Pascal d09cdfaf0a chore: update webui build output 2026-02-13 13:21:35 +01:00
Pascal 6d41f74031 refactor: eliminate MCP circular dependency
- Change architecture from mcpStore <-> mcpClient to mcpClient -> mcpStore
- Remove bidirectional callback pattern (set*Callback, notify* methods)
- Add updateState/updateHealthCheck public methods in mcpStore
- Replace callback calls with direct mcpStore method calls
- Remove unused imports (browser, HealthCheckState) and constructor
- Fixes CI: ReferenceError Cannot access mcpClient before initialization
2026-02-13 13:21:35 +01:00
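The one-way dependency described in the commit above might look something like the sketch below. Only the method names `updateState` and `updateHealthCheck` come from the commit message; their parameters, the `HealthStatus` values, and the class bodies are assumptions for illustration.

```typescript
// Hypothetical health status values; the real type is not shown in this log.
type HealthStatus = "idle" | "ok" | "error";

// The store exposes plain public methods and knows nothing about the client.
class McpStore {
  connected = false;
  health: Record<string, HealthStatus> = {};

  updateState(connected: boolean): void {
    this.connected = connected;
  }

  updateHealthCheck(serverId: string, status: HealthStatus): void {
    this.health[serverId] = status;
  }
}

// The client depends on the store (mcpClient -> mcpStore) and calls it
// directly instead of going through registered callbacks.
class McpClient {
  private store: McpStore;

  constructor(store: McpStore) {
    this.store = store;
  }

  connect(serverId: string): void {
    // A real client would open a transport here; the sketch only reports state.
    this.store.updateState(true);
    this.store.updateHealthCheck(serverId, "ok");
  }
}
```

With the dependency pointing in one direction only, the store can be initialized before the client without either module referencing the other at load time.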
Pascal 07ae189175 chore: update webui build output 2026-02-13 13:21:34 +01:00
Pascal 23741b3c6a fix: strip reasoning content and UI proprietary tags from prompts
TODO: add toggle and ensure backend API compliance for reasoning format
2026-02-13 13:21:34 +01:00
Pascal b5b527fa52 chore: update webui build output 2026-02-13 13:21:34 +01:00
Pascal fb1ec29898 refactor: remove reasoning after first turn filter 2026-02-13 13:21:34 +01:00
Pascal fc5d9f587f refactor: inline reasoning with tags, remove fixed thinking field 2026-02-13 13:21:34 +01:00
Pascal 6b3bc23fc2 chore: update webui build output 2026-02-13 13:21:34 +01:00
Pascal c73baed7e3 feat: resolve MCP attachment images via rehype plugin
LLM can reference tool-generated images using markdown links like,
plugin resolves attachment names to base64 from message.extra when present,
regular HTTP/data URLs pass through unchanged (no regression)

- rehypeResolveAttachmentImages plugin in markdown pipeline
- Pass message prop to MarkdownContent and AgenticContent
- Force processor reactivity on message.extra changes
- Filter assistant images from API context (display-only)
2026-02-13 13:21:34 +01:00
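The attachment resolution described in the commit above can be approximated without any rehype dependencies as a small tree walk. This is a sketch under stated assumptions: the `HastNode` and `AttachmentExtra` shapes and the function name are hypothetical, and the real `rehypeResolveAttachmentImages` plugin may differ.

```typescript
// Minimal HAST-like node shape, assumed for this example.
interface HastNode {
  type: string;
  tagName?: string;
  properties?: Record<string, unknown>;
  children?: HastNode[];
}

// Hypothetical entry of message.extra: attachment name plus its data URL.
interface AttachmentExtra {
  name: string;
  base64Url: string;
}

function resolveAttachmentImages(node: HastNode, extras: AttachmentExtra[]): void {
  if (node.type === "element" && node.tagName === "img" && node.properties) {
    const src = node.properties.src;
    // Regular HTTP and data URLs pass through unchanged.
    if (typeof src === "string" && !/^(https?:|data:)/i.test(src)) {
      for (const extra of extras) {
        if (extra.name === src) {
          node.properties.src = extra.base64Url;
          break;
        }
      }
    }
  }
  // Recurse into child nodes, if any.
  for (const child of node.children ?? []) {
    resolveAttachmentImages(child, extras);
  }
}
```

An image whose name has no matching attachment is left untouched, so a missing entry in the extras list does not rewrite the source.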
Pascal 09381a59fd feat: persist base64 attachments from tool results 2026-02-13 13:21:34 +01:00
Pascal f16457551e webui: fix custom headers persistence in UI (derived) 2026-02-13 13:21:34 +01:00
Pascal f42e5f114e webui: fix custom headers persistence in UI 2026-02-13 13:21:34 +01:00
Aleksander Grygier 162bd976ed fix: Word wrapping 2026-02-13 13:21:34 +01:00
Aleksander Grygier c2dd1d2fed chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 008463149b feat: UI improvements 2026-02-13 13:21:34 +01:00
Aleksander Grygier 1dba2ec4a9 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 805c171825 feat: UI improvement 2026-02-13 13:21:34 +01:00
Aleksander Grygier d6455a7530 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier bb4bd7fe09 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 05dfb5e70c chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier cad9ca1208 feat: MCP Server Details 2026-02-13 13:21:34 +01:00
Aleksander Grygier 0e980bf881 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 825d2ea9a9 feat: MCP connection details WIP 2026-02-13 13:21:34 +01:00
Aleksander Grygier 2b37f70c37 refactor: MCP types and health check 2026-02-13 13:21:34 +01:00
Aleksander Grygier 36a37d1794 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 38ba6d8372 refactor: KeyValuePairs component 2026-02-13 13:21:34 +01:00
Aleksander Grygier c5465d4893 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 57089370e4 refactor: DRY 2026-02-13 13:21:34 +01:00
Aleksander Grygier f80d5f615e chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier e1da51335c refactor: Architecture improvements 2026-02-13 13:21:34 +01:00
Aleksander Grygier 3bc8d93546 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 48b2b1b2f0 refactor: MCP state management + stores/clients relationship 2026-02-13 13:21:34 +01:00
Aleksander Grygier 2cd682178b chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier da8baaa9b8 fix: Distinguish streaming vs incomplete tool calls in UI 2026-02-13 13:21:34 +01:00
Aleksander Grygier 3179858e5f chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 9471729162 fix: Restore live reactive UI progress for tool calls 2026-02-13 13:21:34 +01:00
Aleksander Grygier 64923b20be chore: update webui build output 2026-02-13 13:21:34 +01:00
Pascal 179477b4ed fix: reset tool call state between turns 2026-02-13 13:21:34 +01:00
Pascal 38244a1bfa webui: enable streaming of tool call arguments 2026-02-13 13:21:34 +01:00
Aleksander Grygier 2faf237d01 chore: update webui build output 2026-02-13 13:21:34 +01:00
Aleksander Grygier 5ffb6aba3a refactor: Cleanup 2026-02-13 13:21:34 +01:00
Pascal 96e51e2a41 webui: prevent mobile dropdown immediate close on synthetic click 2026-02-13 13:20:42 +01:00
Pascal 8916698294 webui: fix redirect to root ignoring base path 2026-02-13 13:20:42 +01:00
Aleksander Grygier 2a33fc2059 refactor: Cleanup 2026-02-13 13:20:41 +01:00
Aleksander Grygier 04913f20d9 chore: update webui build output 2026-02-13 13:20:41 +01:00
Aleksander Grygier 939e7aa16b refactor: Types 2026-02-13 13:20:41 +01:00
Aleksander Grygier bef865d871 refactor: Componentize McpServerCard 2026-02-13 13:20:41 +01:00
Aleksander Grygier 7dbb05a160 refactor: Cleanup 2026-02-13 13:20:41 +01:00
Aleksander Grygier 7e194f653a fix: Remove redundant CSS class 2026-02-13 13:20:41 +01:00
Aleksander Grygier 02c87fa3c9 feat: Add TruncatedText component 2026-02-13 13:20:41 +01:00
Aleksander Grygier 27b80ae3e8 fix: Collapsible box trigger 2026-02-13 13:20:26 +01:00
Aleksander Grygier 408e098324 refactor: Cleanup 2026-02-13 13:20:26 +01:00
Aleksander Grygier 0b36d04c38 refactor: Cleanup 2026-02-13 13:20:07 +01:00
Aleksander Grygier df464c1f5a refactor: Collapsible Content Block & small fixes 2026-02-13 13:18:20 +01:00
Aleksander Grygier 26044454ef chore: update webui build output 2026-02-13 13:18:20 +01:00
Aleksander Grygier f0ac6fa039 refactor: Cleanup 2026-02-13 13:18:20 +01:00
Aleksander Grygier 7c9ba36216 chore: update webui build output 2026-02-13 13:18:20 +01:00
Aleksander Grygier 7ab269cd77 feat: UI improvements 2026-02-13 13:18:20 +01:00
Aleksander Grygier e0122465ed feat: Always show Mcp Selector 2026-02-13 13:18:20 +01:00
Pascal 36c9ad9303 fix: remove double scrollbar in model selector by using Bits UI content available height 2026-02-13 13:18:20 +01:00
Aleksander Grygier bc60beb1a7 feat: Enable adding System Prompt per-chat 2026-02-13 13:18:20 +01:00
Aleksander Grygier 276a3e9416 fix: UI 2026-02-13 13:17:51 +01:00
Aleksander Grygier c74065de75 chore: update webui build output 2026-02-13 13:17:51 +01:00
Aleksander Grygier e6ad864984 feat: UI improvements 2026-02-13 13:17:51 +01:00
Pascal cff237cb3e webui: raw tool result display, strip only leading/trailing newlines to preserve indentation 2026-02-13 13:17:33 +01:00
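One way to strip only the outer newlines, as the commit above describes, is a regular expression anchored at both ends. The function name is hypothetical and the webui's actual implementation is not shown here.

```typescript
// Remove leading and trailing line breaks only. Unlike String.prototype.trim(),
// this keeps the spaces that indent the first line.
function stripOuterNewlines(text: string): string {
  return text.replace(/^[\r\n]+|[\r\n]+$/g, "");
}
```

For example, `"\n  indented\n"` becomes `"  indented"`, whereas `trim()` would also drop the two leading spaces.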
Pascal afb79b2970 webui: split raw output into backend parsing and frontend display options 2026-02-13 13:17:33 +01:00
Pascal 18efdabb12 webui: remove legacy wrapper and restore WebSocket transport 2026-02-13 13:17:33 +01:00
Pascal a13782a4d1 webui: remove unused imports 2026-02-13 13:17:33 +01:00
Aleksander Grygier d548bf27dd chore: update webui build output 2026-02-13 13:17:33 +01:00
Aleksander Grygier bdd5958f6d feat: Improve agentic tool call streaming display with 'in progress' state 2026-02-13 13:17:32 +01:00
Aleksander Grygier a9c2ea7a8e feat: Enhance MCP server dropdown with search, popularity sorting, and per-chat overrides 2026-02-13 13:17:32 +01:00
Aleksander Grygier dfce09b34b feat: Add per-chat MCP server overrides 2026-02-13 13:17:32 +01:00
Aleksander Grygier 54374edecd chore: update webui build output 2026-02-13 13:17:32 +01:00
Aleksander Grygier b763a4cc69 feat: Add image load error fallback in MarkdownContent 2026-02-13 13:17:32 +01:00
Aleksander Grygier af9a76b6dc feat: Implement lazy MCP client shutdown 2026-02-13 13:17:32 +01:00
Aleksander Grygier c7870a3903 feat: Enhance tool call streaming UI and output format 2026-02-13 13:17:32 +01:00
Aleksander Grygier fb5e464fe7 feat: Display and manage servers in ChatForm actions 2026-02-13 13:17:32 +01:00
Aleksander Grygier dc7a3f33ba feat: Integrate server management dialog into chat settings 2026-02-13 13:03:15 +01:00
Aleksander Grygier 0b13c95519 feat: Implement dedicated server management UI components 2026-02-13 13:03:15 +01:00
Aleksander Grygier 8df7e4a54f refactor: Centralize health check logic in store 2026-02-13 13:03:15 +01:00
Aleksander Grygier 9a8cae462e feat: Enhance server config with headers and schema normalization 2026-02-13 13:03:15 +01:00
Aleksander Grygier bc2d879dea feat: Add McpLogo Svelte component 2026-02-13 13:03:15 +01:00
Aleksander Grygier 42d52605d9 refactor: Consolidate UI CSS classes into shared module 2026-02-13 13:03:15 +01:00
Aleksander Grygier 6c95020b06 chore: update webui build output 2026-02-13 12:57:23 +01:00
Aleksander Grygier 62dbc9f654 feat: Raw LLM output switch per message 2026-02-13 12:57:23 +01:00
Aleksander Grygier 284425097b refactor: Tool call handling 2026-02-13 12:57:03 +01:00
Aleksander Grygier 5beeb88a37 docs: Update high-level architecture diagrams for MCP integration 2026-02-13 12:55:42 +01:00
Aleksander Grygier acdd30e3af feat: Add AgenticContent component for enhanced tool call rendering 2026-02-13 12:55:42 +01:00
Aleksander Grygier 49a8c8b148 refactor: Update ChatStore to leverage mcpStore for agentic flow 2026-02-13 12:55:42 +01:00
Aleksander Grygier 5b582beb75 feat: Implement agentic orchestration within ChatService 2026-02-13 12:55:03 +01:00
Aleksander Grygier 391479edb2 feat: Introduce reactive mcpStore for client lifecycle management 2026-02-13 12:55:03 +01:00
Aleksander Grygier 7e184c174d feat: Refactor MCP client to use official SDK 2026-02-13 12:55:03 +01:00
Aleksander Grygier 1a041a5b9b feat: Add @modelcontextprotocol/sdk and zod dependencies 2026-02-13 12:55:03 +01:00
Aleksander Grygier 2325d2a50d refactor: Update Agentic and MCP config parsing to use new utils and constants 2026-02-13 12:55:03 +01:00
Aleksander Grygier 0c24db3178 feat: Centralize MCP and Agentic type definitions and constants 2026-02-13 12:55:02 +01:00
Aleksander Grygier 26a19183b7 feat: Introduce common utility functions 2026-02-13 12:55:02 +01:00
Pascal 14f6728ef1 webui: use normalizedMessages after upstream refactor 2026-02-13 12:55:02 +01:00
Pascal cb99ed9f71 webui: MCP client with low coupling to current codebase 2026-02-13 12:55:02 +01:00
Aleksander Grygier 5174d7206f
webui: UI and routing fixes (#19586)
* chore: update webui build output

* chore: update webui build output

* fix: Scroll issues in DropdownMenuSearchable

* webui: fix redirect to root ignoring base path

* fix: Word wrapping

* fix: remove obsolete modality UI tests causing CI failures

- Remove VisionModality/AudioModality test stories
- Remove mockServerProps usage and imports
- Simplify Default test (remove dropdown interaction checks)
- Simplify FileAttachments test (remove mocks)

* feat: Improve formatting performance time

---------

Co-authored-by: Pascal <admin@serveurperso.com>
2026-02-13 12:31:00 +01:00
Aleksander Grygier 4c61875bf8
webui: Add switcher to Chat Message UI to show raw LLM output (#19571) 2026-02-12 19:55:51 +01:00
Aleksander Grygier 4d688f9ebb
(webui) FEATURE: Enable adding or injecting System Message into chat (#19556)
* feat: Enable adding System Prompt per-chat

* fix: Save draft message in Chat Form when adding System Prompt from new chat view

* fix: Proper system message deletion logic

* chore: Formatting

* chore: update webui build output
2026-02-12 13:56:08 +01:00
Aleksander Grygier f486ce9f30
(webui) REFACTOR: UI primitives and polish (#19551)
* webui: UI primitives and polish (non-MCP)

* chore: update webui build output
2026-02-12 12:21:00 +01:00
Aleksander Grygier 38adc7d469
WebUI Architecture Cleanup (#19541)
* webui: architecture foundation (non-MCP core refactors)

* chore: update webui build output
2026-02-12 11:22:27 +01:00
RichardScottOZ fa16e517a3
server : fix typo in README.md for features list (#19510)
extra l for full
2026-02-12 08:56:25 +01:00
손희준 820ebfa6f4
Server: log when converting requests to chat completions format (#19457)
* Log converting requests

* Print as debug instead of info [no ci]

---------

Co-authored-by: openingnow <>
2026-02-09 16:22:57 +01:00
Sascha Rogmann 292f6908cd
spec : remove check rate (#19377)
* spec: remove parameter spec-ngram-check-rate

* spec : renamed statistics vars

* spec : add n_call_begin, n_call_accept

* spec : don't enable key-map-stats
2026-02-09 15:30:50 +02:00
Georgi Gerganov eb449cdfa4
server : improve context checkpoint logic (#19408) 2026-02-08 09:40:04 +02:00
Georgi Gerganov dfde5993ea
common : add common_speculative_is_compat() (#19270)
* llama : add llama_memory_can_rm_suffix()

* Revert "llama : add llama_memory_can_rm_suffix()"

This reverts commit d30e59b62a.

* spec : check if the target context is compatible for spec decoding
2026-02-06 16:47:22 +02:00
Matthieu Coudron a3fa035822
server: print actual model name in "model not found" error (#19117)
Experimenting with AI, my environment gets messy fast and it's not
always easy to know what model my software is trying to load. This helps
with troubleshooting.

Before:

Error: {
  code = 400,
  message = "model not found",
  type = "invalid_request_error"
}

After:

Error: {
  code = 400,
  message = "model 'toto' not found",
  type = "invalid_request_error"
}
2026-02-02 16:55:27 +01:00
Christian Kastner 7a4ca3cbd9
docs : Minor cleanups (#19252)
* Update old URLs to github.com/ggml-org/

* Bump copyrights
2026-02-02 08:38:55 +02:00
Georgi Gerganov bbada8bfb9
server : wrap around the "id_slot" parameter (#19207)
* server : wrap around the "id_slot" parameter

* cont : minor
2026-01-30 19:46:10 +02:00
Georgi Gerganov dabaa2e77a
spec : add ngram-mod (#19164)
* spec : add ngram-mod

* cont : simplify + keep track of occupancy

* cont : cleanup

* cont : move initialization to common/speculative

* cont : cleanup

* cont : cleanup

* cont : fix
2026-01-30 18:21:48 +02:00
Andrew Marshall 84b0a98319
webui: Update Svelte to fix effect_update_depth_exceeded errors (#19144)
The upstream fix is first available in 5.38.2, so constrain to at least
that version.

Rebuild pre-compiled webui index.html.gz based on these changes.

See also:
https://github.com/ggml-org/llama.cpp/issues/16347
https://github.com/huntabyte/bits-ui/issues/1687
https://github.com/sveltejs/svelte/issues/16548
2026-01-29 15:56:39 +01:00
Sascha Rogmann 72d3b1898a
spec : add self‑speculative decoding (no draft model required) + refactor (#18471)
* server: introduce self-speculative decoding

* server: moved self-call into speculative.cpp

* can_speculate() includes self-speculation

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server: can_speculate() tests self-spec

* server: replace can_speculate() with slot.can_speculate()

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* common: use %zu format specifier for size_t in logging

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* server: can_speculate() requires a task instance

* common: ngram map, config self-speculative decoding

* common: add enum common_speculative_type

* common: add vector of speculative states

* common: add option --spec-draftless

* server: cleanup (remove slot.batch_spec, rename)

* common: moved self-spec impl to ngram-map

* common: cleanup (use common_speculative_state_draft)

* spec : refactor

* cont : naming

* spec: remove --spec-config

* doc: (draftless) speculative decoding

* common: print performance in spec decoding

* minor : cleanup

* common : better names

* minor : cleanup + fix build

* minor: comments

* CODEOWNERS: add common/ngram-map.* (#18471)

* common : rename speculative.draftless_type -> speculative.type

* ngram-map : fix uninitialized values

* ngram-map : take into account the input can become shorter

* ngram-map : revert len check for now

* arg : change `--spec-draftless` -> `--spec-type`

* spec : add common_speculative_state::accept()

* spec : refactor + add common_speculative_begin()

* spec : fix begin() call with mtmd

* spec : additional refactor + remove common_speculative_params

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-28 19:42:42 +02:00
Georgi Gerganov b931f81b5a
server : adjust spec tests to generate up to 16 tokens (#19093) 2026-01-28 09:11:40 +02:00
Daniel Bevenius 16639ba217
common : use two decimal places for float arg help messages (#19048)
* common : use two decimal places for float arg help messages

This commit updates the help messages for various command-line arguments
in arg.cpp to display floating-point default values with two decimal
places instead of one.

The motivation for this change is that currently only having one decimal
place means that values generated using --help or llama-gen-docs will not
display the correct values.

For example, currently the value of top-p in tools/server/README.md is
`0.9`, but the default value is actually '0.95'. And running
llama-gen-docs does not update this value as it uses the output from the
help message, which shows only one decimal place, so the values look
like they are unchanged.

* docs : run llama-gen-docs to update docs
2026-01-25 07:31:42 +01:00
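The precision problem behind the commit above is easy to reproduce in any language. The actual change is in the C++ help strings in arg.cpp; the TypeScript below is only an illustration, and `formatFloatDefault` is a hypothetical helper.

```typescript
// Format a default value with a fixed number of decimal places,
// the way a help message would print it.
function formatFloatDefault(value: number, decimals: number): string {
  return value.toFixed(decimals);
}

// One decimal place cannot represent a default such as 0.95,
// so the printed help text no longer matches the real value.
const oneDecimal = formatFloatDefault(0.95, 1);

// Two decimal places print the value as written.
const twoDecimals = formatFloatDefault(0.95, 2);
```

Any documentation generated from the one-decimal string inherits the mismatch, which is why the generated docs looked unchanged.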
Xuan-Son Nguyen 51fa458a92
server : support preserving reasoning_content in assistant message (#18994)
* support reasoning_content input

* report template caps to webui

* add docs

* rm commented code
2026-01-22 21:30:06 +01:00
Xuan-Son Nguyen 4e595b250a
server: do not log certain endpoints (avoid log spam) (#19028) 2026-01-22 19:24:37 +01:00
손희준 c6926d1d95
server: Reorder methods in `server-task.cpp` (#19016)
* Move `task_result_state::update_chat_msg` to match with header

* Move `server_task_result_cmpl_partial::to_json_anthropic()` to match with header

---------

Co-authored-by: openingnow <>
2026-01-22 14:36:04 +01:00
Hendrik Erz 3802d3c78f
fix: Use `tabular-nums` for chat message statistics (#18915)
* fix: Use `tabular-nums` for chat message statistics

* fix: Rebuild WebUI
2026-01-21 18:46:01 +01:00
손희준 fbbf3ad190
server: /v1/responses (partial) (#18486)
* from previous PR

* Make instruction(system) as first message

* Convert [input_message] (text/image/file)

* Rename convert_responses_to_chatcmpl(body) -> response_body

* Initial tool call support

* Erase instructions field from chatcmpl body

* Feed reasoning texts to chat template

* Use std::vector instead of opaque json array

* Make output_item.added events consistent

* Move `server_task_result_cmpl_partial::update` from header to source

* Match ID of output_item.added and .done events

* Add function_call only if there is no "fc_" prefix

* Add function call output at non-streaming API

* Test if ID is persistent

* Add doc

* Fix style - use trailing comma

* Rewrite state management

* catch up with upstream/master

* Fix style - "type" is the first item of SSE data

* Explicitly check "instructions" from response_body

* Make lambdas static

* Check if reasoning content exists

* Add `oai_resp_id` to task_result_state(also initialized at ctor), server_task_result_cmpl_partial, and server_task_result_cmpl_final

* Reject `input_file` since it is not supported by chatcmpl

* Add "fc_" prefix to non-streaming function call id as coderabbit pointed out

---------

Co-authored-by: openingnow <>
2026-01-21 17:47:23 +01:00
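Two of the steps listed in the commit above, making the instructions the first message and dropping the `instructions` field from the chat completions body, can be sketched as follows. The request shapes are simplified assumptions and the function name is hypothetical; the server's real conversion is written in C++ and also handles tool calls, images, and reasoning content.

```typescript
// Simplified, assumed request shapes for illustration only.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ResponsesRequest {
  model?: string;
  instructions?: string;
  input: string | ChatMessage[];
}

interface ChatCompletionsRequest {
  model?: string;
  messages: ChatMessage[];
}

function convertResponsesToChatCompletions(body: ResponsesRequest): ChatCompletionsRequest {
  const messages: ChatMessage[] = [];
  // Instructions become the first (system) message.
  if (typeof body.instructions === "string") {
    messages.push({ role: "system", content: body.instructions });
  }
  // A plain string input is treated as a single user message.
  if (typeof body.input === "string") {
    messages.push({ role: "user", content: body.input });
  } else {
    for (const message of body.input) {
      messages.push(message);
    }
  }
  // The "instructions" field is deliberately not copied to the output.
  const out: ChatCompletionsRequest = { messages };
  if (body.model !== undefined) {
    out.model = body.model;
  }
  return out;
}
```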
Adrien Gallouët 1c7cf94b22
common, server : use the same User-Agent by default (#18957)
This commit also ensures that if a custom User-Agent is used, it will be
the only one sent.

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-01-20 18:28:43 +01:00
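The header rule stated in the commit above, a custom User-Agent replaces the default rather than being sent alongside it, can be sketched like this. The default string and the function name are hypothetical; the actual change lives in the C++ common and server code.

```typescript
// Hypothetical default value, used only for this example.
const DEFAULT_USER_AGENT = "example-agent/1.0";

function buildHeaders(custom: Record<string, string>): Record<string, string> {
  const headers: Record<string, string> = {};
  let hasCustomUserAgent = false;
  for (const name of Object.keys(custom)) {
    // Header names are case-insensitive, so compare in lower case.
    if (name.toLowerCase() === "user-agent") {
      hasCustomUserAgent = true;
    }
    headers[name] = custom[name];
  }
  // Add the default only when the caller did not supply one,
  // so exactly one User-Agent header is ever sent.
  if (!hasCustomUserAgent) {
    headers["User-Agent"] = DEFAULT_USER_AGENT;
  }
  return headers;
}
```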
Xuan-Son Nguyen 2c1f199653
cli : fix reasoning responses in CLI (#18961)
* cli : fix reasoning responses in CLI

* fix build

* fix build (2)
2026-01-20 18:23:25 +01:00
Xuan-Son Nguyen 6df686bee6
server : refactor oai_parser_opt, move it to server_chat_params (#18937)
* server_chat_params

* move chat format into CLI

* use meta whenever possible

* clean up, no more chatml fallback
2026-01-19 23:28:01 +01:00
Lennart Austenfeld 18361c579c
server: fix memory reservations in populate_token_probs (#18787) 2026-01-19 19:13:31 +01:00