Aleksander Grygier
72e5d9ae2a
chore: update webui build output
2026-02-13 13:35:58 +01:00
Aleksander Grygier
dd1fe96e18
feat: Improve formatting performance time
2026-02-13 13:35:58 +01:00
Aleksander Grygier
eed0c5ae48
fix: System prompt sorting
2026-02-13 13:35:58 +01:00
Aleksander Grygier
16aa6fae0a
fix: Save draft message in Chat Form when adding System Prompt from new chat view
2026-02-13 13:33:06 +01:00
Aleksander Grygier
0fe25847ff
fix: Chat Form submission
2026-02-13 13:33:06 +01:00
Aleksander Grygier
ed70cb577d
chore: update webui build output
2026-02-13 13:33:05 +01:00
Aleksander Grygier
141540ccbb
feat: MCP Prompts WIP
2026-02-13 13:33:05 +01:00
Aleksander Grygier
46ced87178
chore: update webui build output
2026-02-13 13:32:47 +01:00
Aleksander Grygier
43da6b8676
feat: UI improvements
2026-02-13 13:32:47 +01:00
Aleksander Grygier
17b326b32a
chore: update webui build output
2026-02-13 13:30:16 +01:00
Aleksander Grygier
aaeea933b7
feat: Architectural improvements
2026-02-13 13:30:16 +01:00
Aleksander Grygier
da252e3425
feat: Per-conversation agentic loop state
2026-02-13 13:28:24 +01:00
Aleksander Grygier
1565cda1ff
chore: update webui build output
2026-02-13 13:28:24 +01:00
Aleksander Grygier
a8c2e66e92
feat: Improve MCP Server selection UI + lazy load health checks
2026-02-13 13:28:24 +01:00
Aleksander Grygier
f8d6d16df1
feat: UI improvements
2026-02-13 13:21:35 +01:00
Aleksander Grygier
690dd09b5f
feat: Simplify MCP server enabling logic per chat
...
Refactors MCP server enabling logic to remove the dependency on global settings.
This simplifies the logic by directly checking the per-chat override status, and removes the need to pass the global enabled state as a parameter.
Additionally:
- Only shows MCP servers that are enabled in settings in the selector.
- Sorts the servers by whether they are enabled for the current chat.
2026-02-13 13:21:35 +01:00
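A minimal sketch of the selection rule described in 690dd09b5f, assuming hypothetical type and function names (`McpServer`, `ChatOverrides`, `serversForSelector`); the actual webui code is not shown in this log and may differ. It checks only the per-chat override, lists only servers enabled in settings, and sorts servers enabled for the current chat first.

```typescript
// Hypothetical shapes, for illustration only.
interface McpServer {
  id: string;
  name: string;
  enabledInSettings: boolean;
}

// Per-chat overrides: server id -> enabled for this chat (assumed layout).
type ChatOverrides = Record<string, boolean>;

// A server is active for a chat only if the per-chat override says so.
// No global "enabled" flag is passed in.
function isEnabledForChat(serverId: string, overrides: ChatOverrides): boolean {
  return overrides[serverId] === true;
}

// Servers shown in the selector: enabled in settings, chat-enabled ones first.
function serversForSelector(servers: McpServer[], overrides: ChatOverrides): McpServer[] {
  return servers
    .filter((s) => s.enabledInSettings) // filter() copies, so the input is not mutated
    .sort(
      (a, b) =>
        Number(isEnabledForChat(b.id, overrides)) - Number(isEnabledForChat(a.id, overrides))
    );
}
```

For example, with servers `a` and `c` enabled in settings and only `c` overridden to enabled for the chat, the selector order would be `c`, then `a`.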
Aleksander Grygier
a12304cdea
chore: update webui build output
2026-02-13 13:21:35 +01:00
Aleksander Grygier
52f21b4ca4
fix: Missing onModelChange callback running assistant message re-generation
2026-02-13 13:21:35 +01:00
Pascal
20e5e70c61
chore: update webui build output
2026-02-13 13:21:35 +01:00
Pascal
a2cce59d69
fix: accurate tool_response display
2026-02-13 13:21:35 +01:00
Pascal
fdd67f45e6
fix: unify MCP server label logic with simplified fallback
2026-02-13 13:21:35 +01:00
Pascal
bdd9bcfb75
chore: update webui build output
2026-02-13 13:21:35 +01:00
Pascal
a515179730
refactor: remove multimodal validation from model selector
...
Remove all frontend validation logic that prevented users from selecting
models based on multimodal capabilities. This refactoring removes
restrictive UI code while maintaining full functionality
- Vision models can describe images as text
- That text remains useful for non-vision models
- Chaining vision -> non-vision is a valid workflow
- Users know their use case better than the UI
- Users can return to vision models when needed
2026-02-13 13:21:35 +01:00
Pascal
c7e76c65d1
chore: update webui build output
2026-02-13 13:21:35 +01:00
Pascal
37c084873c
fix: ignore assistant attachments (MCP) for modality detection
2026-02-13 13:21:35 +01:00
Pascal
d09cdfaf0a
chore: update webui build output
2026-02-13 13:21:35 +01:00
Pascal
6d41f74031
refactor: eliminate MCP circular dependency
...
- Change architecture from mcpStore <-> mcpClient to mcpClient -> mcpStore
- Remove bidirectional callback pattern (set*Callback, notify* methods)
- Add updateState/updateHealthCheck public methods in mcpStore
- Replace callback calls with direct mcpStore method calls
- Remove unused imports (browser, HealthCheckState) and constructor
- Fixes CI: ReferenceError Cannot access mcpClient before initialization
2026-02-13 13:21:35 +01:00
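A rough sketch of the one-way dependency described in 6d41f74031 (mcpClient -> mcpStore), under stated assumptions: the class names, the `ConnectionState` values, and the constructor wiring are hypothetical, and only the method name `updateState` comes from the commit message. The store knows nothing about the client; the client calls the store's public method directly instead of going through registered callbacks.

```typescript
// Hypothetical state values, for illustration only.
type ConnectionState = "idle" | "connecting" | "connected" | "error";

// The store has no reference to the client, so there is no import cycle.
class McpStoreSketch {
  private states = new Map<string, ConnectionState>();

  // Public method called directly by the client (replaces set*Callback / notify*).
  updateState(serverId: string, state: ConnectionState): void {
    this.states.set(serverId, state);
  }

  getState(serverId: string): ConnectionState {
    return this.states.get(serverId) ?? "idle";
  }
}

// The client depends on the store, never the other way around.
class McpClientSketch {
  private store: McpStoreSketch;

  constructor(store: McpStoreSketch) {
    this.store = store;
  }

  connect(serverId: string): void {
    this.store.updateState(serverId, "connecting");
    // Real transport setup would happen here; omitted in this sketch.
    this.store.updateState(serverId, "connected");
  }
}
```

Because the store never imports the client, module initialization order no longer matters for the store, which is the kind of cycle the "Cannot access mcpClient before initialization" error points to.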
Pascal
07ae189175
chore: update webui build output
2026-02-13 13:21:34 +01:00
Pascal
23741b3c6a
fix: strip reasoning content and UI proprietary tags from prompts
...
TODO: add toggle and ensure backend API compliance for reasoning format
2026-02-13 13:21:34 +01:00
Pascal
b5b527fa52
chore: update webui build output
2026-02-13 13:21:34 +01:00
Pascal
fb1ec29898
refactor: remove reasoning after first turn filter
2026-02-13 13:21:34 +01:00
Pascal
fc5d9f587f
refactor: inline reasoning with tags, remove fixed thinking field
2026-02-13 13:21:34 +01:00
Pascal
6b3bc23fc2
chore: update webui build output
2026-02-13 13:21:34 +01:00
Pascal
c73baed7e3
feat: resolve MCP attachment images via rehype plugin
...
The LLM can reference tool-generated images using markdown image links.
The plugin resolves attachment names to base64 from message.extra when present;
regular HTTP/data URLs pass through unchanged (no regression).
- rehypeResolveAttachmentImages plugin in markdown pipeline
- Pass message prop to MarkdownContent and AgenticContent
- Force processor reactivity on message.extra changes
- Filter assistant images from API context (display-only)
2026-02-13 13:21:34 +01:00
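The resolution rule from c73baed7e3 could look roughly like the following. This is a sketch only: `AttachmentExtra`, its field names, and `resolveImageSrc` are assumptions, and the real rehypeResolveAttachmentImages plugin works on the markdown syntax tree rather than on bare strings.

```typescript
// Hypothetical shape of an entry in message.extra.
interface AttachmentExtra {
  name: string;
  base64Url: string; // e.g. "data:image/png;base64,...."
}

// Resolve an image source: regular HTTP/data URLs pass through unchanged,
// attachment names are looked up in the message extras when present.
function resolveImageSrc(src: string, extras: AttachmentExtra[] | undefined): string {
  if (/^(https?:|data:)/i.test(src)) {
    return src;
  }
  const match = extras?.find((e) => e.name === src);
  // Unknown names are left as-is so rendering falls back to default behavior.
  return match ? match.base64Url : src;
}
```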
Pascal
09381a59fd
feat: persist base64 attachments from tool results
2026-02-13 13:21:34 +01:00
Pascal
f16457551e
webui: fix custom headers persistence in UI (derived)
2026-02-13 13:21:34 +01:00
Pascal
f42e5f114e
webui: fix custom headers persistence in UI
2026-02-13 13:21:34 +01:00
Aleksander Grygier
162bd976ed
fix: Word wrapping
2026-02-13 13:21:34 +01:00
Aleksander Grygier
c2dd1d2fed
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
008463149b
feat: UI improvements
2026-02-13 13:21:34 +01:00
Aleksander Grygier
1dba2ec4a9
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
805c171825
feat: UI improvement
2026-02-13 13:21:34 +01:00
Aleksander Grygier
d6455a7530
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
bb4bd7fe09
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
05dfb5e70c
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
cad9ca1208
feat: MCP Server Details
2026-02-13 13:21:34 +01:00
Aleksander Grygier
0e980bf881
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
825d2ea9a9
feat: MCP connection details WIP
2026-02-13 13:21:34 +01:00
Aleksander Grygier
2b37f70c37
refactor: MCP types and health check
2026-02-13 13:21:34 +01:00
Aleksander Grygier
36a37d1794
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
38ba6d8372
refactor: KeyValuePairs component
2026-02-13 13:21:34 +01:00
Aleksander Grygier
c5465d4893
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
57089370e4
refactor: DRY
2026-02-13 13:21:34 +01:00
Aleksander Grygier
f80d5f615e
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
e1da51335c
refactor: Architecture improvements
2026-02-13 13:21:34 +01:00
Aleksander Grygier
3bc8d93546
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
48b2b1b2f0
refactor: MCP state management + stores/clients relationship
2026-02-13 13:21:34 +01:00
Aleksander Grygier
2cd682178b
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
da8baaa9b8
fix: Distinguish streaming vs incomplete tool calls in UI
2026-02-13 13:21:34 +01:00
Aleksander Grygier
3179858e5f
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
9471729162
fix: Restore live reactive UI progress for tool calls
2026-02-13 13:21:34 +01:00
Aleksander Grygier
64923b20be
chore: update webui build output
2026-02-13 13:21:34 +01:00
Pascal
179477b4ed
fix: reset tool call state between turns
2026-02-13 13:21:34 +01:00
Pascal
38244a1bfa
webui: enable streaming of tool call arguments
2026-02-13 13:21:34 +01:00
Aleksander Grygier
2faf237d01
chore: update webui build output
2026-02-13 13:21:34 +01:00
Aleksander Grygier
5ffb6aba3a
refactor: Cleanup
2026-02-13 13:21:34 +01:00
Pascal
96e51e2a41
webui: prevent mobile dropdown immediate close on synthetic click
2026-02-13 13:20:42 +01:00
Pascal
8916698294
webui: fix redirect to root ignoring base path
2026-02-13 13:20:42 +01:00
Aleksander Grygier
2a33fc2059
refactor: Cleanup
2026-02-13 13:20:41 +01:00
Aleksander Grygier
04913f20d9
chore: update webui build output
2026-02-13 13:20:41 +01:00
Aleksander Grygier
939e7aa16b
refactor: Types
2026-02-13 13:20:41 +01:00
Aleksander Grygier
bef865d871
refactor: Componentize McpServerCard
2026-02-13 13:20:41 +01:00
Aleksander Grygier
7dbb05a160
refactor: Cleanup
2026-02-13 13:20:41 +01:00
Aleksander Grygier
7e194f653a
fix: Remove redundant CSS class
2026-02-13 13:20:41 +01:00
Aleksander Grygier
02c87fa3c9
feat: Add TruncatedText component
2026-02-13 13:20:41 +01:00
Aleksander Grygier
27b80ae3e8
fix: Collapsible box trigger
2026-02-13 13:20:26 +01:00
Aleksander Grygier
408e098324
refactor: Cleanup
2026-02-13 13:20:26 +01:00
Aleksander Grygier
0b36d04c38
refactor: Cleanup
2026-02-13 13:20:07 +01:00
Aleksander Grygier
df464c1f5a
refactor: Collapsible Content Block & small fixes
2026-02-13 13:18:20 +01:00
Aleksander Grygier
26044454ef
chore: update webui build output
2026-02-13 13:18:20 +01:00
Aleksander Grygier
f0ac6fa039
refactor: Cleanup
2026-02-13 13:18:20 +01:00
Aleksander Grygier
7c9ba36216
chore: update webui build output
2026-02-13 13:18:20 +01:00
Aleksander Grygier
7ab269cd77
feat: UI improvements
2026-02-13 13:18:20 +01:00
Aleksander Grygier
e0122465ed
feat: Always show Mcp Selector
2026-02-13 13:18:20 +01:00
Pascal
36c9ad9303
fix: remove double scrollbar in model selector by using Bits UI content available height
2026-02-13 13:18:20 +01:00
Aleksander Grygier
bc60beb1a7
feat: Enable adding System Prompt per-chat
2026-02-13 13:18:20 +01:00
Aleksander Grygier
276a3e9416
fix: UI
2026-02-13 13:17:51 +01:00
Aleksander Grygier
c74065de75
chore: update webui build output
2026-02-13 13:17:51 +01:00
Aleksander Grygier
e6ad864984
feat: UI improvements
2026-02-13 13:17:51 +01:00
Pascal
cff237cb3e
webui: raw tool result display, strip only leading/trailing newlines to preserve indentation
2026-02-13 13:17:33 +01:00
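One way to strip only leading and trailing newlines while keeping indentation, as cff237cb3e describes, is sketched below. The helper name `trimEdgeNewlines` is hypothetical and the commit's actual implementation is not shown here.

```typescript
// Remove newline characters at the very start and end of the text only.
// Spaces and tabs are untouched, so indentation on every line is preserved.
function trimEdgeNewlines(text: string): string {
  return text.replace(/^[\r\n]+|[\r\n]+$/g, "");
}
```

Unlike `String.prototype.trim()`, this leaves leading spaces on the first line in place, which matters for indented tool output.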
Pascal
afb79b2970
webui: split raw output into backend parsing and frontend display options
2026-02-13 13:17:33 +01:00
Pascal
18efdabb12
webui: remove legacy wrapper and restore WebSocket transport
2026-02-13 13:17:33 +01:00
Pascal
a13782a4d1
webui: remove unused imports
2026-02-13 13:17:33 +01:00
Aleksander Grygier
d548bf27dd
chore: update webui build output
2026-02-13 13:17:33 +01:00
Aleksander Grygier
bdd5958f6d
feat: Improve agentic tool call streaming display with 'in progress' state
2026-02-13 13:17:32 +01:00
Aleksander Grygier
a9c2ea7a8e
feat: Enhance MCP server dropdown with search, popularity sorting, and per-chat overrides
2026-02-13 13:17:32 +01:00
Aleksander Grygier
dfce09b34b
feat: Add per-chat MCP server overrides
2026-02-13 13:17:32 +01:00
Aleksander Grygier
54374edecd
chore: update webui build output
2026-02-13 13:17:32 +01:00
Aleksander Grygier
b763a4cc69
feat: Add image load error fallback in MarkdownContent
2026-02-13 13:17:32 +01:00
Aleksander Grygier
af9a76b6dc
feat: Implement lazy MCP client shutdown
2026-02-13 13:17:32 +01:00
Aleksander Grygier
c7870a3903
feat: Enhance tool call streaming UI and output format
2026-02-13 13:17:32 +01:00
Aleksander Grygier
fb5e464fe7
feat: Display and manage servers in ChatForm actions
2026-02-13 13:17:32 +01:00
Aleksander Grygier
dc7a3f33ba
feat: Integrate server management dialog into chat settings
2026-02-13 13:03:15 +01:00
Aleksander Grygier
0b13c95519
feat: Implement dedicated server management UI components
2026-02-13 13:03:15 +01:00
Aleksander Grygier
8df7e4a54f
refactor: Centralize health check logic in store
2026-02-13 13:03:15 +01:00
Aleksander Grygier
9a8cae462e
feat: Enhance server config with headers and schema normalization
2026-02-13 13:03:15 +01:00
Aleksander Grygier
bc2d879dea
feat: Add McpLogo Svelte component
2026-02-13 13:03:15 +01:00
Aleksander Grygier
42d52605d9
refactor: Consolidate UI CSS classes into shared module
2026-02-13 13:03:15 +01:00
Aleksander Grygier
6c95020b06
chore: update webui build output
2026-02-13 12:57:23 +01:00
Aleksander Grygier
62dbc9f654
feat: Raw LLM output switch per message
2026-02-13 12:57:23 +01:00
Aleksander Grygier
284425097b
refactor: Tool call handling
2026-02-13 12:57:03 +01:00
Aleksander Grygier
5beeb88a37
docs: Update high-level architecture diagrams for MCP integration
2026-02-13 12:55:42 +01:00
Aleksander Grygier
acdd30e3af
feat: Add AgenticContent component for enhanced tool call rendering
2026-02-13 12:55:42 +01:00
Aleksander Grygier
49a8c8b148
refactor: Update ChatStore to leverage mcpStore for agentic flow
2026-02-13 12:55:42 +01:00
Aleksander Grygier
5b582beb75
feat: Implement agentic orchestration within ChatService
2026-02-13 12:55:03 +01:00
Aleksander Grygier
391479edb2
feat: Introduce reactive mcpStore for client lifecycle management
2026-02-13 12:55:03 +01:00
Aleksander Grygier
7e184c174d
feat: Refactor MCP client to use official SDK
2026-02-13 12:55:03 +01:00
Aleksander Grygier
1a041a5b9b
feat: Add @modelcontextprotocol/sdk and zod dependencies
2026-02-13 12:55:03 +01:00
Aleksander Grygier
2325d2a50d
refactor: Update Agentic and MCP config parsing to use new utils and constants
2026-02-13 12:55:03 +01:00
Aleksander Grygier
0c24db3178
feat: Centralize MCP and Agentic type definitions and constants
2026-02-13 12:55:02 +01:00
Aleksander Grygier
26a19183b7
feat: Introduce common utility functions
2026-02-13 12:55:02 +01:00
Pascal
14f6728ef1
webui: use normalizedMessages after upstream refactor
2026-02-13 12:55:02 +01:00
Pascal
cb99ed9f71
webui: MCP client with low coupling to current codebase
2026-02-13 12:55:02 +01:00
Aleksander Grygier
5174d7206f
webui: UI and routing fixes ( #19586 )
...
* chore: update webui build output
* chore: update webui build output
* fix: Scroll issues in DropdownMenuSearchable
* webui: fix redirect to root ignoring base path
* fix: Word wrapping
* fix: remove obsolete modality UI tests causing CI failures
- Remove VisionModality/AudioModality test stories
- Remove mockServerProps usage and imports
- Simplify Default test (remove dropdown interaction checks)
- Simplify FileAttachments test (remove mocks)
* feat: Improve formatting performance time
---------
Co-authored-by: Pascal <admin@serveurperso.com>
2026-02-13 12:31:00 +01:00
Aleksander Grygier
4c61875bf8
webui: Add switcher to Chat Message UI to show raw LLM output ( #19571 )
2026-02-12 19:55:51 +01:00
Aleksander Grygier
4d688f9ebb
(webui) FEATURE: Enable adding or injecting System Message into chat ( #19556 )
...
* feat: Enable adding System Prompt per-chat
* fix: Save draft message in Chat Form when adding System Prompt from new chat view
* fix: Proper system message deletion logic
* chore: Formatting
* chore: update webui build output
2026-02-12 13:56:08 +01:00
Aleksander Grygier
f486ce9f30
(webui) REFACTOR: UI primitives and polish ( #19551 )
...
* webui: UI primitives and polish (non-MCP)
* chore: update webui build output
2026-02-12 12:21:00 +01:00
Aleksander Grygier
38adc7d469
WebUI Architecture Cleanup ( #19541 )
...
* webui: architecture foundation (non-MCP core refactors)
* chore: update webui build output
2026-02-12 11:22:27 +01:00
RichardScottOZ
fa16e517a3
server : fix typo in README.md for features list ( #19510 )
...
extra l for full
2026-02-12 08:56:25 +01:00
손희준
820ebfa6f4
Server: log when converting requests to chat completions format ( #19457 )
...
* Log converting requests
* Print as debug instead of info [no ci]
---------
Co-authored-by: openingnow <>
2026-02-09 16:22:57 +01:00
Sascha Rogmann
292f6908cd
spec : remove check rate ( #19377 )
...
* spec: remove parameter spec-ngram-check-rate
* spec : renamed statistics vars
* spec : add n_call_begin, n_call_accept
* spec : don't enable key-map-stats
2026-02-09 15:30:50 +02:00
Georgi Gerganov
eb449cdfa4
server : improve context checkpoint logic ( #19408 )
2026-02-08 09:40:04 +02:00
Georgi Gerganov
dfde5993ea
common : add common_speculative_is_compat() ( #19270 )
...
* llama : add llama_memory_can_rm_suffix()
* Revert "llama : add llama_memory_can_rm_suffix()"
This reverts commit d30e59b62a.
* spec : check if the target context is compatible for spec decoding
2026-02-06 16:47:22 +02:00
Matthieu Coudron
a3fa035822
server: print actual model name in 'model not found' error ( #19117 )
...
Experimenting with AI, my environment gets messy fast and it's not
always easy to know what model my software is trying to load. This helps
with troubleshooting.
Before:
Error: {
code = 400,
message = "model not found",
type = "invalid_request_error"
}
After:
Error: {
code = 400,
message = "model 'toto' not found",
type = "invalid_request_error"
}
2026-02-02 16:55:27 +01:00
Christian Kastner
7a4ca3cbd9
docs : Minor cleanups ( #19252 )
...
* Update old URLs to github.com/ggml-org/
* Bump copyrights
2026-02-02 08:38:55 +02:00
Georgi Gerganov
bbada8bfb9
server : wrap around the "id_slot" parameter ( #19207 )
...
* server : wrap around the "id_slot" parameter
* cont : minor
2026-01-30 19:46:10 +02:00
Georgi Gerganov
dabaa2e77a
spec : add ngram-mod ( #19164 )
...
* spec : add ngram-mod
* cont : simplify + keep track of occupancy
* cont : cleanup
* cont : move initialization to common/speculative
* cont : cleanup
* cont : cleanup
* cont : fix
2026-01-30 18:21:48 +02:00
Andrew Marshall
84b0a98319
webui: Update Svelte to fix effect_update_depth_exceeded errors ( #19144 )
...
The upstream fix is first available in 5.38.2, so constrain to at least
that version.
Rebuild pre-compiled webui index.html.gz based on these changes.
See also:
https://github.com/ggml-org/llama.cpp/issues/16347
https://github.com/huntabyte/bits-ui/issues/1687
https://github.com/sveltejs/svelte/issues/16548
2026-01-29 15:56:39 +01:00
Sascha Rogmann
72d3b1898a
spec : add self‑speculative decoding (no draft model required) + refactor ( #18471 )
...
* server: introduce self-speculative decoding
* server: moved self-call into speculative.cpp
* can_speculate() includes self-speculation
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* server: can_speculate() tests self-spec
* server: replace can_speculate() with slot.can_speculate()
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* common: use %zu format specifier for size_t in logging
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* server: can_speculate() requires a task instance
* common: ngram map, config self-speculative decoding
* common: add enum common_speculative_type
* common: add vector of speculative states
* common: add option --spec-draftless
* server: cleanup (remove slot.batch_spec, rename)
* common: moved self-spec impl to ngram-map
* common: cleanup (use common_speculative_state_draft)
* spec : refactor
* cont : naming
* spec: remove --spec-config
* doc: (draftless) speculative decoding
* common: print performance in spec decoding
* minor : cleanup
* common : better names
* minor : cleanup + fix build
* minor: comments
* CODEOWNERS: add common/ngram-map.* (#18471 )
* common : rename speculative.draftless_type -> speculative.type
* ngram-map : fix uninitialized values
* ngram-map : take into account the input can become shorter
* ngram-map : revert len check for now
* arg : change `--spec-draftless` -> `--spec-type`
* spec : add common_speculative_state::accept()
* spec : refactor + add common_speculative_begin()
* spec : fix begin() call with mtmd
* spec : additional refactor + remove common_speculative_params
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-28 19:42:42 +02:00
Georgi Gerganov
b931f81b5a
server : adjust spec tests to generate up to 16 tokens ( #19093 )
2026-01-28 09:11:40 +02:00
Daniel Bevenius
16639ba217
common : use two decimal places for float arg help messages ( #19048 )
...
* common : use two decimal places for float arg help messages
This commit updates the help messages for various command-line arguments
in arg.cpp to display floating-point default values with two decimal
places instead of one.
The motivation for this change is that currently only having one decimal
place means that values generated using --help or llama-gen-docs will not
display the correct values.
For example, currently the value of top-p in tools/server/README.md is
`0.9`, but the default value is actually `0.95`. And running
llama-gen-docs does not update this value as it uses the output from the
help message, which shows only one decimal place, so the values look
like they are unchanged.
* docs : run llama-gen-docs to update docs
2026-01-25 07:31:42 +01:00
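The precision problem behind 16639ba217 can be illustrated with an analogous TypeScript snippet. This is not the code in arg.cpp (which is C++); `formatDefault` is a hypothetical helper used only to show why a one-decimal rendering cannot carry a default such as 0.95.

```typescript
// Format a default value with a fixed number of decimal places.
// toFixed rounds the value to `digits` places and returns a string.
function formatDefault(value: number, digits: number): string {
  return value.toFixed(digits);
}

const topP = 0.95;

// One decimal place has no way to show the second digit, so the text
// no longer matches the real default (the commit reports `0.9` in the README).
const oneDecimal = formatDefault(topP, 1);

// Two decimal places round-trip the default exactly.
const twoDecimals = formatDefault(topP, 2);

console.log(`one decimal: ${oneDecimal}, two decimals: ${twoDecimals}`);
```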
Xuan-Son Nguyen
51fa458a92
server : support preserving reasoning_content in assistant message ( #18994 )
...
* support reasoning_content input
* report template caps to webui
* add docs
* rm commented code
2026-01-22 21:30:06 +01:00
Xuan-Son Nguyen
4e595b250a
server: do not log certain endpoints (avoid log spam) ( #19028 )
2026-01-22 19:24:37 +01:00
손희준
c6926d1d95
server: Reorder methods in `server-task.cpp` ( #19016 )
...
* Move `task_result_state::update_chat_msg` to match with header
* Move `server_task_result_cmpl_partial::to_json_anthropic()` to match with header
---------
Co-authored-by: openingnow <>
2026-01-22 14:36:04 +01:00
Hendrik Erz
3802d3c78f
fix: Use `tabular-nums` for chat message statistics ( #18915 )
...
* fix: Use `tabular-nums` for chat message statistics
* fix: Rebuild WebUI
2026-01-21 18:46:01 +01:00
손희준
fbbf3ad190
server: /v1/responses (partial) ( #18486 )
...
* from previous PR
* Make instructions (system) the first message
* Convert [input_message] (text/image/file)
* Rename convert_responses_to_chatcmpl(body) -> response_body
* Initial tool call support
* Erase instructions field from chatcmpl body
* Feed reasoning texts to chat template
* Use std::vector instead of opaque json array
* Make output_item.added events consistent
* Move `server_task_result_cmpl_partial::update` from header to source
* Match ID of output_item.added and .done events
* Add function_call only if there is no "fc_" prefix
* Add function call output at non-streaming API
* Test if ID is persistent
* Add doc
* Fix style - use trailing comma
* Rewrite state management
* catch up with upstream/master
* Fix style - "type" is the first item of SSE data
* Explicitly check "instructions" from response_body
* Make lambdas static
* Check if reasoning content exists
* Add `oai_resp_id` to task_result_state (also initialized at ctor), server_task_result_cmpl_partial, and server_task_result_cmpl_final
* Reject `input_file` since it is not supported by chatcmpl
* Add "fc_" prefix to non-streaming function call id as coderabbit pointed out
---------
Co-authored-by: openingnow <>
2026-01-21 17:47:23 +01:00
Adrien Gallouët
1c7cf94b22
common, server : use the same User-Agent by default ( #18957 )
...
This commit also ensures that if a custom User-Agent is used, it will be
the only one sent.
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-01-20 18:28:43 +01:00
Xuan-Son Nguyen
2c1f199653
cli : fix reasoning responses in CLI ( #18961 )
...
* cli : fix reasoning responses in CLI
* fix build
* fix build (2)
2026-01-20 18:23:25 +01:00
Xuan-Son Nguyen
6df686bee6
server : refactor oai_parser_opt, move it to server_chat_params ( #18937 )
...
* server_chat_params
* move chat format into CLI
* use meta whenever possible
* clean up, no more chatml fallback
2026-01-19 23:28:01 +01:00
Lennart Austenfeld
18361c579c
server: fix memory reservations in populate_token_probs ( #18787 )
2026-01-19 19:13:31 +01:00