Aleksander Grygier
db8ed5df9c
feat: Model unavailable UI state for model selector
2025-11-22 19:02:50 +01:00
Aleksander Grygier
076eec6d60
feat: Add copy to clipboard to model name in model info dialog
2025-11-22 19:00:05 +01:00
Aleksander Grygier
c274f132cb
refactor: Chat Form Submit component
2025-11-22 01:35:02 +01:00
Aleksander Grygier
92585c7173
feat: Attachments UX improvements
2025-11-21 21:23:20 +01:00
Aleksander Grygier
69503aa519
feat: Add auto-mic setting
2025-11-21 21:18:13 +01:00
Aleksander Grygier
6b7c0a5090
chore: update webui build output
2025-11-21 14:27:45 +01:00
Aleksander Grygier
8b1d96755e
feat: New Model Selection UX WIP
2025-11-21 14:26:50 +01:00
Aleksander Grygier
c26c3402fe
chore: update webui build output
2025-11-21 11:10:07 +01:00
Aleksander Grygier
049f40dfdf
refactor: Use only the message data `model` property for displaying model used info
2025-11-21 11:00:49 +01:00
Aleksander Grygier
45bf2a4983
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2
2025-11-21 09:25:17 +01:00
Aleksander Grygier
cc88f6a75b
chore: update webui build output
2025-11-21 00:08:09 +01:00
Aleksander Grygier
4bf82a10f1
feat: Improved UX for model information, modality interactions etc
2025-11-21 00:05:43 +01:00
Xuan Son Nguyen
cd5c699304
add docs (first version)
2025-11-20 21:45:05 +01:00
Xuan Son Nguyen
be25bccdff
address review comment
2025-11-20 21:37:22 +01:00
Xuan Son Nguyen
6929c9f43d
address thread safety issue
2025-11-20 18:38:02 +01:00
Xuan Son Nguyen
5369aaa1d6
address most problems
2025-11-20 18:34:22 +01:00
Aleksander Grygier
c35dee3bd7
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2
2025-11-20 16:36:45 +01:00
Aleksander Grygier
8a88576849
refactor: Architecture improvements
2025-11-20 16:34:25 +01:00
Xuan Son Nguyen
5805ca7960
add is_active()
2025-11-20 16:26:31 +01:00
Xuan Son Nguyen
d0ea9e0830
also allow terminate loading model
2025-11-20 16:20:14 +01:00
Xuan Son Nguyen
6610724f8e
fix unsafe pointer
2025-11-20 16:13:30 +01:00
Xuan Son Nguyen
b9ebdf616a
more stable
2025-11-20 15:49:40 +01:00
Aleksander Grygier
55d33a8b8c
feat: Model/Router server architecture WIP
2025-11-20 14:24:50 +01:00
Xuan Son Nguyen
919d3f8cbf
Merge branch 'master' into xsn/server_model_management_v1_2
2025-11-20 14:19:16 +01:00
Aleksander Grygier
4c91f2633f
Improved file naming & structure for UI components ( #17405 )
* refactor: Component files naming & structure
* chore: update webui build output
* refactor: Dialog titles + components naming
* chore: update webui build output
* refactor: Imports
* chore: update webui build output
2025-11-20 14:07:31 +01:00
Xuan Son Nguyen
7c6eb17fad
fix windows
2025-11-20 13:14:56 +01:00
Georgi Gerganov
196f5083ef
common : more accurate sampling timing ( #17382 )
* common : more accurate sampling timing
* eval-callback : minor fixes
* cont : add time_meas impl
* cont : fix log msg [no ci]
* cont : fix multiple definitions of time_meas
* llama-cli : exclude chat template init from time measurement
* cont : print percentage of unaccounted time
* cont : do not reset timings
2025-11-20 13:40:10 +02:00
Xuan Son Nguyen
0ef3b61e82
add test
2025-11-20 00:29:59 +01:00
Xuan Son Nguyen
5423d42a35
use subprocess.h, better logging
2025-11-20 00:05:29 +01:00
Xuan Son Nguyen
54b3545791
fix windows build
2025-11-19 22:30:47 +01:00
Xuan Son Nguyen
abc0ca478a
does this fix windows?
2025-11-19 22:24:00 +01:00
Xuan Son Nguyen
399f536dc7
fix compile error
2025-11-19 21:33:44 +01:00
Xuan Son Nguyen
fc5901a449
server: add model management and proxy
2025-11-19 21:23:00 +01:00
Aleksander Grygier
99c53d6558
webui: Add a "Continue" Action for Assistant Message ( #16971 )
* feat: Add "Continue" action for assistant messages
* feat: Continuation logic & prompt improvements
* chore: update webui build output
* feat: Improve logic for continuing the assistant message
* chore: update webui build output
* chore: Linting
* chore: update webui build output
* fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message
* chore: update webui build output
* feat: Enable "Continue" button based on config & non-reasoning model type
* chore: update webui build output
* chore: Update packages with `npm audit fix`
* fix: Remove redundant error
* chore: update webui build output
* chore: Update `.gitignore`
* fix: Add missing change
* feat: Add auto-resizing for Edit Assistant/User Message textareas
* chore: update webui build output
2025-11-19 14:39:50 +01:00
o7si
97cb3fd5ae
fix: resolve undefined variable 'svr' compilation error ( #17348 )
2025-11-18 10:10:47 +02:00
Xuan-Son Nguyen
0de8878c96
server: split HTTP into its own interface ( #17216 )
* server: split HTTP into its own interface
* move server-http and httplib to its own file
* add the remaining endpoints
* fix exception/error handling
* renaming
* missing header
* fix missing windows header
* fix error responses from http layer
* fix slot save/restore handler
* fix case where only one stream chunk is returned
* add NOMINMAX
* do not call sink.write on empty data
* use safe_json_to_str for SSE
* clean up
* add some comments
* improve usage of next()
* bring back the "server is listening on" message
* more generic handler
* add req.headers
* move the chat template print to init()
* add req.path
* cont : minor
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-17 22:05:44 +01:00
Georgi Gerganov
5b2093becc
server : handle context overflow during decode ( #17267 )
* server : handle context overflow during decode
* server : minor refactor
2025-11-16 09:23:37 +02:00
Aleksander Grygier
22e1ce2f81
webui: Fix clickability around chat processing statistics UI ( #17278 )
* fix: Better pointer events handling in chat processing info elements
* chore: update webui build output
2025-11-15 22:41:41 +01:00
Pascal
1411d9275a
webui: add OAI-Compat Harmony tool-call streaming visualization and persistence in chat UI ( #16618 )
* webui: add OAI-Compat Harmony tool-call live streaming visualization and persistence in chat UI
- Purely visual and diagnostic change, no effect on model context, prompt
construction, or inference behavior
- Captured assistant tool call payloads during streaming and non-streaming
completions, and persisted them in chat state and storage for downstream use
- Exposed parsed tool call labels beneath the assistant's model info line
with graceful fallback when parsing fails
- Added tool call badges beneath assistant responses that expose JSON tooltips
and copy their payloads when clicked, matching the existing model badge styling
- Added a user-facing setting to toggle tool call visibility to the Developer
settings section directly under the model selector option
* webui: remove scroll listener causing unnecessary layout updates (model selector)
* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* Update tools/server/webui/src/lib/components/app/chat/ChatMessages/ChatMessageAssistant.svelte
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* chore: npm run format & update webui build output
* chore: update webui build output
---------
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2025-11-15 21:09:32 +01:00
Ankur Verma
c7b7db0445
mtmd-cli: Avoid logging to stdout for model loading messages in mtmd-cli ( #17277 )
2025-11-15 12:41:16 +01:00
Xuan-Son Nguyen
9b17d74ab7
mtmd: add mtmd_log_set ( #17268 )
2025-11-14 15:56:19 +01:00
Georgi Gerganov
d396b43748
server : fix "can batch with" bug ( #17263 )
2025-11-14 14:03:45 +02:00
Aleksander Grygier
f1bad23f88
Better UX for handling multiple attachments in WebUI ( #17246 )
2025-11-14 01:19:08 +01:00
Xuan-Son Nguyen
c4abcb2457
server: fixing naming conflict res_error ( #17243 )
2025-11-13 20:53:47 +01:00
Aleksander Grygier
8e878f0cb4
Update packages + upgrade Storybook to v10 ( #17201 )
* chore: Update packages + upgrade Storybook to v10
* fix: Increase timeout for UI tests
2025-11-12 19:01:48 +01:00
Xuan-Son Nguyen
00c94083b3
server: (refactor) implement generator-based API for task results ( #17174 )
* server: (refactor) implement generator-based API for task results
* improve
* moving some code
* fix "Response ended prematurely"
* add sink.done before return false
* rm redundant check
* rm unused var
* rename generator --> reader
2025-11-12 18:50:52 +01:00
Xuan-Son Nguyen
ee8dd5c658
server: move res_error/res_ok to static function ( #17167 )
2025-11-12 14:17:24 +01:00
Adrien Gallouët
78010a0d52
cmake : move OpenSSL linking to vendor/cpp-httplib ( #17177 )
* cmake : move OpenSSL linking to vendor/cpp-httplib
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* bring back httplib 0.27.0
* add -DLLAMA_HTTPLIB
* update cmake config for visionos
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-12 12:32:50 +01:00
Xuan-Son Nguyen
1d45b4228f
vendor: split httplib to cpp/h files ( #17150 )
* vendor: split httplib to cpp/h files
* move defines
* include httplib if curl is not used
* add TODO
* fix build ios
* fix build visionos instead
2025-11-11 13:32:58 +01:00
Mike Abbott
4a5b8aff40
cmake : add version to all shared object files ( #17091 )
When compiling llama.cpp in Yocto, the build fails QA checks because the generated .so files aren't versioned. This applies a version to all generated .so files, allowing the package to build without errors.
2025-11-11 13:19:50 +02:00