Aleksander Grygier
48dbef1729
chore: update webui build output
2025-11-23 21:58:38 +01:00
Aleksander Grygier
b7ba13b6a0
refactor: Attachments data
2025-11-23 21:46:43 +01:00
Aleksander Grygier
1f0cb3ab26
feat: Use `model` property for displaying the `repo/model-name` naming format
2025-11-23 21:19:00 +01:00
Xuan Son Nguyen
d65be9170b
address review comments
2025-11-23 19:31:21 +01:00
Xuan Son Nguyen
5ad594e6d6
cleaner
2025-11-23 19:02:07 +01:00
Pascal
0c7220db56
webui: minor settings reorganization and add disable autoscroll option (#17452)
* webui: added a dedicated 'Display' settings section that groups visualization options
* webui: added a Display setting to toggle automatic chat scrolling
* chore: update webui build output
2025-11-23 18:42:00 +01:00
Xuan Son Nguyen
2e355c7f8e
oai-compat /models endpoint
2025-11-23 17:25:24 +01:00
Xuan Son Nguyen
f95f9c5128
typo docs
2025-11-23 16:14:02 +01:00
Xuan Son Nguyen
74685f4194
allow reusing args if auto_load
2025-11-23 15:42:33 +01:00
Xuan Son Nguyen
f927e21ffc
support extra_args on loading model
2025-11-23 15:39:03 +01:00
Xuan Son Nguyen
7ef6312f85
add note
2025-11-23 15:08:31 +01:00
Xuan Son Nguyen
f25bfaba4d
expose args and exit_code in API
2025-11-23 14:59:04 +01:00
Sigbjørn Skjæret
96ac5a2329
cuda : support non-contiguous i32 to i32 copy (#17326)
* support non-contiguous i32 to i32 copy
* add tests
* rename cpy_flt to cpy_scalar and reindent params
2025-11-23 11:13:34 +01:00
Eric Curtin
bc809e9c53
vulkan: Update docker image to Ubuntu 26.04 to enable glslc features (#17439)
26.04 provides these
Signed-off-by: Eric Curtin <eric.curtin@docker.com>
2025-11-23 10:29:36 +01:00
Jeff Bolz
54d83bbe85
vulkan: remove a couple unnecessary switches (#17419)
2025-11-23 06:29:40 +01:00
Aleksander Grygier
6282537a8b
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2
2025-11-22 23:35:05 +01:00
Adrien Gallouët
4949ac0f18
ci : switch to BoringSSL on Server workflow (#17441)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-22 21:38:19 +01:00
Aleksander Grygier
036cc939f8
chore: update webui build output
2025-11-22 19:37:43 +01:00
Aleksander Grygier
a39ef24c91
feat: Auto-select model from last assistant response
2025-11-22 19:18:32 +01:00
Aleksander Grygier
dc913ec424
feat: Chat Form Actions UI logic improvements
2025-11-22 19:06:17 +01:00
Aleksander Grygier
db8ed5df9c
feat: Model unavailable UI state for model selector
2025-11-22 19:02:50 +01:00
Aleksander Grygier
076eec6d60
feat: Add copy to clipboard to model name in model info dialog
2025-11-22 19:00:05 +01:00
Xuan Son Nguyen
4af1b6cbac
Merge remote-tracking branch 'webui/allozaur/server_model_management_v1_2' into xsn/server_model_maagement_v1_2
Co-authored-by: Aleksander <aleksander.grygier@gmail.com>
2025-11-22 18:39:31 +01:00
Xuan Son Nguyen
d32bbfec82
add endpoint docs
2025-11-22 18:01:48 +01:00
Xuan Son Nguyen
f2ca54b202
Merge branch 'master' into xsn/server_model_management_v1_2
2025-11-22 13:21:13 +01:00
Masato Nakasaka
3f3a4fb9c3
Revive MUL_MAT_ID to perf testing (#17397)
2025-11-22 10:55:43 +01:00
Aleksander Grygier
c274f132cb
refactor: Chat Form Submit component
2025-11-22 01:35:02 +01:00
yulo
028f93ef98
HIP: RDNA4 tensor core support for MMF (#17077)
* mmf for rdna4
* align the padding for rdna4
* forbid mul_mat_f for rdna4
* fix as comment
* remove device kernels
* add constexpr for early return
* update based on review comment
* change based on the review comment
* pass compile error
* keep code consistency
---------
Co-authored-by: zhang hui <you@example.com>
2025-11-22 00:03:24 +01:00
lhez
8e9ddba610
opencl: refine condition for kqv mm (#17392)
2025-11-21 14:34:48 -08:00
Xuan Son Nguyen
457fbdac2c
fix compile
2025-11-21 23:26:32 +01:00
Xuan Son Nguyen
525e2746df
address review comments
2025-11-21 23:25:34 +01:00
Xuan Son Nguyen
b0540e8e1e
add env for args
2025-11-21 23:06:49 +01:00
Xuan Son Nguyen
7241558835
better --models-dir
2025-11-21 23:06:09 +01:00
Xuan Son Nguyen
7cd929076d
remove default model path
2025-11-21 22:33:04 +01:00
Xuan Son Nguyen
62ee883d5a
implement LRU
2025-11-21 22:22:57 +01:00
Aleksander Grygier
92585c7173
feat: Attachments UX improvements
2025-11-21 21:23:20 +01:00
Aleksander Grygier
69503aa519
feat: Add auto-mic setting
2025-11-21 21:18:13 +01:00
ubergarm
23bc779a6e
model : detect GigaChat3-10B-A1.8B as deepseek lite (#17420)
* Detect GigaChat3-10B-A1.8B as deepseek lite
Hardcodes checking number of layers to detect if lite version of deepseek.
* Add comment identifying deepseek lite variants
deepseek lite variants include DeepSeek-V2-Lite, GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
Aleksander Grygier
6b7c0a5090
chore: update webui build output
2025-11-21 14:27:45 +01:00
Aleksander Grygier
8b1d96755e
feat: New Model Selection UX WIP
2025-11-21 14:26:50 +01:00
Adrien Gallouët
28175f857d
cmake : add option to build and link BoringSSL (#17205)
* cmake: add option to build and link BoringSSL
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* cmake : fix typo
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* cmake : disable boringssl test and asm by default
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* cmake : skip bssl
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* cmake : disable fips
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* cmake : fix cmake --install
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* ci : use boringssl for windows and mac
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-21 11:46:45 +01:00
Adrien Gallouët
9cc4080441
ci : start using OpenSSL (#17235)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-21 11:45:00 +01:00
Xuan Son Nguyen
032b9ff4a9
add --models-dir param
2025-11-21 11:11:01 +01:00
Aleksander Grygier
c26c3402fe
chore: update webui build output
2025-11-21 11:10:07 +01:00
Aleksander Grygier
049f40dfdf
refactor: Use only the message data `model` property for displaying model used info
2025-11-21 11:00:49 +01:00
Jeff Bolz
f1ffbba68e
vulkan: disable async for older Intel devices (#17369)
* vulkan: disable async for older Intel devices
* update detection logic
* use name string for detection
2025-11-21 09:58:17 +01:00
Aleksander Grygier
45bf2a4983
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2
2025-11-21 09:25:17 +01:00
Raul Torres
2370665e56
CANN: Refactor `evaluate_and_capture_cann_graph` (#17333)
* CANN: Refactor `evaluate_and_capture_cann_graph`
**Description of the problem**
* `matched_graph` is obtained even if graph mode is disabled.
* End of graph capture and graph replay are unnecessarily placed in different `if` blocks.
**Proposed solution**
* Obtain `matched_graph` only if graph mode is enabled.
* Place end of graph capture and graph replay inside the same `if` block.
* Unify graph related comments.
* Remove trailing whitespace
2025-11-21 16:23:29 +08:00
nullname
21d31e0810
ggml-hexagon: fix swiglu failure at `test-backend-ops` (#17344)
* refactor: use hvx_vec_exp_fp32_guard_inf for overflow handling in hvx_exp_f32
* feat: add fast sigmoid function with overflow guard for fp32
* refactor: replace hvx_vec_inverse_fp32 with hvx_vec_inverse_fp32_guard_inf for improved overflow handling
* feat: enhance hvx_add_scalar_f32 with overflow handling using infinity guard
* wip
* add HVX_Vector_Alias
wip
* wip
* fix: improve handling of src1 tensor in glu_swiglu_fp32_per_thread function
* fix nc
* wip
* wip
* handle nan at inverse
* wip
* fix neg
* wip
* rename
* fix hvx_vec_inverse_fp32_guard_inf to handle infinity and NaN cases correctly
* wip
* fix hvx_vec_inverse_fp32_guard_inf to handle NaN cases correctly
* wip
* wip
* wip
* fix output sign
2025-11-20 15:45:05 -08:00
Aleksander Grygier
cc88f6a75b
chore: update webui build output
2025-11-21 00:08:09 +01:00