Commit Graph

7243 Commits

Author SHA1 Message Date
Aleksander Grygier 48dbef1729 chore: update webui build output 2025-11-23 21:58:38 +01:00
Aleksander Grygier b7ba13b6a0 refactor: Attachments data 2025-11-23 21:46:43 +01:00
Aleksander Grygier 1f0cb3ab26 feat: Use `model` property for displaying the `repo/model-name` naming format 2025-11-23 21:19:00 +01:00
Xuan Son Nguyen d65be9170b address review comments 2025-11-23 19:31:21 +01:00
Xuan Son Nguyen 5ad594e6d6 cleaner 2025-11-23 19:02:07 +01:00
Pascal 0c7220db56
webui: minor settings reorganization and add disable autoscroll option (#17452)
* webui: added a dedicated 'Display' settings section that groups visualization options

* webui: added a Display setting to toggle automatic chat scrolling

* chore: update webui build output
2025-11-23 18:42:00 +01:00
Xuan Son Nguyen 2e355c7f8e oai-compat /models endpoint 2025-11-23 17:25:24 +01:00
Xuan Son Nguyen f95f9c5128 typo docs 2025-11-23 16:14:02 +01:00
Xuan Son Nguyen 74685f4194 allow reusing args if auto_load 2025-11-23 15:42:33 +01:00
Xuan Son Nguyen f927e21ffc support extra_args on loading model 2025-11-23 15:39:03 +01:00
Xuan Son Nguyen 7ef6312f85 add note 2025-11-23 15:08:31 +01:00
Xuan Son Nguyen f25bfaba4d expose args and exit_code in API 2025-11-23 14:59:04 +01:00
Sigbjørn Skjæret 96ac5a2329
cuda : support non-contiguous i32 to i32 copy (#17326)
* support non-contiguous i32 to i32 copy

* add tests

* rename cpy_flt to cpy_scalar and reindent params
2025-11-23 11:13:34 +01:00
Eric Curtin bc809e9c53
vulkan: Update docker image to Ubuntu 26.04 to enable glslc features (#17439)
26.04 provides these

Signed-off-by: Eric Curtin <eric.curtin@docker.com>
2025-11-23 10:29:36 +01:00
Jeff Bolz 54d83bbe85
vulkan: remove a couple unnecessary switches (#17419) 2025-11-23 06:29:40 +01:00
Aleksander Grygier 6282537a8b Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2 2025-11-22 23:35:05 +01:00
Adrien Gallouët 4949ac0f18
ci : switch to BoringSSL on Server workflow (#17441)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-22 21:38:19 +01:00
Aleksander Grygier 036cc939f8 chore: update webui build output 2025-11-22 19:37:43 +01:00
Aleksander Grygier a39ef24c91 feat: Auto-select model from last assistant response 2025-11-22 19:18:32 +01:00
Aleksander Grygier dc913ec424 feat: Chat Form Actions UI logic improvements 2025-11-22 19:06:17 +01:00
Aleksander Grygier db8ed5df9c feat: Model unavailable UI state for model selector 2025-11-22 19:02:50 +01:00
Aleksander Grygier 076eec6d60 feat: Add copy to clipboard to model name in model info dialog 2025-11-22 19:00:05 +01:00
Xuan Son Nguyen 4af1b6cbac Merge remote-tracking branch 'webui/allozaur/server_model_management_v1_2' into xsn/server_model_maagement_v1_2
Co-authored-by: Aleksander <aleksander.grygier@gmail.com>
2025-11-22 18:39:31 +01:00
Xuan Son Nguyen d32bbfec82 add endpoint docs 2025-11-22 18:01:48 +01:00
Xuan Son Nguyen f2ca54b202 Merge branch 'master' into xsn/server_model_management_v1_2 2025-11-22 13:21:13 +01:00
Masato Nakasaka 3f3a4fb9c3
Revive MUL_MAT_ID to perf testing (#17397) 2025-11-22 10:55:43 +01:00
Aleksander Grygier c274f132cb refactor: Chat Form Submit component 2025-11-22 01:35:02 +01:00
yulo 028f93ef98
HIP: RDNA4 tensor core support for MMF (#17077)
* mmf for rdna4

* align the padding for rdna4

* forbid mul_mat_f for rdna4

* fix as comment

* remove device kernels

* add constexpr for early return

* update based on review comment

* change based on the review comment

* pass compile error

* keep code consistency

---------

Co-authored-by: zhang hui <you@example.com>
2025-11-22 00:03:24 +01:00
lhez 8e9ddba610
opencl: refine condition for kqv mm (#17392) 2025-11-21 14:34:48 -08:00
Xuan Son Nguyen 457fbdac2c fix compile 2025-11-21 23:26:32 +01:00
Xuan Son Nguyen 525e2746df address review comments 2025-11-21 23:25:34 +01:00
Xuan Son Nguyen b0540e8e1e add env for args 2025-11-21 23:06:49 +01:00
Xuan Son Nguyen 7241558835 better --models-dir 2025-11-21 23:06:09 +01:00
Xuan Son Nguyen 7cd929076d remove default model path 2025-11-21 22:33:04 +01:00
Xuan Son Nguyen 62ee883d5a implement LRU 2025-11-21 22:22:57 +01:00
Aleksander Grygier 92585c7173 feat: Attachments UX improvements 2025-11-21 21:23:20 +01:00
Aleksander Grygier 69503aa519 feat: Add auto-mic setting 2025-11-21 21:18:13 +01:00
ubergarm 23bc779a6e
model : detect GigaChat3-10B-A1.8B as deepseek lite (#17420)
* Detect GigaChat3-10B-A1.8B as deepseek lite

Hardcodes checking number of layers to detect if lite version of deepseek.

* Add comment identifying deepseek lite variants

deepseek lite variants include DeepSeek-V2-Lite, GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
Aleksander Grygier 6b7c0a5090 chore: update webui build output 2025-11-21 14:27:45 +01:00
Aleksander Grygier 8b1d96755e feat: New Model Selection UX WIP 2025-11-21 14:26:50 +01:00
Adrien Gallouët 28175f857d
cmake : add option to build and link BoringSSL (#17205)
* cmake: add option to build and link BoringSSL

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : fix typo

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : disable boringssl test and asm by default

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : skip bssl

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : disable fips

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : fix cmake --install

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* ci : use boringssl for windows and mac

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-21 11:46:45 +01:00
Adrien Gallouët 9cc4080441
ci : start using OpenSSL (#17235)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-21 11:45:00 +01:00
Xuan Son Nguyen 032b9ff4a9 add --models-dir param 2025-11-21 11:11:01 +01:00
Aleksander Grygier c26c3402fe chore: update webui build output 2025-11-21 11:10:07 +01:00
Aleksander Grygier 049f40dfdf refactor: Use only the message data `model` property for displaying model used info 2025-11-21 11:00:49 +01:00
Jeff Bolz f1ffbba68e
vulkan: disable async for older Intel devices (#17369)
* vulkan: disable async for older Intel devices

* update detection logic

* use name string for detection
2025-11-21 09:58:17 +01:00
Aleksander Grygier 45bf2a4983 Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2 2025-11-21 09:25:17 +01:00
Raul Torres 2370665e56
CANN: Refactor `evaluate_and_capture_cann_graph` (#17333)
* CANN: Refactor `evaluate_and_capture_cann_graph`

**Description of the problem**

* `matched_graph` is obtained even if graph mode is disabled.
* End of graph capture and graph replay are unnecessarily placed in different `if` blocks.

**Proposed solution**

* Obtain `matched_graph` only if graph mode is enabled.
* Place end of graph capture and graph replay inside the same `if` block.
* Unify graph related comments.

* Remove trailing whitespace
2025-11-21 16:23:29 +08:00
nullname 21d31e0810
ggml-hexagon: fix swiglu failure at `test-backend-ops` (#17344)
* refactor: use hvx_vec_exp_fp32_guard_inf for overflow handling in hvx_exp_f32

* feat: add fast sigmoid function with overflow guard for fp32

* refactor: replace hvx_vec_inverse_fp32 with hvx_vec_inverse_fp32_guard_inf for improved overflow handling

* feat: enhance hvx_add_scalar_f32 with overflow handling using infinity guard

* wip

* add HVX_Vector_Alias

wip

* wip

* fix: improve handling of src1 tensor in glu_swiglu_fp32_per_thread function

* fix nc

* wip

* wip

* handle nan at inverse

* wip

* fix neg

* wip

* rename

* fix hvx_vec_inverse_fp32_guard_inf to handle infinity and NaN cases correctly

* wip

* fix hvx_vec_inverse_fp32_guard_inf to handle NaN cases correctly

* wip

* wip

* wip

* fix output sign
2025-11-20 15:45:05 -08:00
Aleksander Grygier cc88f6a75b chore: update webui build output 2025-11-21 00:08:09 +01:00