Commit Graph

7284 Commits

Author SHA1 Message Date
Xuan Son Nguyen f927e21ffc support extra_args on loading model 2025-11-23 15:39:03 +01:00
Xuan Son Nguyen 7ef6312f85 add note 2025-11-23 15:08:31 +01:00
Xuan Son Nguyen f25bfaba4d expose args and exit_code in API 2025-11-23 14:59:04 +01:00
Sigbjørn Skjæret 96ac5a2329
cuda : support non-contiguous i32 to i32 copy (#17326)
* support non-contiguous i32 to i32 copy

* add tests

* rename cpy_flt to cpy_scalar and reindent params
2025-11-23 11:13:34 +01:00
Eric Curtin bc809e9c53
vulkan: Update docker image to Ubuntu 26.04 to enable glslc features (#17439)
26.04 provides these

Signed-off-by: Eric Curtin <eric.curtin@docker.com>
2025-11-23 10:29:36 +01:00
Jeff Bolz 54d83bbe85
vulkan: remove a couple unnecessary switches (#17419) 2025-11-23 06:29:40 +01:00
Aleksander Grygier 6282537a8b Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2 2025-11-22 23:35:05 +01:00
Adrien Gallouët 4949ac0f18
ci : switch to BoringSSL on Server workflow (#17441)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-22 21:38:19 +01:00
Aleksander Grygier 036cc939f8 chore: update webui build output 2025-11-22 19:37:43 +01:00
Aleksander Grygier a39ef24c91 feat: Auto-select model from last assistant response 2025-11-22 19:18:32 +01:00
Aleksander Grygier dc913ec424 feat: Chat Form Actions UI logic improvements 2025-11-22 19:06:17 +01:00
Aleksander Grygier db8ed5df9c feat: Model unavailable UI state for model selector 2025-11-22 19:02:50 +01:00
Aleksander Grygier 076eec6d60 feat: Add copy to clipboard to model name in model info dialog 2025-11-22 19:00:05 +01:00
Xuan Son Nguyen 4af1b6cbac Merge remote-tracking branch 'webui/allozaur/server_model_management_v1_2' into xsn/server_model_maagement_v1_2
Co-authored-by: Aleksander <aleksander.grygier@gmail.com>
2025-11-22 18:39:31 +01:00
Xuan Son Nguyen d32bbfec82 ad endpoint docs 2025-11-22 18:01:48 +01:00
Xuan Son Nguyen f2ca54b202 Merge branch 'master' into xsn/server_model_management_v1_2 2025-11-22 13:21:13 +01:00
Masato Nakasaka 3f3a4fb9c3
Revive MUL_MAT_ID to perf testing (#17397) 2025-11-22 10:55:43 +01:00
Aleksander Grygier c274f132cb refactor: Chat Form Submit component 2025-11-22 01:35:02 +01:00
yulo 028f93ef98
HIP: RDNA4 tensor core support for MMF (#17077)
* mmf for rdna4

* align the padding for rdna4

* forbit mul_mat_f for rdna4

* fix as comment

* remove device kernels

* add constexpr for early return

* update based on review comment

* change based on the review comment

* pass compile error

* keep code consistency

---------

Co-authored-by: zhang hui <you@example.com>
2025-11-22 00:03:24 +01:00
lhez 8e9ddba610
opencl: refine condition for kqv mm (#17392) 2025-11-21 14:34:48 -08:00
Xuan Son Nguyen 457fbdac2c fix compile 2025-11-21 23:26:32 +01:00
Xuan Son Nguyen 525e2746df address review comments 2025-11-21 23:25:34 +01:00
Xuan Son Nguyen b0540e8e1e add env for args 2025-11-21 23:06:49 +01:00
Xuan Son Nguyen 7241558835 better --models-dir 2025-11-21 23:06:09 +01:00
Xuan Son Nguyen 7cd929076d remove default model path 2025-11-21 22:33:04 +01:00
Xuan Son Nguyen 62ee883d5a implement LRU 2025-11-21 22:22:57 +01:00
Aleksander Grygier 92585c7173 feat: Attachments UX improvements 2025-11-21 21:23:20 +01:00
Aleksander Grygier 69503aa519 feat: Add auto-mic setting 2025-11-21 21:18:13 +01:00
ubergarm 23bc779a6e
model : detect GigaChat3-10-A1.8B as deepseek lite (#17420)
* Detect GigaChat3-10-A1.8B as deepseek lite

Hardcodes checking number of layers to detect if lite version of deepseek.

* Add commnent identifying deepseek lite variants

deepseek lite variants include DeepSeek-V2-Lite, GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
Aleksander Grygier 6b7c0a5090 chore: update webui build output 2025-11-21 14:27:45 +01:00
Aleksander Grygier 8b1d96755e feat: New Model Selection UX WIP 2025-11-21 14:26:50 +01:00
Adrien Gallouët 28175f857d
cmake : add option to build and link BoringSSL (#17205)
* cmake: add option to build and link BoringSSL

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : fix typo

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : disable boringssl test and asm by default

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : skip bssl

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : disable fips

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* cmake : fix cmake --install

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* ci : use boringssl for windows and mac

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-21 11:46:45 +01:00
Adrien Gallouët 9cc4080441
ci : start using OpenSSL (#17235)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2025-11-21 11:45:00 +01:00
Xuan Son Nguyen 032b9ff4a9 add --models-dir param 2025-11-21 11:11:01 +01:00
Aleksander Grygier c26c3402fe chore: update webui build output 2025-11-21 11:10:07 +01:00
Aleksander Grygier 049f40dfdf refactor: Use only the message data `model` property for displaying model used info 2025-11-21 11:00:49 +01:00
Jeff Bolz f1ffbba68e
vulkan: disable async for older Intel devices (#17369)
* vulkan: disable async for older Intel devices

* update detection logic

* use name string for detection
2025-11-21 09:58:17 +01:00
Aleksander Grygier 45bf2a4983 Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2 2025-11-21 09:25:17 +01:00
Raul Torres 2370665e56
CANN: Refactor `evaluate_and_capture_cann_graph` (#17333)
* CANN: Refactor `evaluate_and_capture_cann_graph`

**Description of the problem**

* `matched_graph` is obtained even if graph mode is disabled.
* End of graph capture and graph replay are unnecessarily placed in different `if` blocks.

**Proposed solution**

* Obtain `matched_graph` only if graph mode is enabled.
* Place end of graph capture and graph reply inside the same `if` block.
* Unify graph related comments.

* Remove trailing whitespace
2025-11-21 16:23:29 +08:00
nullname 21d31e0810
ggml-hexagon: fix swiglu failure at `test-backend-ops` (#17344)
* refactor: use hvx_vec_exp_fp32_guard_inf for overflow handling in hvx_exp_f32

* feat: add fast sigmoid function with overflow guard for fp32

* refactor: replace hvx_vec_inverse_fp32 with hvx_vec_inverse_fp32_guard_inf for improved overflow handling

* feat: enhance hvx_add_scalar_f32 with overflow handling using infinity guard

* wip

* add HVX_Vector_Alias

wip

* wip

* fix: improve handling of src1 tensor in glu_swiglu_fp32_per_thread function

* fix nc

* wip

* wip

* handle nan at inverse

* wip

* fix neg

* wip

* rename

* fix hvx_vec_inverse_fp32_guard_inf to handle infinity and NaN cases correctly

* wip

* fix hvx_vec_inverse_fp32_guard_inf to handle NaN cases correctly

* wip

* wip

* wip

* fix output sign
2025-11-20 15:45:05 -08:00
Aleksander Grygier cc88f6a75b chore: update webui build output 2025-11-21 00:08:09 +01:00
Aleksander Grygier 4bf82a10f1 feat: Improved UX for model information, modality interactions etc 2025-11-21 00:05:43 +01:00
Xuan Son Nguyen a2e912cf35 address review comment 2025-11-20 21:54:22 +01:00
Xuan Son Nguyen cd5c699304 add docs (first version) 2025-11-20 21:45:05 +01:00
Xuan Son Nguyen be25bccdff address review comment 2025-11-20 21:37:22 +01:00
Daniel Han dd0f321941
readme : add Unsloth exporting to GGUF in tools (#17411) 2025-11-20 20:07:36 +01:00
Xuan Son Nguyen 6929c9f43d address thread safety issue 2025-11-20 18:38:02 +01:00
Xuan-Son Nguyen 054a45c3d3
grammar: fix regression caused by #17381 (#17412)
* grammar: fix regression caused by #17381

* more readable
2025-11-20 18:35:10 +01:00
Xuan Son Nguyen 5369aaa1d6 address most problems 2025-11-20 18:34:22 +01:00
Xuan Son Nguyen 216140867e tmp apply upstream fix 2025-11-20 18:19:21 +01:00