Commit Graph

7334 Commits

Author SHA1 Message Date
Aleksander Grygier acd3c58152 refactor: Remove redundant method 2025-11-29 23:18:24 +01:00
Aleksander Grygier 360a5ed62b test: Move demo test to tests/server 2025-11-29 23:17:34 +01:00
Aleksander Grygier 6fd720e742 Merge remote-tracking branch 'origin/allozaur/server_model_management_v1_2' into allozaur/server_model_management_v1_2 2025-11-29 21:59:33 +01:00
Aleksander Grygier ae8a1e8137 refactor: Tests to separate location 2025-11-29 21:44:57 +01:00
Aleksander Grygier 949b5fd63e refactor: Tooltip Provider from core layout 2025-11-29 21:41:36 +01:00
Aleksander Grygier 4f39da823f test: Update Chat Form UI tests 2025-11-29 20:13:11 +01:00
Aleksander Grygier 33b9cc40a1 Merge branch 'master' into allozaur/server_model_management_v1_2 2025-11-29 19:40:46 +01:00
Aleksander Grygier a568e74c20 chore: update webui build output 2025-11-29 02:40:09 +01:00
Aleksander Grygier 2d556bb93c test: Fix Storybook mocks 2025-11-29 02:36:41 +01:00
Aleksander Grygier 493ef08723 refactor: Utils imports + move types to `app.d.ts` 2025-11-29 02:33:37 +01:00
Aleksander Grygier ce9c9afe0d chore: update webui build output 2025-11-29 01:40:00 +01:00
Aleksander Grygier 2464e06028 feat: Improve UI sidebar background color 2025-11-29 01:39:40 +01:00
Aleksander Grygier 27b152267f refactor: Constants 2025-11-29 01:38:02 +01:00
Aleksander Grygier 648d2deebc feat: Attachment logic & UI improvements 2025-11-29 01:36:05 +01:00
Aleksander Grygier d49d97c642 refactor: Cleanup 2025-11-29 00:51:18 +01:00
Aleksander Grygier f50ce7b5b4 refactor: Cleanup 2025-11-29 00:50:16 +01:00
Aleksander Grygier 4d16459b4c re 2025-11-29 00:49:46 +01:00
Aleksander Grygier c76de5e0ad refactor: Cleanup 2025-11-29 00:49:20 +01:00
Aleksander Grygier 2f97dbfa65 docs: Add info comment 2025-11-29 00:49:03 +01:00
Aleksander Grygier 1adf173dd6 refactor: Cleanup 2025-11-28 19:36:03 +01:00
Aleksander Grygier dd30810d0a fix: Modality detection improvement for text-based PDF attachments 2025-11-28 19:30:32 +01:00
Aleksander Grygier 171a0926a1 chore: update webui build output 2025-11-28 16:00:44 +01:00
Aleksander Grygier 68b653ef45 refactor: DRY `getAttachmentDisplayItems` function + fix UI 2025-11-28 15:58:52 +01:00
Aleksander Grygier 1cf5daa8c0 refactor: Cleanup 2025-11-28 15:56:41 +01:00
Aleksander Grygier 04ef4a06e2 chore: update webui build output 2025-11-28 15:44:43 +01:00
Aleksander Grygier 5fadd0fe18 refactor: Components naming 2025-11-28 15:39:47 +01:00
Aleksander Grygier 3470b12b76 chore: update webui build output 2025-11-28 15:09:55 +01:00
Aleksander Grygier eed1bd9b97 refactor: Enhance model info and attachment handling 2025-11-28 15:08:41 +01:00
Aleksander Grygier 491fe2d3f7 feat: Update logic for PDF as Image 2025-11-28 13:10:00 +01:00
Johannes Gäßler 73955f7d2a CUDA: no FP16 arithmetic for vector FA kernel (#17558) 2025-11-28 10:29:09 +01:00
Jeff Bolz 35cf8887e1
vulkan: Implement GGML_OP_TRI (#17503)
* vulkan: Implement GGML_OP_TRI

* check types match
2025-11-28 10:07:29 +01:00
Radoslav Gerganov 15d2b46b4d
rpc : cache and reuse compute graphs (#15405)
Store the last computed graph and reuse it when possible.
Also do not return response from GRAPH_COMPUTE and assume it always
completes successfully. If this is not the case, the server closes
the connection. This saves us a network round trip to the server.
2025-11-28 08:33:51 +00:00
yulo 6bca76ff5e
HIP: enable mul_mat_f for RDNA4 (#17437)
* enable mmf for rdna4

* move some mmvf to mmf

* revert lds128 for wmma loading

* Revert "revert lds128 for wmma loading"

This reverts commit db9ae8b6b4.

* Revert "enable mmf for rdna4"

This reverts commit 698c9f2418.

* Revert "move some mmvf to mmf"

This reverts commit 99b92bd665.

* enable mul_mat for rdna4

---------

Co-authored-by: zhang hui <you@example.com>
2025-11-28 08:24:30 +01:00
Piotr Wilkin (ilintar) cd0e3a7a3b SOLVE_TRI CUDA kernel for small matrices (#17457) 2025-11-28 12:15:32 +08:00
Neo Zhang Jianyu efaaccdd69
refactor pad_reflect_1d to make the UT case pass (#17204)
Co-authored-by: Zhang Jianyu <zhang.jianyu@outlook.com>
2025-11-28 08:50:56 +08:00
Aleksander Grygier bc577266b9 docs: Architecture documentation 2025-11-27 22:04:20 +01:00
Aleksander Grygier db479523ec feat: Condition available models based on modality + better model loading strategy & UX 2025-11-27 19:13:05 +01:00
Jeff Bolz 4abef75f2c
vulkan: Implement SOLVE_TRI (#17486)
* vulkan: Implement SOLVE_TRI

* load B matrix through shared memory

* use FLOAT_TYPE
2025-11-27 15:48:00 +01:00
Georgi Gerganov c386114922 arch : add description about LLM_TENSOR_INFOS (#17550) 2025-11-27 16:34:13 +02:00
Georgi Gerganov 6783b11fb0 models : fix LFM2 tensors (#17548) 2025-11-27 16:04:29 +02:00
Aleksander Grygier 9086bc30bd feat: Improve statistic badges 2025-11-27 14:12:21 +01:00
Aleksander Grygier d73353732f refactor: Architecture cleanup 2025-11-27 14:03:25 +01:00
Aleksander Grygier 78ead49830 Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2 2025-11-27 13:48:21 +01:00
Aleksander Grygier 6a3d6e79d2 refactor: Services/Stores syntax + logic improvements
Refactors components to access stores directly instead of using exported getter functions.

This change centralizes store access and logic, simplifying component code and improving maintainability by reducing the number of exported functions and promoting direct store interaction.

Removes exported getter functions from `chat.svelte.ts`, `conversations.svelte.ts`, `models.svelte.ts` and `settings.svelte.ts`.
2025-11-27 13:44:49 +01:00
matt23654 909072abcf cuda : fix UMA detection on discrete GPUs. (#17537) 2025-11-27 13:35:35 +02:00
Alberto Cabrera Pérez cd8370b408
ggml-cpu: aarm64: q4_K repack gemm and gemv implementations (dotprod only) (#17494)
* Enabled q4_K_4x8 path

* Fixed generic Q4_K 8x4 implementation

* wip: dotprod gemm

* Working arm q4_K dotprod gemm

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Undo acc rename

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Q4_K arm dotprod gemm

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Fix: q4_qs reinterpret from uint to int

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Removed comments

* Fixed macro guards

* Fixed unused vars in generic implementation

* Fixed unused vars in 8x4 repack

* Fixed unused vars in generic implementation, unneeded comment

* Missing arch fallback for x86

* minor : style

---------

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-27 13:25:14 +02:00
Eric Curtin d21a76ac38
devops: Add build-essential to Ubuntu 26.04 image (#17531)
This is no longer passing the build, needs more packages.

Signed-off-by: Eric Curtin <eric.curtin@docker.com>
2025-11-27 18:35:47 +08:00
Aleksei Nikiforov 4fcd87cf7c
gguf-py : skip endian-conversion of MXFP4 data (#17523)
* gguf_convert_endian.py: skip MXFP4 data

* Use gguf.constants.GGML_QUANT_SIZES to determine block sizes
2025-11-27 11:35:38 +01:00
Aleksander Grygier 69065ddc56 fix: UI 2025-11-27 11:27:58 +01:00
Aleksander Grygier 6b95118abc refactor: Processing state reactivity 2025-11-27 11:11:45 +01:00