Aleksander Grygier
6fd720e742
Merge remote-tracking branch 'origin/allozaur/server_model_management_v1_2' into allozaur/server_model_management_v1_2
2025-11-29 21:59:33 +01:00
Aleksander Grygier
ae8a1e8137
refactor: Tests to separate location
2025-11-29 21:44:57 +01:00
Aleksander Grygier
949b5fd63e
refactor: Tooltip Provider from core layout
2025-11-29 21:41:36 +01:00
Aleksander Grygier
4f39da823f
test: Update Chat Form UI tests
2025-11-29 20:13:11 +01:00
Aleksander Grygier
33b9cc40a1
Merge branch 'master' into allozaur/server_model_management_v1_2
2025-11-29 19:40:46 +01:00
Aleksander Grygier
a568e74c20
chore: update webui build output
2025-11-29 02:40:09 +01:00
Aleksander Grygier
2d556bb93c
test: Fix Storybook mocks
2025-11-29 02:36:41 +01:00
Aleksander Grygier
493ef08723
refactor: Utils imports + move types to `app.d.ts`
2025-11-29 02:33:37 +01:00
Aleksander Grygier
ce9c9afe0d
chore: update webui build output
2025-11-29 01:40:00 +01:00
Aleksander Grygier
2464e06028
feat: Improve UI sidebar background color
2025-11-29 01:39:40 +01:00
Aleksander Grygier
27b152267f
refactor: Constants
2025-11-29 01:38:02 +01:00
Aleksander Grygier
648d2deebc
feat: Attachment logic & UI improvements
2025-11-29 01:36:05 +01:00
Aleksander Grygier
d49d97c642
refactor: Cleanup
2025-11-29 00:51:18 +01:00
Aleksander Grygier
f50ce7b5b4
refactor: Cleanup
2025-11-29 00:50:16 +01:00
Aleksander Grygier
4d16459b4c
re
2025-11-29 00:49:46 +01:00
Aleksander Grygier
c76de5e0ad
refactor: Cleanup
2025-11-29 00:49:20 +01:00
Aleksander Grygier
2f97dbfa65
docs: Add info comment
2025-11-29 00:49:03 +01:00
Aleksander Grygier
1adf173dd6
refactor: Cleanup
2025-11-28 19:36:03 +01:00
Aleksander Grygier
dd30810d0a
fix: Modality detection improvement for text-based PDF attachments
2025-11-28 19:30:32 +01:00
Aleksander Grygier
171a0926a1
chore: update webui build output
2025-11-28 16:00:44 +01:00
Aleksander Grygier
68b653ef45
refactor: DRY `getAttachmentDisplayItems` function + fix UI
2025-11-28 15:58:52 +01:00
Aleksander Grygier
1cf5daa8c0
refactor: Cleanup
2025-11-28 15:56:41 +01:00
Aleksander Grygier
04ef4a06e2
chore: update webui build output
2025-11-28 15:44:43 +01:00
Aleksander Grygier
5fadd0fe18
refactor: Components naming
2025-11-28 15:39:47 +01:00
Aleksander Grygier
3470b12b76
chore: update webui build output
2025-11-28 15:09:55 +01:00
Aleksander Grygier
eed1bd9b97
refactor: Enhance model info and attachment handling
2025-11-28 15:08:41 +01:00
Aleksander Grygier
491fe2d3f7
feat: Update logic for PDF as Image
2025-11-28 13:10:00 +01:00
Johannes Gäßler
73955f7d2a
CUDA: no FP16 arithmetic for vector FA kernel (#17558)
2025-11-28 10:29:09 +01:00
Jeff Bolz
35cf8887e1
vulkan: Implement GGML_OP_TRI (#17503)
* vulkan: Implement GGML_OP_TRI
* check types match
2025-11-28 10:07:29 +01:00
Radoslav Gerganov
15d2b46b4d
rpc : cache and reuse compute graphs (#15405)
Store the last computed graph and reuse it when possible.
Also do not return response from GRAPH_COMPUTE and assume it always
completes successfully. If this is not the case, the server closes
the connection. This saves us a network round trip to the server.
2025-11-28 08:33:51 +00:00
yulo
6bca76ff5e
HIP: enable mul_mat_f for RDNA4 (#17437)
* enable mmf for rdna4
* move some mmvf to mmf
* revert lds128 for wmma loading
* Revert "revert lds128 for wmma loading"
This reverts commit db9ae8b6b4.
* Revert "enable mmf for rdna4"
This reverts commit 698c9f2418.
* Revert "move some mmvf to mmf"
This reverts commit 99b92bd665.
* enable mul_mat for rdna4
---------
Co-authored-by: zhang hui <you@example.com>
2025-11-28 08:24:30 +01:00
Piotr Wilkin (ilintar)
cd0e3a7a3b
SOLVE_TRI CUDA kernel for small matrices (#17457)
2025-11-28 12:15:32 +08:00
Neo Zhang Jianyu
efaaccdd69
refactor pad_reflect_1d to make the UT case pass (#17204)
Co-authored-by: Zhang Jianyu <zhang.jianyu@outlook.com>
2025-11-28 08:50:56 +08:00
Aleksander Grygier
bc577266b9
docs: Architecture documentation
2025-11-27 22:04:20 +01:00
Aleksander Grygier
db479523ec
feat: Condition available models based on modality + better model loading strategy & UX
2025-11-27 19:13:05 +01:00
Jeff Bolz
4abef75f2c
vulkan: Implement SOLVE_TRI (#17486)
* vulkan: Implement SOLVE_TRI
* load B matrix through shared memory
* use FLOAT_TYPE
2025-11-27 15:48:00 +01:00
Georgi Gerganov
c386114922
arch : add description about LLM_TENSOR_INFOS (#17550)
2025-11-27 16:34:13 +02:00
Georgi Gerganov
6783b11fb0
models : fix LFM2 tensors (#17548)
2025-11-27 16:04:29 +02:00
Aleksander Grygier
9086bc30bd
feat: Improve statistic badges
2025-11-27 14:12:21 +01:00
Aleksander Grygier
d73353732f
refactor: Architecture cleanup
2025-11-27 14:03:25 +01:00
Aleksander Grygier
78ead49830
Merge remote-tracking branch 'ngxson/xsn/server_model_management_v1_2' into allozaur/server_model_management_v1_2
2025-11-27 13:48:21 +01:00
Aleksander Grygier
6a3d6e79d2
refactor: Services/Stores syntax + logic improvements
Refactors components to access stores directly instead of using exported getter functions.
This change centralizes store access and logic, simplifying component code and improving maintainability by reducing the number of exported functions and promoting direct store interaction.
Removes exported getter functions from `chat.svelte.ts`, `conversations.svelte.ts`, `models.svelte.ts` and `settings.svelte.ts`.
2025-11-27 13:44:49 +01:00
matt23654
909072abcf
cuda : fix UMA detection on discrete GPUs. (#17537)
2025-11-27 13:35:35 +02:00
Alberto Cabrera Pérez
cd8370b408
ggml-cpu: aarm64: q4_K repack gemm and gemv implementations (dotprod only) (#17494)
* Enabled q4_K_4x8 path
* Fixed generic Q4_K 8x4 implementation
* wip: dotprod gemm
* Working arm q4_K dotprod gemm
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Undo acc rename
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Q4_K arm dotprod gemm
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Fix: q4_qs reinterpret from uint to int
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Removed comments
* Fixed macro guards
* Fixed unused vars in generic implementation
* Fixed unused vars in 8x4 repack
* Fixed unused vars in generic implementation, unneeded comment
* Missing arch fallback for x86
* minor : style
---------
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-27 13:25:14 +02:00
Eric Curtin
d21a76ac38
devops: Add build-essential to Ubuntu 26.04 image (#17531)
This is no longer passing the build, needs more packages.
Signed-off-by: Eric Curtin <eric.curtin@docker.com>
2025-11-27 18:35:47 +08:00
Aleksei Nikiforov
4fcd87cf7c
gguf-py : skip endian-conversion of MXFP4 data (#17523)
* gguf_convert_endian.py: skip MXFP4 data
* Use gguf.constants.GGML_QUANT_SIZES to determine block sizes
2025-11-27 11:35:38 +01:00
Aleksander Grygier
69065ddc56
fix: UI
2025-11-27 11:27:58 +01:00
Aleksander Grygier
6b95118abc
refactor: Processing state reactivity
2025-11-27 11:11:45 +01:00
Acly
b78db3bd50
vulkan : move contiguous checks to device_supports_op (#17490)
* vulkan : remove op_supports_incontiguous and add missing constraints in device_supports_op
* im2col: remove constraints on src0 (kernel input)
2025-11-27 06:54:19 +01:00
Jeff Bolz
142df17c9c
vulkan: use a fixed 1KB buffer for the add_rms_fusion opt (#17514)
2025-11-27 06:32:30 +01:00