Aleksander Grygier
fa0cad2e6e
refactor: Componentize Chat Form Prompt Picker
2026-01-26 09:36:13 +01:00
Aleksander Grygier
176abf3175
refactor: Utility function
2026-01-26 09:00:41 +01:00
Aleksander Grygier
5ee232d81c
refactor: Use store methods
2026-01-26 08:52:57 +01:00
Aleksander Grygier
ff0e927be2
chore: update webui build output
2026-01-25 13:38:25 +01:00
Aleksander Grygier
ee9efae203
refactor: Enums
2026-01-25 13:37:08 +01:00
Aleksander Grygier
7f5284d597
refactor: Cleanup
refactor: Cleanup
refactor: Cleanup
refactor: Cleanup
2026-01-25 13:13:11 +01:00
Aleksander Grygier
97642211a9
chore: update webui build output
2026-01-25 02:10:25 +01:00
Aleksander Grygier
fc377123b7
refactor: Simplify MCP errors
2026-01-25 02:09:12 +01:00
Aleksander Grygier
202262c2dc
chore: update webui build output
2026-01-25 01:44:14 +01:00
Aleksander Grygier
b58b823b57
refactor: Types
2026-01-25 01:39:49 +01:00
Aleksander Grygier
ba39f8cc7b
chore: update webui build output
2026-01-25 01:21:34 +01:00
Aleksander Grygier
9bcfdc3483
refactor: DRY
2026-01-25 01:17:59 +01:00
Aleksander Grygier
e7ff091881
chore: Add deprecation comment
2026-01-25 01:05:28 +01:00
Aleksander Grygier
1c843b2863
chore: update webui build output
2026-01-25 01:04:34 +01:00
Aleksander Grygier
5dfc520d67
refactor: Cleanup
2026-01-25 00:48:21 +01:00
Aleksander Grygier
6daa39994c
refactor: Naming & Enums
2026-01-25 00:32:37 +01:00
Aleksander Grygier
2562dc50bd
chore: update webui build output
2026-01-25 00:32:16 +01:00
Aleksander Grygier
372202632e
refactor: Cleanup
2026-01-25 00:31:49 +01:00
Aleksander Grygier
ba230c5cce
refactor: Naming + remove redundant component
2026-01-24 23:58:17 +01:00
Aleksander Grygier
f7b5f62586
refactor: Remove unused code
2026-01-24 23:45:06 +01:00
Aleksander Grygier
22d9e645aa
chore: update webui build output
2026-01-24 23:39:04 +01:00
Aleksander Grygier
d938994395
refactor: Cleanup
2026-01-24 23:38:37 +01:00
Aleksander Grygier
fc4c392dce
chore: update webui build output
2026-01-24 20:54:24 +01:00
Aleksander Grygier
79e606eb99
refactor: Constants
2026-01-24 20:52:19 +01:00
Aleksander Grygier
3d7426cdd4
refactor: Cleanup
2026-01-24 20:47:32 +01:00
Aleksander Grygier
8bf2d38da1
chore: update webui build output
2026-01-24 20:32:53 +01:00
Aleksander Grygier
14911e51fc
feat: MCP Prompts implementation improvements
2026-01-24 20:30:52 +01:00
Aleksander Grygier
801ef93522
refactor: Message Height CSS Variable
2026-01-24 19:15:38 +01:00
Aleksander Grygier
13f756421c
refactor: Enums
2026-01-24 18:37:43 +01:00
Pascal
85b8da45f9
fix: resolve TypeScript error in tool response content
2026-01-24 18:04:01 +01:00
Pascal
9ddc54b668
webui: enable vision in agentic tool responses
- Include images from all message roles (not just user)
- Add multipart content support for tool responses
- Images from MCP tools now accessible in same agentic turn
2026-01-24 17:58:20 +01:00
Aleksander Grygier
172e93d494
Merge remote-tracking branch 'ggml-org/master' into allozaur/mcp-mvp
2026-01-24 15:13:58 +01:00
Aleksander Grygier
da9c245838
chore: update webui build output
2026-01-24 13:59:52 +01:00
Aleksander Grygier
7c4bedda87
feat: Improve formatting performance time
2026-01-24 13:58:23 +01:00
Aleksander Grygier
c39c6ef436
fix: System prompt sorting
2026-01-24 13:44:41 +01:00
Aleksander Grygier
2601bf0f59
fix: Save draft message in Chat Form when adding System Prompt from new chat view
2026-01-24 13:32:49 +01:00
Aleksander Grygier
a647edfc0b
fix: Chat Form submission
2026-01-24 12:33:24 +01:00
Johannes Gäßler
8f91ca54ec
CUDA: re-use MLA K data for V in MMA FA (#19057)
2026-01-24 10:09:36 +01:00
Aman Gupta
81ab64f3c8
ggml-cuda: enable cuda-graphs for `n-cpu-moe` (#18934)
* ggml-cuda: add split-wise cuda graph
* add n-cpu-moe compare_llama_bench.py
* fix hip/musa builds
2026-01-24 14:25:20 +08:00
nullname
8af1f5f430
ggml-hexagon: flash-attn opt (#19025)
* optimize flash attention kernel by improving score computation and online softmax update
* wip
* Refactor online softmax update in flash attention kernel for improved performance
* Optimize flash attention kernel by replacing float array with HVX_Vector for score computation
* wip
2026-01-23 22:02:07 -08:00
Aleksander Grygier
bd16b6145c
chore: update webui build output
2026-01-24 01:32:36 +01:00
Aleksander Grygier
8428741034
feat: MCP Prompts WIP
2026-01-24 01:26:17 +01:00
Georgi Gerganov
557515be1e
graph : utilize `ggml_build_forward_select()` to avoid reallocations (#18898)
* graph : avoid branches between embedding and token inputs
* models : make deepstack graphs (e.g. Qwen3 VL) have constant topology
* ci : enable -DGGML_SCHED_NO_REALLOC=ON for server CI
* cont : pad token embeddings to n_embd_inp
2026-01-23 18:22:34 +02:00
Aleksander Grygier
3d88d0b6b2
chore: update webui build output
2026-01-23 15:21:56 +01:00
Aleksander Grygier
9c391d8e0d
feat: UI improvements
2026-01-23 15:21:03 +01:00
Neo Zhang
cb6caca191
[SYCL] use malloc to support both iGPU and dGPU in same time (#18992)
* use malloc to support both iGPU and dGPU in same time
* support windows
---------
Co-authored-by: Neo Zhang Jianyu <jianyu.zhang@intel.com>
2026-01-23 20:54:10 +08:00
Xuan-Son Nguyen
b5b8fa1c8b
chat : fix translategemma crash on common_chat_format_example (#19019)
2026-01-23 12:03:42 +01:00
Daniel Bevenius
a14b960bc7
model-conversion : use BUILD_DIR variable in all scripts (#19015)
This commit modifies all the utility scripts to use an optional
BUILD_DIR variable/argument to specify the build directory.
The motivation for this is that Commit
3d55846a5c ("model-conversion : add
BUILD_DIR variable to run-converted-model scripts") introduced this
variable to the causal and embeddings scripts, but I missed the scripts
in the utils directory.
2026-01-23 09:01:36 +01:00
Alberto Cabrera Pérez
091a46cb8d
ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm) (#18860)
* Boilerplate for q5_Kx8 REPACK on ARM and fallback
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Implements make_block_q5_Kx8 by extending make_block_q4_Kx8
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* q5_K repack gemm and gemv generics
* Gemm and Gemv ARM implementations (i8mm)
* Improved qh manipulation looking at non-repack vec_dot implementation
* Full unroll
* Apply Q5_K Gemv vand and vshl optimizations to gemm. Improve comments.
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Fix wrong fallback definitions of Q5_K
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Fixed comments. Reverted unnecessary formatting
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
* Fixed typo in generic definitions
* Switching AND + Shift with Shift Insert. Better op interleaving.
* Vectorize + unroll the block scales
* Apply gemm optimizations to gemv
* Improve bias calculation
---------
Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
2026-01-23 09:55:08 +02:00
Aldehir Rojas
a3e812811d
cli : load parser definition (#19031)
* cli : load parser definition
* cont : only unload if a parser is defined
2026-01-22 20:31:22 -06:00