nwyin
e443fbcfa5
ggml webgpu: add CEIL operation support ( #18605 )
* ggml-webgpu: add CEIL operation support
Add support for the CEIL unary operation in the WebGPU backend:
- Add CEIL_FUNC shader template in unary_op.wgsl
- Add 4 shader variants (f32, f16, inplace versions)
- Initialize CEIL pipelines in ggml-webgpu.cpp
- Register CEIL in supports_op function
* docs: update WebGPU ops support for CEIL
2026-01-05 11:38:57 -08:00
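For reference, the element-wise behaviour that the CEIL variants above compute can be sketched on the CPU. This is not the WGSL shader from the commit; `ceil_f32` and `ceil_f32_inplace` are hypothetical names used only for illustration, and the f16 variants are omitted.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// CPU reference for an element-wise CEIL (illustrative only, not the WGSL shader).
// src and dst may point to the same buffer.
inline void ceil_f32(const float * src, float * dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] = std::ceil(src[i]);
    }
}

// In-place variant: the input buffer is overwritten with the result.
inline void ceil_f32_inplace(std::vector<float> & x) {
    ceil_f32(x.data(), x.data(), x.size());
}
```

The in-place form reuses the input buffer as the output, which is the distinction the "inplace versions" in the list above refer to.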
Tarek Dakhran
73d284a250
model : add LFM2-ColBert-350M ( #18607 )
* model : add LFM2-ColBert-350M
* llama_model_n_embd_out() - returns `hparams.n_embd_out` if set and falls back to `hparams.n_embd`
2026-01-05 19:52:56 +01:00
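The fallback described for `llama_model_n_embd_out()` can be sketched as follows. The struct and function names here are hypothetical stand-ins, and treating `0` as "not set" is an assumption; the commit message does not say how the unset state is represented.

```cpp
#include <cstdint>

// Hypothetical stand-in for the hyperparameter struct; only the two fields
// named in the commit message are modeled.
struct hparams_sketch {
    uint32_t n_embd     = 0;
    uint32_t n_embd_out = 0; // assumption: 0 means "not set"
};

// Returns n_embd_out when it is set, otherwise falls back to n_embd.
inline uint32_t n_embd_out_or_default(const hparams_sketch & hp) {
    return hp.n_embd_out > 0 ? hp.n_embd_out : hp.n_embd;
}
```

With this shape, callers that need the output embedding width do not have to know whether a model defines a separate output size.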
Johannes Gäßler
df17a4c94f
CUDA: fix FA FP16 accumulator overflow for Granite ( #18614 )
2026-01-05 19:51:13 +01:00
tt
1871f0ba56
add YoutuVLForConditionalGeneration architectures ( #18620 )
* Support Youtu-VL Model
---------
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-05 18:15:14 +01:00
Aman Gupta
f47edb8c19
ggml-cuda: check for srcs outside the cgraph ( #18583 )
* ggml-cuda: check for srcs outside the cgraph
* review: use leafs instead
2026-01-05 22:46:36 +08:00
Aleksander Grygier
2d6020b574
feat: Enable adding System Prompt per-chat
2026-01-05 14:30:11 +01:00
Vladislav Sayapin
da143b9940
server : fix router child env in containerized environments ( #18562 )
2026-01-05 14:12:05 +01:00
Aleksander Grygier
469263668f
fix: UI
2026-01-05 11:59:31 +01:00
Aleksander Grygier
cf37390434
chore: update webui build output
2026-01-05 11:57:23 +01:00
Aleksander Grygier
f3734b5b7c
feat: UI improvements
2026-01-05 11:53:53 +01:00
Jeff Bolz
f1768d8f03
vulkan: fix topk_moe_sigmoid_norm_bias failures in GLM-4.6 ( #18582 )
2026-01-05 11:51:39 +01:00
Georgi Gerganov
2da64a2f8a
models : fix backend assignment for Granite/Nemotron graphs ( #18599 )
* models : fix backend assignment for Granite/Nemotron graphs
* cont : add ref
* cont : move call to build_inp_embd()
2026-01-05 12:34:23 +02:00
Jeff Bolz
b37124d2d2
vulkan: handle quantize_q8_1 overflowing the max workgroup count ( #18515 )
* vulkan: handle quantize_q8_1 overflowing the max workgroup count
* vulkan: Fix small tile size matmul on lavapipe
* fix mul_mat_id failures
2026-01-05 11:30:14 +01:00
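The commit message above does not say how the workgroup-count overflow is handled. One common approach, shown here purely as an assumption, is to split a large dispatch into several smaller ones that each stay within the device limit. `split_dispatch` is a hypothetical helper, not a function from the Vulkan backend.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Splits `total` workgroups into (offset, count) chunks, each with
// count <= max_count. Illustrative only; the actual fix may differ.
inline std::vector<std::pair<uint32_t, uint32_t>>
split_dispatch(uint32_t total, uint32_t max_count) {
    std::vector<std::pair<uint32_t, uint32_t>> chunks;
    if (max_count == 0) {
        return chunks; // nothing can be dispatched
    }
    for (uint32_t offset = 0; offset < total; ) {
        const uint32_t count = std::min(max_count, total - offset);
        chunks.emplace_back(offset, count);
        offset += count;
    }
    return chunks;
}
```

Each chunk would then be submitted as its own dispatch, with the offset passed to the shader so that it indexes the correct part of the input.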
Sigbjørn Skjæret
eadc4184ca
llama : refactor rope_freq_base/scale_swa conversion and init ( #18553 )
* refactor rope_freq_base/scale_swa conversion and init
* safe defaults for unknowns
* update relevant models
* grammar
* add get_rope_freq_scale to modern-bert
* const
* const
* log swa info
2026-01-05 09:14:04 +01:00
Pascal
653f85fedd
webui: raw tool result display, strip only leading/trailing newlines to preserve indentation
2026-01-05 09:01:31 +01:00
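The trimming rule named in the commit title above (strip only leading and trailing newlines, keep indentation) can be sketched like this. The webui itself is not written in C++; `strip_outer_newlines` is a hypothetical helper, and treating `\r` as part of a newline is an assumption.

```cpp
#include <cstddef>
#include <string>

// Removes newline characters from both ends of the string while leaving
// spaces and tabs untouched, so leading indentation survives.
inline std::string strip_outer_newlines(const std::string & s) {
    const std::size_t first = s.find_first_not_of("\r\n");
    if (first == std::string::npos) {
        return ""; // the string is empty or contains only newlines
    }
    const std::size_t last = s.find_last_not_of("\r\n");
    return s.substr(first, last - first + 1);
}
```

A general whitespace trim would also remove the leading spaces of the first line, which is what this narrower rule avoids.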
Pascal
fc7218ae11
webui: split raw output into backend parsing and frontend display options
2026-01-05 09:01:31 +01:00
Pascal
4f9d9d41b9
webui: remove legacy wrapper and restore WebSocket transport
2026-01-05 09:01:31 +01:00
Pascal
183d9eebff
webui: remove unused imports
2026-01-05 09:01:31 +01:00
Aleksander Grygier
f7ea69fa18
chore: update webui build output
2026-01-05 09:01:31 +01:00
Aleksander Grygier
c5d01fbb8f
feat: Improve agentic tool call streaming display with 'in progress' state
2026-01-05 09:01:31 +01:00
Aleksander Grygier
f755673c6f
feat: Enhance MCP server dropdown with search, popularity sorting, and per-chat overrides
2026-01-05 09:01:31 +01:00
Aleksander Grygier
81ad2d5569
feat: Add per-chat MCP server overrides
2026-01-05 09:01:31 +01:00
Aleksander Grygier
865c28a96d
chore: update webui build output
2026-01-05 09:01:31 +01:00
Aleksander Grygier
2592471d11
feat: Add image load error fallback in MarkdownContent
2026-01-05 09:01:31 +01:00
Aleksander Grygier
069be7b517
feat: Implement lazy MCP client shutdown
2026-01-05 09:01:31 +01:00
Aleksander Grygier
9571e07687
feat: Enhance tool call streaming UI and output format
2026-01-05 09:01:31 +01:00
Aleksander Grygier
260375819d
feat: Display and manage servers in ChatForm actions
2026-01-05 09:01:31 +01:00
Aleksander Grygier
74345d8785
feat: Integrate server management dialog into chat settings
2026-01-05 09:01:31 +01:00
Aleksander Grygier
dde5e1582c
feat: Implement dedicated server management UI components
2026-01-05 09:01:31 +01:00
Aleksander Grygier
c24d5e36f0
refactor: Centralize health check logic in store
2026-01-05 09:01:31 +01:00
Aleksander Grygier
f87b10ee66
feat: Enhance server config with headers and schema normalization
2026-01-05 09:01:31 +01:00
Aleksander Grygier
778ad550b1
feat: Add McpLogo Svelte component
2026-01-05 09:01:31 +01:00
Aleksander Grygier
c1c2234a62
refactor: Consolidate UI CSS classes into shared module
2026-01-05 09:01:31 +01:00
Aleksander Grygier
883d2a4f15
chore: update webui build output
2026-01-05 09:01:31 +01:00
Aleksander Grygier
7d5fd37324
feat: Raw LLM output switch per message
2026-01-05 09:01:31 +01:00
Aleksander Grygier
03464a0780
refactor: Tool call handling
2026-01-05 09:01:31 +01:00
Aleksander Grygier
3e7318f09d
docs: Update high-level architecture diagrams for MCP integration
2026-01-05 09:01:15 +01:00
Aleksander Grygier
219be7807e
feat: Add AgenticContent component for enhanced tool call rendering
2026-01-05 09:01:15 +01:00
Aleksander Grygier
52b1a1bffa
refactor: Update ChatStore to leverage mcpStore for agentic flow
2026-01-05 09:01:15 +01:00
Aleksander Grygier
60475dca3c
feat: Implement agentic orchestration within ChatService
2026-01-05 09:01:15 +01:00
Aleksander Grygier
5f5d5ab45f
feat: Introduce reactive mcpStore for client lifecycle management
2026-01-05 09:01:15 +01:00
Aleksander Grygier
9ab2326e79
feat: Refactor MCP client to use official SDK
2026-01-05 09:01:15 +01:00
Aleksander Grygier
4dbcb5cdfd
feat: Add @modelcontextprotocol/sdk and zod dependencies
2026-01-05 09:01:15 +01:00
Aleksander Grygier
8024ae540f
refactor: Update Agentic and MCP config parsing to use new utils and constants
2026-01-05 09:01:15 +01:00
Aleksander Grygier
abc3764c9f
feat: Centralize MCP and Agentic type definitions and constants
2026-01-05 09:01:15 +01:00
Aleksander Grygier
94fef3508a
feat: Introduce common utility functions
2026-01-05 09:01:15 +01:00
Pascal
18ee0acb3e
webui: use normalizedMessages after upstream refactor
2026-01-05 09:00:59 +01:00
Pascal
d4207ddd8a
webui: MCP client with low coupling to current codebase
2026-01-05 09:00:59 +01:00
Chenguang Li
67e3f6f601
CANN: add operator fusion support for ADD + RMS_NORM ( #17512 )
This commit implements operator fusion for ADD + RMS_NORM operations
in the CANN backend to reduce memory access overhead and improve
performance. The fusion is controlled by the GGML_CANN_OPERATOR_FUSION
environment variable (default: false).
Changes:
- Implement ggml_cann_op_add_rms_norm_fused() using ACLNN AddRmsNorm
- Add ggml_cann_can_fuse() to check fusion eligibility
- Integrate fusion logic into computation graph evaluation
- Add test cases for ADD + RMS_NORM fusion
- Update documentation with new environment variable
The fusion combines ADD and RMS_NORM into a single kernel call,
which is more efficient than executing them separately.
2026-01-05 15:38:18 +08:00
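The idea behind the ADD + RMS_NORM fusion described above can be sketched with a CPU reference. This is not the ACLNN `AddRmsNorm` call from the commit; both function names are hypothetical, the inputs are assumed to have equal length, and the normalization is assumed to be unweighted, `y_i = x_i / sqrt(mean(x^2) + eps)`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Unfused reference: ADD writes an intermediate buffer, RMS_NORM reads it back.
inline std::vector<float> add_then_rms_norm(const std::vector<float> & a,
                                            const std::vector<float> & b,
                                            float eps) {
    const std::size_t n = a.size();
    std::vector<float> sum(n);
    for (std::size_t i = 0; i < n; ++i) sum[i] = a[i] + b[i];

    float sq = 0.0f;
    for (std::size_t i = 0; i < n; ++i) sq += sum[i] * sum[i];
    const float scale = 1.0f / std::sqrt(sq / n + eps);

    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) out[i] = sum[i] * scale;
    return out;
}

// Fused sketch: the sum and its squared norm are produced in one pass,
// so no separate intermediate tensor is written and re-read.
inline std::vector<float> add_rms_norm_fused(const std::vector<float> & a,
                                             const std::vector<float> & b,
                                             float eps) {
    const std::size_t n = a.size();
    std::vector<float> out(n);
    float sq = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        const float s = a[i] + b[i];
        out[i] = s;
        sq += s * s;
    }
    const float scale = 1.0f / std::sqrt(sq / n + eps);
    for (std::size_t i = 0; i < n; ++i) out[i] *= scale;
    return out;
}
```

Both functions should agree up to floating-point rounding; the fused form only reduces the number of passes over memory, which is the saving the commit message attributes to the single kernel call.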
Francisco Herrera
92ac1e016b
doc: clarify that steps also apply to linux for opencl ( #18002 )
* Clarify setup steps for Linux
Added note that setup steps apply to Linux as well.
* Added note for backtick replacement
* clarify that backtick replacement only applies on linux
* clarified Linux specific steps
Some changes are needed for Linux, but they are minor.
* clarify change execution
* clarify by placing info after steps
* clarify which steps
* Make instructions consistent across OSes
* Rm whitespace
* Update docs/backend/OPENCL.md
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
* Update docs/backend/OPENCL.md
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
* Update docs/backend/OPENCL.md
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
---------
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
2026-01-04 20:39:25 -08:00