7828 Commits

Author SHA1 Message Date
nwyin e443fbcfa5
ggml webgpu: add CEIL operation support (#18605)
* ggml-webgpu: add CEIL operation support

      Add support for the CEIL unary operation in the WebGPU backend:
      - Add CEIL_FUNC shader template in unary_op.wgsl
      - Add 4 shader variants (f32, f16, inplace versions)
      - Initialize CEIL pipelines in ggml-webgpu.cpp
      - Register CEIL in supports_op function

* docs: update WebGPU ops support for CEIL
2026-01-05 11:38:57 -08:00
Tarek Dakhran 73d284a250
model : add LFM2-ColBert-350M (#18607)
* model : add LFM2-ColBert-350M

* llama_model_n_embd_out() - returns `hparams.n_embd_out` if set, otherwise falls back to `hparams.n_embd`
2026-01-05 19:52:56 +01:00
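
The fallback described for `llama_model_n_embd_out()` can be illustrated with a minimal sketch. This is not the actual implementation: the `HParams` class, the Python function name, and the convention that `0` means "not set" are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class HParams:
    """Hypothetical stand-in for the model hyperparameters."""
    n_embd: int          # hidden embedding size
    n_embd_out: int = 0  # output embedding size; 0 is assumed to mean "not set"


def model_n_embd_out(hparams: HParams) -> int:
    """Return n_embd_out when it is set, otherwise fall back to n_embd."""
    if hparams.n_embd_out > 0:
        return hparams.n_embd_out
    return hparams.n_embd


# The sizes below are arbitrary example values, not taken from any real model.
plain = HParams(n_embd=1024)
projected = HParams(n_embd=1024, n_embd_out=128)
print(model_n_embd_out(plain), model_n_embd_out(projected))
```

A caller that sizes its output buffers from this value would work unchanged for models with and without a separate output dimension.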
Johannes Gäßler df17a4c94f
CUDA: fix FA FP16 accumulator overflow for Granite (#18614) 2026-01-05 19:51:13 +01:00
tt 1871f0ba56
add YoutuVLForConditionalGeneration architectures (#18620)
* Support Youtu-VL Model
---------

Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-05 18:15:14 +01:00
Aman Gupta f47edb8c19
ggml-cuda: check for srcs outside the cgraph (#18583)
* ggml-cuda: check for srcs outside the cgraph

* review: use leafs instead
2026-01-05 22:46:36 +08:00
Aleksander Grygier 2d6020b574 feat: Enable adding System Prompt per-chat 2026-01-05 14:30:11 +01:00
Vladislav Sayapin da143b9940
server : fix router child env in containerized environments (#18562) 2026-01-05 14:12:05 +01:00
Aleksander Grygier 469263668f fix: UI 2026-01-05 11:59:31 +01:00
Aleksander Grygier cf37390434 chore: update webui build output 2026-01-05 11:57:23 +01:00
Aleksander Grygier f3734b5b7c feat: UI improvements 2026-01-05 11:53:53 +01:00
Jeff Bolz f1768d8f03
vulkan: fix topk_moe_sigmoid_norm_bias failures in GLM-4.6 (#18582) 2026-01-05 11:51:39 +01:00
Georgi Gerganov 2da64a2f8a
models : fix backend assignment for Granite/Nemotron graphs (#18599)
* models : fix backend assignment for Granite/Nemotron graphs

* cont : add ref

* cont : move call to build_inp_embd()
2026-01-05 12:34:23 +02:00
Jeff Bolz b37124d2d2
vulkan: handle quantize_q8_1 overflowing the max workgroup count (#18515)
* vulkan: handle quantize_q8_1 overflowing the max workgroup count

* vulkan: Fix small tile size matmul on lavapipe

* fix mul_mat_id failures
2026-01-05 11:30:14 +01:00
Sigbjørn Skjæret eadc4184ca
llama : refactor rope_freq_base/scale_swa conversion and init (#18553)
* refactor rope_freq_base/scale_swa conversion and init

* safe defaults for unknowns

* update relevant models

* grammar

* add get_rope_freq_scale to modern-bert

* const

* const

* log swa info
2026-01-05 09:14:04 +01:00
Pascal 653f85fedd webui: raw tool result display, strip only leading/trailing newlines to preserve indentation 2026-01-05 09:01:31 +01:00
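
The difference between stripping only newlines and stripping all whitespace can be shown with a short sketch. The webui itself is not written in Python, and `trim_tool_result` is a hypothetical name; the sketch only illustrates the behaviour the commit title describes.

```python
def trim_tool_result(raw: str) -> str:
    """Drop leading/trailing line breaks but keep indentation on every line."""
    return raw.strip("\r\n")


raw = "\n\n    indented line\n  second line\n"

# Only the surrounding line breaks are removed; the four leading spaces survive.
print(repr(trim_tool_result(raw)))

# A plain strip() would also eat the indentation of the first line.
print(repr(raw.strip()))
```

Keeping the leading spaces matters when the tool result is code or aligned text, where the first line's indentation is part of the content.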
Pascal fc7218ae11 webui: split raw output into backend parsing and frontend display options 2026-01-05 09:01:31 +01:00
Pascal 4f9d9d41b9 webui: remove legacy wrapper and restore WebSocket transport 2026-01-05 09:01:31 +01:00
Pascal 183d9eebff webui: remove unused imports 2026-01-05 09:01:31 +01:00
Aleksander Grygier f7ea69fa18 chore: update webui build output 2026-01-05 09:01:31 +01:00
Aleksander Grygier c5d01fbb8f feat: Improve agentic tool call streaming display with 'in progress' state 2026-01-05 09:01:31 +01:00
Aleksander Grygier f755673c6f feat: Enhance MCP server dropdown with search, popularity sorting, and per-chat overrides 2026-01-05 09:01:31 +01:00
Aleksander Grygier 81ad2d5569 feat: Add per-chat MCP server overrides 2026-01-05 09:01:31 +01:00
Aleksander Grygier 865c28a96d chore: update webui build output 2026-01-05 09:01:31 +01:00
Aleksander Grygier 2592471d11 feat: Add image load error fallback in MarkdownContent 2026-01-05 09:01:31 +01:00
Aleksander Grygier 069be7b517 feat: Implement lazy MCP client shutdown 2026-01-05 09:01:31 +01:00
Aleksander Grygier 9571e07687 feat: Enhance tool call streaming UI and output format 2026-01-05 09:01:31 +01:00
Aleksander Grygier 260375819d feat: Display and manage servers in ChatForm actions 2026-01-05 09:01:31 +01:00
Aleksander Grygier 74345d8785 feat: Integrate server management dialog into chat settings 2026-01-05 09:01:31 +01:00
Aleksander Grygier dde5e1582c feat: Implement dedicated server management UI components 2026-01-05 09:01:31 +01:00
Aleksander Grygier c24d5e36f0 refactor: Centralize health check logic in store 2026-01-05 09:01:31 +01:00
Aleksander Grygier f87b10ee66 feat: Enhance server config with headers and schema normalization 2026-01-05 09:01:31 +01:00
Aleksander Grygier 778ad550b1 feat: Add McpLogo Svelte component 2026-01-05 09:01:31 +01:00
Aleksander Grygier c1c2234a62 refactor: Consolidate UI CSS classes into shared module 2026-01-05 09:01:31 +01:00
Aleksander Grygier 883d2a4f15 chore: update webui build output 2026-01-05 09:01:31 +01:00
Aleksander Grygier 7d5fd37324 feat: Raw LLM output switch per message 2026-01-05 09:01:31 +01:00
Aleksander Grygier 03464a0780 refactor: Tool call handling 2026-01-05 09:01:31 +01:00
Aleksander Grygier 3e7318f09d docs: Update high-level architecture diagrams for MCP integration 2026-01-05 09:01:15 +01:00
Aleksander Grygier 219be7807e feat: Add AgenticContent component for enhanced tool call rendering 2026-01-05 09:01:15 +01:00
Aleksander Grygier 52b1a1bffa refactor: Update ChatStore to leverage mcpStore for agentic flow 2026-01-05 09:01:15 +01:00
Aleksander Grygier 60475dca3c feat: Implement agentic orchestration within ChatService 2026-01-05 09:01:15 +01:00
Aleksander Grygier 5f5d5ab45f feat: Introduce reactive mcpStore for client lifecycle management 2026-01-05 09:01:15 +01:00
Aleksander Grygier 9ab2326e79 feat: Refactor MCP client to use official SDK 2026-01-05 09:01:15 +01:00
Aleksander Grygier 4dbcb5cdfd feat: Add @modelcontextprotocol/sdk and zod dependencies 2026-01-05 09:01:15 +01:00
Aleksander Grygier 8024ae540f refactor: Update Agentic and MCP config parsing to use new utils and constants 2026-01-05 09:01:15 +01:00
Aleksander Grygier abc3764c9f feat: Centralize MCP and Agentic type definitions and constants 2026-01-05 09:01:15 +01:00
Aleksander Grygier 94fef3508a feat: Introduce common utility functions 2026-01-05 09:01:15 +01:00
Pascal 18ee0acb3e webui: use normalizedMessages after upstream refactor 2026-01-05 09:00:59 +01:00
Pascal d4207ddd8a webui: MCP client with low coupling to current codebase 2026-01-05 09:00:59 +01:00
Chenguang Li 67e3f6f601
CANN: add operator fusion support for ADD + RMS_NORM (#17512)
This commit implements operator fusion for ADD + RMS_NORM operations
in the CANN backend to reduce memory access overhead and improve
performance. The fusion is controlled by the GGML_CANN_OPERATOR_FUSION
environment variable (default: false).

Changes:
- Implement ggml_cann_op_add_rms_norm_fused() using ACLNN AddRmsNorm
- Add ggml_cann_can_fuse() to check fusion eligibility
- Integrate fusion logic into computation graph evaluation
- Add test cases for ADD + RMS_NORM fusion
- Update documentation with new environment variable

The fusion combines ADD and RMS_NORM into a single kernel call,
which is more efficient than executing them separately.
2026-01-05 15:38:18 +08:00
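
As a rough reference for what the ADD + RMS_NORM fusion computes, the sketch below compares the two-step path with a single fused pass in plain Python. It is not the CANN code: the function names are hypothetical, the normalization omits any learned weight, the `eps` value is arbitrary, and the set of strings accepted as "enabled" for `GGML_CANN_OPERATOR_FUSION` is an assumption (the commit only states that the default is false).

```python
import math
import os


def fusion_enabled(env=os.environ) -> bool:
    """Read GGML_CANN_OPERATOR_FUSION; unset means disabled.

    The accepted truthy spellings are an assumption for this sketch.
    """
    value = env.get("GGML_CANN_OPERATOR_FUSION", "")
    return value.strip().lower() in {"1", "on", "true", "yes"}


def add(a, b):
    """Element-wise addition: the first of the two separate operations."""
    return [x + y for x, y in zip(a, b)]


def rms_norm(x, eps=1e-6):
    """Scale x by the reciprocal of its root mean square."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]


def add_rms_norm_fused(a, b, eps=1e-6):
    """Fused variant: the sum and the sum of squares come from one pass."""
    summed, sq_total = [], 0.0
    for x, y in zip(a, b):
        s = x + y
        summed.append(s)
        sq_total += s * s
    scale = 1.0 / math.sqrt(sq_total / len(summed) + eps)
    return [s * scale for s in summed]


a = [1.0, 2.0, 3.0]
b = [0.5, -1.0, 2.0]
separate = rms_norm(add(a, b))
fused = add_rms_norm_fused(a, b)
```

Both paths produce the same values up to floating-point rounding. The benefit of fusing is that the intermediate sum does not have to be written out and read back between two kernel launches, which is the memory-access saving the commit message refers to.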
Francisco Herrera 92ac1e016b
doc: clarify that steps also apply to linux for opencl (#18002)
* Clarify setup steps for Linux

Added note that setup steps apply to Linux as well.

* Added note for backtick replacement

* clarify that backtick replacement only applies on linux

* clarified Linux specific steps

Some changes are needed for Linux, but they are minor.

* clarify change execution

* clarify by placing info after steps

* clarify which steps

* Make instructions consistent across OSes

* Rm whitespace

* Update docs/backend/OPENCL.md

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* Update docs/backend/OPENCL.md

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* Update docs/backend/OPENCL.md

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

---------

Co-authored-by: Aaron Teo <taronaeo@gmail.com>
2026-01-04 20:39:25 -08:00