Sigbjørn Skjæret
de8f01c2d7
model : wire up Nemotron-H tensors for NVFP4 support (#20561)
* wire up Nemotron-H tensors for NVFP4 support
* add ssm tensors
* alignment
2026-03-16 09:19:16 +01:00
Georgi Gerganov
1274fbee9e
models : fix assert in mamba2 (cont) (#20335)
* models : fix assert in mamba2 (cont)
* cont : add n_group mod
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-03-10 15:00:08 +02:00
Georgi Gerganov
43e1cbd6c1
models : fix assert in mamba2 graph (#20270)
2026-03-09 13:15:15 +02:00
Johannes Gäßler
a976ff081b
llama: end-to-end tests (#19802)
* tests: add end-to-end tests per model architecture
* fixup for rebase
* fix use-after-free in llama-model-loader.cpp
* fix CI
* fix WebGPU
* fix CI
* disable CI for macOS-latest-cmake-arm64
* use expert_weights_scale only if != 0.0f
* comments
2026-03-08 12:30:21 +01:00
Georgi Gerganov
cc45f2ada6
models : deduplicate delta-net graphs for Qwen family (#19597)
* models : add llm_build_delta_net_base
* cont : keep qwen35 and qwen35moe graphs intact
* cont : add comments
2026-02-16 14:35:04 +02:00