Xuan-Son Nguyen
59db9a357d
llama: dynamic head_dim and n_rot for SWA (#20301)
* llama: dynamic head_dim and n_rot for SWA
* also add gguf_writer wrappers
* fix build
* build_rope_shift arg reorder
2026-03-09 22:22:39 +01:00
Sigbjørn Skjæret
35bee031e1
graph : remove redundant scale_w parameter (#20235)
2026-03-08 18:58:28 +01:00
Johannes Gäßler
a976ff081b
llama: end-to-end tests (#19802)
* tests: add end-to-end tests per model architecture
* fixup for rebase
* fix use-after-free in llama-model-loader.cpp
* fix CI
* fix WebGPU
* fix CI
* disable CI for macOS-latest-cmake-arm64
* use expert_weights_scale only if != 0.0f
* comments
2026-03-08 12:30:21 +01:00
Junwon Hwang
60591f01d4
model : add EXAONE MoE (#18543)
* Add EXAONE MoE implementations
Co-authored-by: Junwon Hwang <nuclear1221@gmail.com>
* Address PR feedback
* Address PR feedback
* [WIP] Add MTP for EXAONE-MoE
* Address PR feedback
* Address PR feedback
* Address PR feedback
* Address PR feedback
* Address PR feedback
* Address PR feedback
* Address PR feedback
---------
Co-authored-by: LG-AI-EXAONE <exaonemodels@lgresearch.ai>
2026-01-13 23:28:38 +01:00