3 Commits

Author SHA1 Message Date
Xuan-Son Nguyen 59db9a357d
llama: dynamic head_dim and n_rot for SWA (#20301)
* llama: dynamic head_dim and n_rot for SWA

* also add gguf_writer wrappers

* fix build

* build_rope_shift arg reorder
2026-03-09 22:22:39 +01:00
Sigbjørn Skjæret 35bee031e1
graph : remove redundant scale_w parameter (#20235)
2026-03-08 18:58:28 +01:00
Xuan-Son Nguyen cd3c118908
model: support Ministral3 (#17644)
* conversion script

* support ministral 3

* maybe this is better?

* add TODO for rope_yarn_log_mul

* better ppl (tested on 14B-Instruct)

* Add Ministral3 support to Mistral format

* improve arch handling

* add sizes

* Apply suggestions from code review

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* nits

---------

Co-authored-by: Julien Denize <julien.denize@mistral.ai>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-01 12:26:52 +01:00