Georgi Gerganov
7bed317f53
models : fix the attn_factor for mistral3 graphs + improve consistency ( #17945 )
...
* models : fix the attn_factor for mistral3 graphs
* cont : rework attn_factor correction logic
* cont : make deepseek2 consistent
* cont : add TODO
* cont : special-case DSv2
* cont : revert Mistral 3 Large changes
* cont : fix DS2 to use the original attn_factor
* cont : minor comments
2025-12-12 17:12:40 +02:00
Xuan-Son Nguyen
4d3726278b
model: add llama 4 scaling for mistral-large (deepseek arch) ( #17744 )
2025-12-07 22:29:54 +01:00
ubergarm
23bc779a6e
model : detect GigaChat3-10-A1.8B as deepseek lite ( #17420 )
...
* Detect GigaChat3-10-A1.8B as deepseek lite
Hardcodes checking number of layers to detect if lite version of deepseek.
* Add commnent identifying deepseek lite variants
deepseek lite variants include DeepSeek-V2-Lite, GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp ( #16252 )
...
* Sqashed: llama-model.cpp refactoring
* Fix formatting of attn / ffn / ffn_moe calls
* Fix import regression / unify spacing in models.h
* totally DID NOT miss those!
* Add missing qwen3vl(moe) models
* Add missing new .cpp files to build
* Remove extra semicolons
* Editor checker
* Update src/models/models.h
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00