HelloKS
9d52f17ae3
model : add KORMo model ( #18032 )
...
* vocab: add KORMo Tokenizer
* model: add KORMoForCausalLM
* vocab: change pretokenizer to qwen2
* lint: fix unintended line removal
* model: make qwen2 bias tensor optional
* model: use qwen2 architecture for KORMo
2025-12-15 18:51:43 +01:00
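Note on "make qwen2 bias tensor optional": llama.cpp's loader can mark a tensor as not required, so checkpoints that omit the QKV biases still load. A minimal stand-in sketch of that pattern, with placeholder types; the real loader uses create_tensor with a TENSOR_NOT_REQUIRED flag:

    #include <map>
    #include <stdexcept>
    #include <string>

    struct tensor { /* weight data elided */ };

    // Return the named tensor, or nullptr when it is optional and absent;
    // fail only for required tensors.
    tensor * load_tensor(std::map<std::string, tensor> & weights,
                         const std::string & name, bool required) {
        auto it = weights.find(name);
        if (it == weights.end()) {
            if (required) {
                throw std::runtime_error("missing tensor: " + name);
            }
            return nullptr; // graph code skips the bias add when null
        }
        return &it->second;
    }

Downstream graph code then adds the bias only when the pointer is non-null, which is what lets KORMo reuse the qwen2 architecture without bias weights.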
Georgi Gerganov
7bed317f53
models : fix the attn_factor for mistral3 graphs + improve consistency ( #17945 )
...
* models : fix the attn_factor for mistral3 graphs
* cont : rework attn_factor correction logic
* cont : make deepseek2 consistent
* cont : add TODO
* cont : special-case DSv2
* cont : revert Mistral 3 Large changes
* cont : fix DS2 to use the original attn_factor
* cont : minor comments
2025-12-12 17:12:40 +02:00
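Background for the attn_factor fix above: ggml's YaRN rope applies a magnitude term derived from the frequency scale, so graphs that do their own attention scaling pass a corrected attn_factor to cancel it. A minimal sketch of the standard YaRN term, mscale = 1 + 0.1 * ln(1/freq_scale):

    #include <cmath>

    // Corrected attn_factor that cancels the YaRN magnitude scaling applied
    // inside the rope op; freq_scale < 1 when the context is extended.
    float yarn_corrected_attn_factor(float freq_scale) {
        return 1.0f / (1.0f + 0.1f * logf(1.0f / freq_scale));
    }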
philip-essential
1d2a1ab73d
model : support Rnj-1 ( #17811 )
...
* add support for rnj1
* refactor gemma3 to support rnj-1
* address review comments
2025-12-09 04:49:03 +01:00
Xuan-Son Nguyen
4d3726278b
model: add llama 4 scaling for mistral-large (deepseek arch) ( #17744 )
2025-12-07 22:29:54 +01:00
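The "llama 4 scaling" borrowed here is Llama 4's position-dependent attention temperature. A minimal sketch, with floor_scale and attn_scale as assumed hyperparameter names:

    #include <cmath>

    // Attention temperature grows logarithmically with position so that
    // long-context attention stays sharp; parameter names are assumptions.
    float llama4_attn_temperature(int pos, float floor_scale, float attn_scale) {
        return logf(floorf((pos + 1.0f) / floor_scale) + 1.0f) * attn_scale + 1.0f;
    }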
Xuan-Son Nguyen
cd3c118908
model: support Ministral3 ( #17644 )
...
* conversion script
* support ministral 3
* maybe this is better?
* add TODO for rope_yarn_log_mul
* better ppl (tested on 14B-Instruct)
* Add Ministral3 support to Mistral format
* improve arch handling
* add sizes
* Apply suggestions from code review
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* nits
---------
Co-authored-by: Julien Denize <julien.denize@mistral.ai>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-01 12:26:52 +01:00
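The rope_yarn_log_mul TODO above concerns the DeepSeek-style variant of the YaRN magnitude term, where a per-model multiplier scales the log correction. A sketch under that assumption; note that the stored log_mul typically absorbs YaRN's 0.1 factor (the conversion scripts write 0.1 * mscale_all_dim):

    #include <cmath>

    // DeepSeek-style YaRN magnitude applied to attention.
    float yarn_mscale(float attn_factor, float log_mul, float freq_scale) {
        return attn_factor * (1.0f + log_mul * logf(1.0f / freq_scale));
    }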
Piotr Wilkin (ilintar)
ff55414c42
model : Qwen3 Next ( #16095 )
...
* Qwen3 Next - cleaned up version
* Whitespaces and stuff
* Correct minor errors
* Update src/llama-model.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Misc. fixes.
* Clean up code, add missing hybrid qualifier
* Did someone transpose the SOLVE_TRI result matrix? Perhaps...
* Whitespace
* Proper tensors for cb calls
* Use llama-graph.h vertical alignment
* BROKEN: chunking
* Set new tensors as inputs.
* Proper chunk logic
* It's the circle of life...
* More shenanigans for n_seq > 1
* Nail in the coffin?
* Fix Windows build
* Eh, one fails on Windows, the other fails on Mac... just use general capture.
* quant : cleanup
* model : cleanup
* qwen3 : cleanup
* cont : cleanup
* cont : cleanup
* ggml : revert change
* qwen3 : cleanup
* cont : cleanup
* Re-add cmath
* qwen3 : fix typo
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Usual suspects
* fix my bad suggestion
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-28 12:02:56 +01:00
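The SOLVE_TRI mentioned in the log is a triangular solve, which the chunked delta-rule computation in Qwen3 Next relies on; the "transpose" remark comes down to row- versus column-major layout of the result. A minimal forward-substitution sketch for the lower-triangular case:

    // Solve L * x = b for lower-triangular L (row-major, n x n) by forward
    // substitution; assumes a non-singular diagonal.
    void solve_tri_lower(const float * L, const float * b, float * x, int n) {
        for (int i = 0; i < n; i++) {
            float s = b[i];
            for (int j = 0; j < i; j++) {
                s -= L[i*n + j] * x[j];
            }
            x[i] = s / L[i*n + i];
        }
    }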
Georgi Gerganov
6783b11fb0
models : fix LFM2 tensors ( #17548 )
2025-11-27 16:04:29 +02:00
william pan
4902eebe33
models : Added support for RND1 Diffusion Language Model ( #17433 )
...
* Converted RND1 model to GGUF weights
* RND1 llama.cpp support v1
* RND1 llama.cpp support v2: fix non-causal bug
* RND1 llama.cpp support v3: documentation
* RND1 llama.cpp support v4: clean up code
* linting issues
* RND1 pr fixes v1
* RND1 pr fixes v2
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Diffusion documentation edits
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-24 14:16:56 +08:00
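The non-causal bug above reflects that RND1, as a diffusion LM, attends bidirectionally instead of using a causal mask. A sketch of the difference as an additive mask (0 = attend, -INF = blocked):

    #include <cmath>

    // Fill an n x n additive attention mask; causal blocks future positions
    // (j > i), while a diffusion model uses the non-causal all-zero mask.
    void build_attn_mask(float * mask, int n, bool causal) {
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                mask[i*n + j] = (causal && j > i) ? -INFINITY : 0.0f;
            }
        }
    }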
ubergarm
23bc779a6e
model : detect GigaChat3-10B-A1.8B as deepseek lite ( #17420 )
...
* Detect GigaChat3-10B-A1.8B as deepseek lite
Hardcodes a check on the number of layers to detect the lite version of deepseek.
* Add comment identifying deepseek lite variants
deepseek lite variants include DeepSeek-V2-Lite and GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
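A minimal sketch of the hardcoded heuristic this commit describes; 27 is DeepSeek-V2-Lite's layer count, and the commit folds GigaChat3-10B-A1.8B into the same check (its exact count is not stated in the log, so treat the constant as illustrative):

    #include <cstdint>

    // "Lite" DeepSeek variants are detected purely by layer count
    // (DeepSeek-V2-Lite, GigaChat3-10B-A1.8B per the commit comment).
    bool is_deepseek_lite(uint32_t n_layer) {
        return n_layer == 27; // illustrative; see the commit for the exact check
    }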
Bartowski
e1fcf8b09b
model : add AfmoeForCausalLM support ( #16477 )
...
* Add AFMOE model support
* Update to vocab
* Add model sizing
* Undo Rope change for ARCEE model
* Address review comments
* Update modeling code: is_sliding -> use_rope, replace hard-coded logic
* Fix AFMOE tokenizer
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update AFMoE tokenizer class identification to be more distinctive
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-14 13:54:10 +01:00
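On "is_sliding -> use_rope": a sketch of deriving a per-layer RoPE flag from a repeating layer pattern rather than hard-coded indices. The every-fourth-layer-global pattern is an assumption for illustration, borrowed from other interleaved-attention models:

    #include <cstdint>

    // Assumption: sliding-window (local) layers use RoPE, and every
    // pattern-th layer is a global-attention layer without it.
    bool layer_uses_rope(uint32_t il, uint32_t pattern = 4) {
        return (il + 1) % pattern != 0;
    }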
Sigbjørn Skjæret
7bef684118
models : move build_inp_out_ids outside loop ( #17151 )
...
* move build_inp_out_ids outside loop
* realign
2025-11-10 22:55:30 +01:00
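The refactor above hoists a loop-invariant input out of the per-layer graph loop. A stand-in sketch of the shape of the change, with placeholder types rather than llama.cpp's real ones:

    struct tensor { int id; };

    tensor build_inp_out_ids() { return tensor{0}; } // placeholder builder

    void build_graph(int n_layer) {
        // after the refactor: built once, before the loop
        tensor inp_out_ids = build_inp_out_ids();
        for (int il = 0; il < n_layer; ++il) {
            // previously the builder ran here on every iteration, although
            // its result does not depend on il
            (void) inp_out_ids;
        }
    }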
Sigbjørn Skjæret
9008027aa3
hparams : add n_embd_inp() to support extended embed ( #16928 )
...
* add n_embd_full to support extended embed
* don't change output
* rename to n_embd_inp
* restore n_embd where applicable
2025-11-07 19:27:58 +01:00
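A sketch of what the new accessor can look like; the point is that a model's input embedding width may exceed the residual width n_embd while outputs stay unchanged. The extra-width member is a hypothetical name:

    #include <cstdint>

    struct hparams_sketch {
        uint32_t n_embd       = 4096; // residual stream width
        uint32_t n_embd_extra = 0;    // hypothetical extended-input width

        // input embedding width; equals n_embd for regular models
        uint32_t n_embd_inp() const { return n_embd + n_embd_extra; }
    };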
Li Pengzhan
9f052478c2
model : add openPangu-Embedded ( #16941 )
...
* Model: add openPangu-Embedded
* fixed according to reviewer's comments
* fixed the chat template check condition
* Apply suggestions from code review
change the chat-template check condition and fix some formatting issues
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* whitespace cleanup
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-05 10:28:58 +01:00
Sigbjørn Skjæret
b164259bba
chore : fix models indent after refactor ( #16992 )
2025-11-04 12:29:15 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp ( #16252 )
...
* Squashed: llama-model.cpp refactoring
* Fix formatting of attn / ffn / ffn_moe calls
* Fix import regression / unify spacing in models.h
* totally DID NOT miss those!
* Add missing qwen3vl(moe) models
* Add missing new .cpp files to build
* Remove extra semicolons
* Editor checker
* Update src/models/models.h
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00