Yee Man Chan
e3542ff8a2
fixed some comments
2026-01-06 11:35:25 +08:00
Yee Man Chan
cfed14e31b
naive chunking form implemented
2026-01-06 11:23:53 +08:00
Yee Man Chan
aba181ebad
removed LOG_INFO
2026-01-05 19:21:06 +08:00
Yee Man Chan
66c0c5d8d4
Kimi Linear backend agnostic
2026-01-05 16:35:19 +08:00
Yee Man Chan
a0269af292
removed all hard code
2025-12-06 11:51:16 +08:00
Yee Man Chan
9f1265fec1
removed some hard coded code
2025-12-05 19:51:02 +08:00
Yee Man Chan
84f822c5a5
kimi linear convert_hf_to_gguf
2025-12-02 08:51:09 +08:00
Yee Man Chan
27baad43d5
kimi linear model implementation
2025-12-02 08:35:14 +08:00
Xuan-Son Nguyen
cd3c118908
model: support Ministral3 ( #17644 )
...
* conversion script
* support ministral 3
* maybe this is better?
* add TODO for rope_yarn_log_mul
* better ppl (tested on 14B-Instruct)
* Add Ministral3 support to Mistral format
* improve arch handling
* add sizes
* Apply suggestions from code review
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* nits
---------
Co-authored-by: Julien Denize <julien.denize@mistral.ai>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-01 12:26:52 +01:00
Piotr Wilkin (ilintar)
ff55414c42
model : Qwen3 Next ( #16095 )
...
* Qwen3 Next - cleaned up version
* Whitespaces and stuff
* Correct minor errors
* Update src/llama-model.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Misc. fixes.
* Clean up code, add missing hybrid qualifier
* Did someone transpose the SOLVE_TRI result matrix? Perhaps...
* Whitespace
* Proper tensors for cb calls
* Use llama-graph.h vertical alignment
* BROKEN: chunking
* Set new tensors as inputs.
* Proper chunk logic
* It's the circle of life...
* More shenanigans for n_seq > 1
* Nail in the coffin?
* Fix Windows build
* Eh, one fails on Windows, the other fails on Mac... just use general capture.
* quant : cleanup
* model : cleanup
* qwen3 : cleanup
* cont : cleanup
* cont : cleanup
* ggml : revert change
* qwen3 : cleanup
* cont : cleanup
* Readd cmath
* qwen3 : fix typo
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Usual suspects
* fix my bad suggestion
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-28 12:02:56 +01:00
Georgi Gerganov
6783b11fb0
models : fix LFM2 tensors ( #17548 )
2025-11-27 16:04:29 +02:00
william pan
4902eebe33
models : Added support for RND1 Diffusion Language Model ( #17433 )
...
* Converted RND1 model to GGUF weights
* RND1 llama.cpp support v1
* RND1 llama.cpp support v2 non causal bug
* RND1 llama.cpp support v3 documentation
* RND1 llama.cpp support v4 clean code
* linting issues
* RND1 pr fixes v1
* RND1 pr fixes v2
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Diffusion documentation edits
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-24 14:16:56 +08:00
ubergarm
23bc779a6e
model : detect GigaChat3-10-A1.8B as deepseek lite ( #17420 )
...
* Detect GigaChat3-10-A1.8B as deepseek lite
Hardcodes a check on the number of layers to detect the lite version of deepseek.
* Add comment identifying deepseek lite variants
deepseek lite variants include DeepSeek-V2-Lite and GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
Bartowski
e1fcf8b09b
model : add AfmoeForCausalLM support ( #16477 )
...
* Add AFMOE model support
* Update to vocab
* Add model sizing
* Undo Rope change for ARCEE model
* Address review comments
* Update modeling code is_sliding -> use_rope, replace hard-coded logic
* Fix AFMOE tokenizer
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update AFMoE tokenizer class identification to be more unique
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-14 13:54:10 +01:00
Sigbjørn Skjæret
7bef684118
models : move build_inp_out_ids outside loop ( #17151 )
...
* move build_inp_out_ids outside loop
* realign
2025-11-10 22:55:30 +01:00
Sigbjørn Skjæret
9008027aa3
hparams : add n_embd_inp() to support extended embed ( #16928 )
...
* add n_embd_full to support extended embed
* don't change output
* rename to n_embd_inp
* restore n_embd where applicable
2025-11-07 19:27:58 +01:00
Li Pengzhan
9f052478c2
model : add openPangu-Embedded ( #16941 )
...
* Model: add openPangu-Embedded
* fixed according to reviewer's comments
* fixed the chat template check condition
* Apply suggestions from code review
change the chat-template check condition and fix some formatting issues
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* whitespace cleanup
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-05 10:28:58 +01:00
Sigbjørn Skjæret
b164259bba
chore : fix models indent after refactor ( #16992 )
2025-11-04 12:29:15 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp ( #16252 )
...
* Squashed: llama-model.cpp refactoring
* Fix formatting of attn / ffn / ffn_moe calls
* Fix import regression / unify spacing in models.h
* totally DID NOT miss those!
* Add missing qwen3vl(moe) models
* Add missing new .cpp files to build
* Remove extra semicolons
* Editor checker
* Update src/models/models.h
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00