bluebread
d0c08e36a5
mtmd: minor fix
2025-12-05 04:03:56 +00:00
Saba Fallah
66341666fb
Merge branch 'master' into sf/deepseek-ocr
...
# Conflicts:
# convert_hf_to_gguf.py
# tools/mtmd/clip.h
# tools/mtmd/mtmd.cpp
2025-12-02 21:02:13 +01:00
Xuan-Son Nguyen
cd3c118908
model: support Ministral3 (#17644)
...
* conversion script
* support ministral 3
* maybe this is better?
* add TODO for rope_yarn_log_mul
* better ppl (tested on 14B-Instruct)
* Add Ministral3 support to Mistral format
* improve arch handling
* add sizes
* Apply suggestions from code review
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* nits
---------
Co-authored-by: Julien Denize <julien.denize@mistral.ai>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-01 12:26:52 +01:00
Saba Fallah
ed3b7f1056
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
...
# Conflicts:
# convert_hf_to_gguf.py
# src/llama-model.cpp
# src/models/deepseek2.cpp
2025-11-30 08:29:09 +01:00
Piotr Wilkin (ilintar)
ff55414c42
model : Qwen3 Next (#16095)
...
* Qwen3 Next - cleaned up version
* Whitespaces and stuff
* Correct minor errors
* Update src/llama-model.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Misc. fixes.
* Clean up code, add missing hybrid qualifier
* Did someone transpose the SOLVE_TRI result matrix? Perhaps...
* Whitespace
* Proper tensors for cb calls
* Use llama-graph.h vertical alignment
* BROKEN: chunking
* Set new tensors as inputs.
* Proper chunk logic
* It's the circle of life...
* More shenanigans for n_seq > 1
* Nail in the coffin?
* Fix Windows build
* Eh, one fails on Windows, the other fails on Mac... just use general capture.
* quant : cleanup
* model : cleanup
* qwen3 : cleanup
* cont : cleanup
* cont : cleanup
* ggml : revert change
* qwen3 : cleanup
* cont : cleanup
* Re-add cmath
* qwen3 : fix typo
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Usual suspects
* fix my bad suggestion
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-28 12:02:56 +01:00
Georgi Gerganov
6783b11fb0
models : fix LFM2 tensors (#17548)
2025-11-27 16:04:29 +02:00
william pan
4902eebe33
models : Added support for RND1 Diffusion Language Model (#17433)
...
* Converted RND1 model to GGUF weights
* RND1 llama.cpp support v1
* RND1 llama.cpp support v2 non causal bug
* RND1 llama.cpp support v3 documentation
* RND1 llama.cpp support v4 clean code
* linting issues
* RND1 pr fixes v1
* RND1 pr fixes v2
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Diffusion documentation edits
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-24 14:16:56 +08:00
ubergarm
23bc779a6e
model : detect GigaChat3-10B-A1.8B as deepseek lite (#17420)
...
* Detect GigaChat3-10B-A1.8B as deepseek lite
Hardcodes a check on the number of layers to detect the lite version of deepseek.
* Add comment identifying deepseek lite variants
deepseek lite variants include DeepSeek-V2-Lite, GigaChat3-10B-A1.8B
2025-11-21 14:51:38 +01:00
bluebread
6c0715befc
fix: update callback for ffn_moe_weighted and add callback for attn_out in deepseek2 model
2025-11-18 06:19:38 +00:00
bluebread
2de3436705
mtmd: Fix RoPE type for DeepSeek-OCR LM.
2025-11-17 08:44:29 +00:00
bluebread
76305878d5
mtmd: successfully runs DeepSeek-OCR LM in llama-cli
2025-11-16 08:45:08 +00:00
bluebread
eab28ed318
mtmd: add DeepSeek-OCR LM support with standard attention
2025-11-15 17:28:18 +00:00
Bartowski
e1fcf8b09b
model : add AfmoeForCausalLM support (#16477)
...
* Add AFMOE model support
* Update to vocab
* Add model sizing
* Undo Rope change for ARCEE model
* Address review comments
* Update modeling code is_sliding -> use_rope, replace hard-coded logic
* Fix AFMOE tokenizer
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update convert_hf_to_gguf.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update AFMoE tokenizer class identification to be more unique
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-14 13:54:10 +01:00
Sigbjørn Skjæret
7bef684118
models : move build_inp_out_ids outside loop (#17151)
...
* move build_inp_out_ids outside loop
* realign
2025-11-10 22:55:30 +01:00
Sigbjørn Skjæret
9008027aa3
hparams : add n_embd_inp() to support extended embed (#16928)
...
* add n_embd_full to support extended embed
* don't change output
* rename to n_embd_inp
* restore n_embd where applicable
2025-11-07 19:27:58 +01:00
Li Pengzhan
9f052478c2
model : add openPangu-Embedded (#16941)
...
* Model: add openPangu-Embedded
* fixed according to reviewer's comments
* fixed the chat template check condition
* Apply suggestions from code review
change the chat-template check condition and fix some formatting issues
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* whitespace cleanup
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-11-05 10:28:58 +01:00
Sigbjørn Skjæret
b164259bba
chore : fix models indent after refactor (#16992)
2025-11-04 12:29:15 +01:00
Piotr Wilkin (ilintar)
bea04522ff
refactor : llama-model.cpp (#16252)
...
* Squashed: llama-model.cpp refactoring
* Fix formatting of attn / ffn / ffn_moe calls
* Fix import regression / unify spacing in models.h
* totally DID NOT miss those!
* Add missing qwen3vl(moe) models
* Add missing new .cpp files to build
* Remove extra semicolons
* Editor checker
* Update src/models/models.h
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-10-31 23:40:23 +01:00