llama.cpp/tools/mtmd/models
Richard Davison af76639f72
model : add HunyuanOCR support (#21395)
* HunyuanOCR: add support for text and vision models

- Add HunyuanOCR vision projector (perceiver-based) with Conv2d merge
- Add separate HUNYUAN_OCR chat template (content-before-role format)
- Handle HunyuanOCR's invalid pad_token_id=-1 in converter
- Fix EOS/EOT token IDs from generation_config.json
- Support xdrope RoPE scaling type
- Add tensor mappings for perceiver projector (mm.before_rms, mm.after_rms, etc.)
- Register HunYuanVLForConditionalGeneration for both text and mmproj conversion

* fix proper mapping

* Update gguf-py/gguf/tensor_mapping.py

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* Update tools/mtmd/clip.cpp

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>

* address comments

* update

* Fix typecheck

* Update convert_hf_to_gguf.py

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-04-05 23:32:14 +02:00
cogvlm.cpp          mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
conformer.cpp       mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
deepseekocr.cpp     mtmd: Add DeepSeekOCR Support (#17400)                             2026-03-25 19:57:40 +01:00
gemma4v.cpp         model, mtmd: fix gguf conversion for audio/vision mmproj (#21309)  2026-04-02 17:10:32 +02:00
glm4v.cpp           mtmd: Add DeepSeekOCR Support (#17400)                             2026-03-25 19:57:40 +01:00
hunyuanocr.cpp      model : add HunyuanOCR support (#21395)                            2026-04-05 23:32:14 +02:00
internvl.cpp        clip: move model cgraphs into their own files (#17965)             2025-12-12 21:14:48 +01:00
kimik25.cpp         model: Add Kimi-K2.5 support (#19170)                              2026-02-11 16:47:30 +01:00
kimivl.cpp          clip: move model cgraphs into their own files (#17965)             2025-12-12 21:14:48 +01:00
llama4.cpp          mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
llava.cpp           mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
minicpmv.cpp        mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
mobilenetv5.cpp     mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
models.h            model : add HunyuanOCR support (#21395)                            2026-04-05 23:32:14 +02:00
nemotron-v2-vl.cpp  mtmd : Add Nemotron Nano 12B v2 VL support (#19547)                2026-02-14 14:07:00 +01:00
paddleocr.cpp       model: Add PaddleOCR-VL model support (#18825)                     2026-02-19 17:05:25 +01:00
pixtral.cpp         mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
qwen2vl.cpp         mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
qwen3vl.cpp         mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
siglip.cpp          mtmd: Add DeepSeekOCR Support (#17400)                             2026-03-25 19:57:40 +01:00
whisper-enc.cpp     mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00
youtuvl.cpp         mtmd: add clip_graph::build_mm() (#20751)                          2026-03-19 13:11:39 +01:00