Commit Graph

172 Commits

Author SHA1 Message Date
bluebread 5a741fda55 mtmd: format code 2025-12-17 03:26:38 +00:00
Saba Fallah 87e4a00c4c minor
- added GLM-4.6V to big tests
- added missing deps for python test
2025-12-16 17:28:46 +01:00
Saba Fallah 00d235700d Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
# Conflicts:
#	src/llama-arch.cpp
2025-12-16 16:45:43 +01:00
Saba Fallah 512b2c8fe4 merge with changes from https://github.com/ggml-org/llama.cpp/pull/18042 2025-12-16 14:07:04 +01:00
Saba Fallah 51c3de6887 Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
# Conflicts:
#	gguf-py/gguf/constants.py
#	gguf-py/gguf/tensor_mapping.py
#	tools/mtmd/clip-impl.h
#	tools/mtmd/clip.cpp
#	tools/mtmd/models/models.h
2025-12-16 12:16:25 +01:00
Xuan-Son Nguyen 7b1db3d3b7
arg: clarify auto kvu/np being set on server (#17997)
* arg: clarify auto kvu/np being set on server

* improve docs

* use invalid_argument
2025-12-16 12:01:27 +01:00
Xuan-Son Nguyen 3d86c6c2b5
model: support GLM4V vision encoder (#18042)
* convert ok

* no deepstack

* less new tensors

* cgraph ok

* add mrope for text model

* faster patch merger

* add GGML_ROPE_TYPE_MRNORM

* add support for metal

* move glm4v to dedicated graph

* convert: add norm_embd

* clip: add debugging fn

* working correctly

* fix style

* use bicubic

* fix mrope metal

* improve cpu

* convert to neox ordering on conversion

* revert backend changes

* force stop if using old weight

* support moe variant

* fix conversion

* fix convert (2)

* Update tools/mtmd/clip-graph.h

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* process mrope_section on TextModel base class

* resolve conflict merge

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-12-16 11:25:26 +01:00
Saba Fallah 4a4f82968c
Merge branch 'ggml-org:master' into sf/deepseek-ocr 2025-12-16 09:09:52 +01:00
Xuan-Son Nguyen 96a181a933
mtmd: refactor audio preprocessing (#17978)
* mtmd: refactor audio preprocessing

* refactor

Co-authored-by: Tarek <tdakhran@users.noreply.github.com>

* wip

* wip (2)

* improve constructor

* fix use_natural_log

* fix padding for short input

* clean up

* remove need_chunking

---------

Co-authored-by: Tarek <tdakhran@users.noreply.github.com>
2025-12-15 14:16:52 +01:00
Saba Fallah 8ad98ee6f5 editorconfig-check fix 2025-12-15 10:40:09 +01:00
Saba Fallah b3bf8cba05 Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
# Conflicts:
#	convert_hf_to_gguf.py
2025-12-15 10:19:50 +01:00
piDack 745fa0e78b
model : add glm-asr support (#17901)
* [model] add glm-asr support

* fix format for ci

* fix convert format for ci

* update glm_asr convert script & use build_ffn for glm_asr clip & use build_stack for padding and review

* check root architecture for convert hf script

* fix conflict with upstream

* fix convert script for glm asr & format clip-impl

* format

* restore hparams text

* improved conversion

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-15 03:18:46 +01:00
Saba Fallah 7f8621c5fb minor formatting 2025-12-14 16:44:23 +01:00
Saba Fallah 3fc61d4814
Merge pull request #10 from sfallah/sf/deepseek-ocr-test-script
python test script for deepseek-ocr
testing OCR on text-1.jpeg newspaper image
checking against expected reference model output for Free-OCR and Markdown
2025-12-14 16:42:27 +01:00
Saba Fallah dc2066e535 check with fixed expected results 2025-12-14 16:32:36 +01:00
Haowei Wu 37f5a1093b
mtmd: enhance image resizing in llava_uhd (#18014) 2025-12-14 15:57:52 +01:00
Saba Fallah 6c36c03815 minor formatting fixes 2025-12-14 15:14:32 +01:00
Georgi Gerganov 254098a279
common : refactor common_sampler + grammar logic changes (#17937)
* common : refactor common_sampler + grammar logic changes

* tests : increase max_tokens to get needed response

* batched : fix uninitialized samplers
2025-12-14 10:11:13 +02:00
Saba Fallah fb3bb6aabb added deepseek-ocr test to tests.sh 2025-12-13 17:37:58 +01:00
Saba Fallah f7736f23ef refactoring, one single builder function and static helpers 2025-12-13 17:13:32 +01:00
Saba Fallah f95a6fe9f3 quick and (potentially) dirty merge with https://github.com/ggml-org/llama.cpp/pull/17909 2025-12-13 13:52:46 +01:00
Saba Fallah e0e69fd3fb Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr-merge_#17965
# Conflicts:
#	src/llama-kv-cache.cpp
#	tools/mtmd/clip.cpp
2025-12-13 10:59:46 +01:00
Xuan-Son Nguyen e39a2ce66d
clip: move model cgraphs into their own files (#17965)
* clip: move model cgraphs into their own files

* more explicit enums

* fix linux build

* fix naming

* missing headers

* nits: add comments for contributors
2025-12-12 21:14:48 +01:00
Xuan-Son Nguyen 17158965ac
mtmd: explicitly forbidden inclusion of private header and libcommon (#17946) 2025-12-12 15:16:06 +01:00
Saba Fallah 47f0fee6c9 testing deepseek-ocr
quick and dirty test script comparing results of Qwen2.5-VL vs DeepSeek-OCR
2025-12-11 17:00:11 +01:00
Saba Fallah 4cbbe8ab6f minor: editorconfig-check fix 2025-12-11 10:32:38 +01:00
Saba Fallah d70f171fac merge with changes from https://github.com/ggml-org/llama.cpp/pull/17909
added new opt to tests.sh to disable flash-attn
2025-12-11 10:11:27 +01:00
Saba Fallah 33fabf0bd8 Merge branch 'master' into sf/deepseek-ocr-merge-test
# Conflicts:
#	tools/mtmd/clip.cpp
#	tools/mtmd/mtmd-cli.cpp
2025-12-11 08:13:50 +01:00
Saba Fallah aaf2fd17bb minor: editorconfig-check fix 2025-12-11 07:31:08 +01:00
Xuan-Son Nguyen c6b2c9310c
mtmd: some small clean up (#17909)
* clip: add support for fused qkv in build_vit

* use build_ffn whenever possible

* fix internvl

* mtmd-cli: move image to beginning

* test script: support custom args
2025-12-10 22:20:06 +01:00
Xuan-Son Nguyen 34a6d86982
cli: enable jinja by default (#17911)
* cli: enable jinja by default

* Update common/arg.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2025-12-10 22:19:42 +01:00
Saba Fallah ed944cd25b fix: test-1.jpg OCR issue with small (640) resolution
setting min-resolution to base (1024) and max to large (1280) for dynamic-resolution
2025-12-10 20:20:55 +01:00
Georgi Gerganov 4dff236a52
ggml : remove GGML_KQ_MASK_PAD constant (#17910)
* ggml : remove GGML_KQ_MASK_PAD constant

* cont : remove comment
2025-12-10 20:53:16 +02:00
Xuan-Son Nguyen 6c2131773c
cli: new CLI experience (#17824)
* wip

* wip

* fix logging, add display info

* handle commands

* add args

* wip

* move old cli to llama-completion

* rm deprecation notice

* move server to a shared library

* move ci to llama-completion

* add loading animation

* add --show-timings arg

* add /read command, improve LOG_ERR

* add args for speculative decoding, enable show timings by default

* add arg --image and --audio

* fix windows build

* support reasoning_content

* fix llama2c workflow

* color default is auto

* fix merge conflicts

* properly fix color problem

Co-authored-by: bandoti <bandoti@users.noreply.github.com>

* better loading spinner

* make sure to clean color on force-exit

* also clear input files on "/clear"

* simplify common_log_flush

* add warning in mtmd-cli

* implement console writer

* fix data race

* add attribute

* fix llama-completion and mtmd-cli

* add some notes about console::log

* fix compilation

---------

Co-authored-by: bandoti <bandoti@users.noreply.github.com>
2025-12-10 15:28:59 +01:00
bluebread 016140699f mtmd: remove tweak to llama-mtmd-cli & deepseek-ocr template 2025-12-09 16:31:44 +00:00
Rhys-T 63908b631a
cmake: fix Mach-O current version number (#17877)
PR #17091 set the VERSION of various libraries to 0.0.abcd, where abcd
is the LLAMA_BUILD_NUMBER. That build number is too large to fit in the
Mach-O 'current version' field's 'micro' part, which only goes up to
255. This just sets the Mach-O current version to 0 to get it building
properly again.

Fixes #17258.
2025-12-09 13:17:41 +02:00
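Note on the Mach-O limit described in the commit message above: the following is a minimal sketch, assuming the standard Mach-O dylib version encoding (one 32-bit value holding a 16-bit major, an 8-bit minor, and an 8-bit micro part). The function names and the sample build number 7000 are hypothetical and used only for illustration; this is not the CMake change itself.

```python
# Sketch of the Mach-O "current version" encoding (X.Y.Z packed into 32 bits).
# Assumption: 16 bits for major, 8 bits for minor, 8 bits for micro.

def fits_macho_version(major: int, minor: int, micro: int) -> bool:
    """Return True if every part fits in its Mach-O version field."""
    return 0 <= major <= 0xFFFF and 0 <= minor <= 0xFF and 0 <= micro <= 0xFF


def pack_macho_version(major: int, minor: int, micro: int) -> int:
    """Pack X.Y.Z into a single 32-bit integer, or raise if a part overflows."""
    if not fits_macho_version(major, minor, micro):
        raise ValueError(f"{major}.{minor}.{micro} does not fit in a Mach-O version")
    return (major << 16) | (minor << 8) | micro


hypothetical_build_number = 7000  # illustrative value, larger than 255

# A version of the form 0.0.<build number> overflows the 8-bit micro part.
print(fits_macho_version(0, 0, hypothetical_build_number))  # False

# Setting the current version to 0 (that is, 0.0.0) always fits.
print(pack_macho_version(0, 0, 0))  # 0
```

Any build number above 255 fails the check, which matches the commit's reasoning for falling back to a current version of 0.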
bluebread 5174a1e69a mtmd: minor fix 2025-12-08 04:54:19 +00:00
bluebread 48c6cf2132 mtmd: convert model in FP16 2025-12-08 02:36:00 +00:00
bluebread 53273f83f8 mtmd: fixed wrong input setting 2025-12-07 23:58:22 +00:00
bluebread 5dfcc5abb1 mtmd: add detailed comments for resize_bicubic_pillow 2025-12-07 10:15:09 +00:00
bluebread 2d918b3e21 mtmd: make sam hparams configurable 2025-12-06 06:55:53 +00:00
bluebread 15f2ada0ed mtmd: simplify get_rel_pos 2025-12-06 06:32:41 +00:00
Saba Fallah 705394c27a minor editorconfig-check fixes 2025-12-05 13:27:52 +01:00
Saba Fallah d981f19e9d minor editorconfig-check fixes 2025-12-05 13:18:15 +01:00
Saba Fallah 5f2ee1aecf
Merge branch 'ggml-org:master' into sf/deepseek-ocr 2025-12-05 11:56:06 +01:00
Saba Fallah f5bd310a5e minor formatting and style 2025-12-05 09:30:58 +01:00
Saba Fallah 076138a428 corrected code-branch when flash-attn disabled
enabling usage of --flash-attn option
2025-12-04 23:45:59 +01:00
Saba Fallah 5381b9cf63 using common build_attn in sam 2025-12-04 23:13:29 +01:00
bluebread fc3f625fef mtmd: support combined QKV projection in build_vit 2025-12-04 17:57:43 +00:00
Saba Fallah a661c52990 reverting automatically removed spaces 2025-12-04 16:12:41 +01:00