Saba Fallah
a661c52990
reverting automatically removed spaces
2025-12-04 16:12:41 +01:00
Saba Fallah
c73748ab5d
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr-cleanup
...
# Conflicts:
# gguf-py/gguf/tensor_mapping.py
2025-12-04 15:09:32 +01:00
Saba Fallah
386ba479a2
clean up
2025-12-04 15:05:58 +01:00
bluebread
7451b84105
mtmd: fix tensor names for image newlines and view separator
2025-12-04 13:26:53 +00:00
bluebread
b26b507c4e
mtmd: refactor code & remove unused helper functions
2025-12-03 16:23:46 +00:00
bluebread
b696c54756
mtmd: remove --dsocr-mode argument
2025-12-03 14:54:16 +00:00
bluebread
43dfc0c8d6
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr
2025-12-03 07:52:26 +00:00
bluebread
e20857ba59
mtmd: simplify DeepSeek-OCR dynamic resolution preprocessing
2025-12-03 07:51:12 +00:00
bluebread
c914e05405
mtmd: adapt Pillow image resizing function
2025-12-03 05:18:39 +00:00
Saba Fallah
66341666fb
Merge branch 'master' into sf/deepseek-ocr
...
# Conflicts:
# convert_hf_to_gguf.py
# tools/mtmd/clip.h
# tools/mtmd/mtmd.cpp
2025-12-02 21:02:13 +01:00
Xuan-Son Nguyen
ecf74a8417
mtmd: add mtmd_context_params::warmup option ( #17652 )
...
* mtmd: add mtmd_context_params::warmup option
* reuse the common_params::warmup
2025-12-01 21:32:25 +01:00
bluebread
95239f92b9
mtmd: simplify SAM patch embedding
2025-12-01 07:31:24 +00:00
Tarek Dakhran
2ba719519d
model: LFM2-VL fixes ( #17577 )
...
* Adjust to pytorch
* Add antialiasing upscale
* Increase number of patches to 1024
* Handle default marker insertion for LFM2
* Switch to flag
* Reformat
* Cuda implementation of antialias kernel
* Change placement in ops.cpp
* consistent float literals
* Pad only for LFM2
* Address PR feedback
* Rollback default marker placement changes
* Fallback to CPU implementation for antialias implementation of upscale
2025-11-30 21:57:31 +01:00
bluebread
c5f4c64fe4
mtmd : add --dsocr-mode CLI argument for DeepSeek-OCR resolution control & all native resolution modes work
2025-11-30 16:57:19 +00:00
Xuan-Son Nguyen
7f8ef50cce
clip: fix nb calculation for qwen3-vl ( #17594 )
2025-11-30 15:33:55 +01:00
bluebread
55430945ef
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr
2025-11-30 08:55:29 +00:00
Saba Fallah
ed3b7f1056
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
...
# Conflicts:
# convert_hf_to_gguf.py
# src/llama-model.cpp
# src/models/deepseek2.cpp
2025-11-30 08:29:09 +01:00
bluebread
841a4a88df
mtmd: debug CLIP-L & first working DeepSeek-OCR model
2025-11-29 16:40:50 +00:00
bluebread
ccb2f2385e
mtmd: debug CLIP-L (vit_pre_ln)
2025-11-29 07:04:14 +00:00
bluebread
a488b495f7
mtmd: SAM numerically works
2025-11-29 02:17:49 +00:00
Han Qingzhe
1d594c295c
clip: (minicpmv) fix resampler kq_scale ( #17516 )
...
* debug:"solve minicpmv precision problem"
* “debug minicpmv”
* Apply suggestion from @ngxson
---------
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-11-26 21:44:07 +01:00
bluebread
81533e494e
mtmd: fix dangling pointer
2025-11-24 09:02:03 +00:00
bluebread
40e7e6e706
mtmd: quick fix token order
2025-11-24 08:16:32 +00:00
Saba Fallah
206f8abc3c
- dynamic resizing
...
- changes are concerning PR https://github.com/sfallah/llama.cpp/pull/4
2025-11-23 20:27:02 +01:00
Saba Fallah
6dfda99c69
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr
2025-11-23 12:29:37 +01:00
bluebread
3f71188303
mtmd: correct token order
2025-11-23 09:22:00 +00:00
Saba Fallah
4cfa15fcd7
- image encoding debugged
...
- issues fixed, mainly related to wrong config values such as n_patches
- configs need to be corrected in the converter
2025-11-22 16:57:34 +01:00
bluebread
ee8a1488f9
mtmd: add native resolution support
2025-11-22 15:48:13 +00:00
Saba Fallah
3fcfc3ace9
Merge pull request #3 from bluebread/sf/deepseek-ocr
...
Fixed get_rel_pos & add_rel_pos_inplace operator
2025-11-22 09:33:15 +01:00
bluebread
f8f66a151b
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr
2025-11-22 02:22:48 +00:00
bluebread
effe66958e
mtmd: minor changes
2025-11-22 02:09:37 +00:00
Saba Fallah
86f111f8b7
image encoding technically works but the output can't be checked since image decoding fails
2025-11-21 20:42:14 +01:00
bluebread
7b8d735c90
mtmd: fixed the wrong scaler for get_rel_pos
2025-11-21 18:04:01 +00:00
bluebread
7e9fbeccc5
mtmd: fix get_rel_pos
2025-11-21 17:12:12 +00:00
bluebread
5e6cf3c6a8
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr
2025-11-21 15:36:45 +00:00
bluebread
8bce66d5f2
clip: fixed warnings
2025-11-21 15:28:37 +00:00
Saba Fallah
68b206b65c
sam implementation without using CPU only ops
2025-11-21 15:29:39 +01:00
Saba Fallah
88032f46b1
window partitioning using standard ggml ops
2025-11-20 10:07:54 +01:00
Saba Fallah
89afda8da9
visual_model warmup (technically) works
2025-11-18 10:26:32 +01:00
Saba Fallah
63a042f21e
concat image_newline and image_seperator tokens
2025-11-18 09:43:11 +01:00
Saba Fallah
331cea8f8e
corrected combining of image encoders' results
2025-11-18 05:59:37 +01:00
Saba Fallah
8b3d319c03
clip-vit: corrected cls_embd concat
2025-11-17 20:57:51 +01:00
Saba Fallah
cec9a5c6e0
sam erroneous return corrected
2025-11-17 18:59:40 +01:00
Saba Fallah
790bbb97d8
sam warmup working
2025-11-17 15:27:00 +01:00
Saba Fallah
97e0907c5b
loading LM
...
testing Vision model loading
2025-11-17 11:07:33 +01:00
Saba Fallah
2aab52e2c4
deepseek-ocr clip-vit model impl
2025-11-15 15:30:07 +01:00
Ankur Verma
c7b7db0445
mtmd-cli: Avoid logging to stdout for model loading messages in mtmd-cli ( #17277 )
2025-11-15 12:41:16 +01:00
Saba Fallah
b6b9f02c8a
loading sam tensors
2025-11-14 20:51:48 +01:00
Xuan-Son Nguyen
9b17d74ab7
mtmd: add mtmd_log_set ( #17268 )
2025-11-14 15:56:19 +01:00
Saba Fallah
43a130b4d0
mtmd: llama.cpp DeepSeekOCR support
...
init commit
2025-11-14 12:40:20 +01:00