bluebread
841a4a88df
mtmd: debug CLIP-L & first working DeepSeek-OCR model
2025-11-29 16:40:50 +00:00
bluebread
ccb2f2385e
mtmd: debug CLIP-L (vit_pre_ln)
2025-11-29 07:04:14 +00:00
bluebread
a488b495f7
mtmd: SAM numerically works
2025-11-29 02:17:49 +00:00
bluebread
81533e494e
mtmd: fix dangling pointer
2025-11-24 09:02:03 +00:00
bluebread
40e7e6e706
mtmd: quick fix token order
2025-11-24 08:16:32 +00:00
Saba Fallah
206f8abc3c
- dynamic resizing
...
- changes are concerning PR https://github.com/sfallah/llama.cpp/pull/4
2025-11-23 20:27:02 +01:00
Saba Fallah
6dfda99c69
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr
2025-11-23 12:29:37 +01:00
bluebread
3f71188303
mtmd: correct token order
2025-11-23 09:22:00 +00:00
Saba Fallah
4cfa15fcd7
- image encoding debugged
...
- issues fixed, mainly related to wrong config values such as n_patches
- configs need to be corrected in the converter
2025-11-22 16:57:34 +01:00
bluebread
ee8a1488f9
mtmd: add native resolution support
2025-11-22 15:48:13 +00:00
Saba Fallah
3fcfc3ace9
Merge pull request #3 from bluebread/sf/deepseek-ocr
...
Fixed get_rel_pos & add_rel_pos_inplace operator
2025-11-22 09:33:15 +01:00
bluebread
f8f66a151b
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr
2025-11-22 02:22:48 +00:00
bluebread
effe66958e
mtmd: minor changes
2025-11-22 02:09:37 +00:00
Saba Fallah
86f111f8b7
image encoding technically works, but the output can't be checked since image decoding fails
2025-11-21 20:42:14 +01:00
bluebread
7b8d735c90
mtmd: fixed the wrong scaler for get_rel_pos
2025-11-21 18:04:01 +00:00
bluebread
7e9fbeccc5
mtmd: fix get_rel_pos
2025-11-21 17:12:12 +00:00
bluebread
5e6cf3c6a8
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr
2025-11-21 15:36:45 +00:00
bluebread
8bce66d5f2
clip: fixed warnings
2025-11-21 15:28:37 +00:00
Saba Fallah
68b206b65c
sam implementation without using CPU-only ops
2025-11-21 15:29:39 +01:00
Saba Fallah
88032f46b1
window partitioning using standard ggml ops
2025-11-20 10:07:54 +01:00
Saba Fallah
89afda8da9
visual_model warmup (technically) works
2025-11-18 10:26:32 +01:00
Saba Fallah
63a042f21e
concat image_newline and image_seperator tokens
2025-11-18 09:43:11 +01:00
Saba Fallah
331cea8f8e
corrected combining of image encoders' results
2025-11-18 05:59:37 +01:00
Saba Fallah
8b3d319c03
clip-vit: corrected cls_embd concat
2025-11-17 20:57:51 +01:00
Saba Fallah
cec9a5c6e0
sam erroneous return corrected
2025-11-17 18:59:40 +01:00
Saba Fallah
790bbb97d8
sam warmup working
2025-11-17 15:27:00 +01:00
Saba Fallah
97e0907c5b
loading LM
...
testing Vision model loading
2025-11-17 11:07:33 +01:00
Saba Fallah
2aab52e2c4
deepseek-ocr clip-vit model impl
2025-11-15 15:30:07 +01:00
Saba Fallah
b6b9f02c8a
loading sam tensors
2025-11-14 20:51:48 +01:00
Saba Fallah
43a130b4d0
mtmd: llama.cpp DeepSeekOCR support
...
init commit
2025-11-14 12:40:20 +01:00
chansikpark
333f2595a3
webui: fix keyboard shortcuts for new chat & edit chat title ( #17007 )
2025-11-08 20:52:35 +01:00
Aidan
eeee367de5
server: fix correct time_ms calculation in prompt_progress ( #17093 )
...
* fix: correct time_ms calculation in send_partial_response
The time_ms field was incorrectly calculated. The division was happening
before the subtraction leading to incorrect values.
Before: (ggml_time_us() - slot.t_start_process_prompt / 1000)
After: (ggml_time_us() - slot.t_start_process_prompt) / 1000
* docs : document time_ms field in prompt_progress
2025-11-08 15:12:11 +02:00
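The operator-precedence issue described in eeee367de5 can be illustrated with a small sketch. The helper names `elapsed_ms_buggy` and `elapsed_ms_fixed` are hypothetical and not part of the server code; the sketch only assumes that both timestamps are in microseconds, as the Before/After expressions in the commit message suggest.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; both timestamps are assumed
// to be in microseconds, like the values returned by ggml_time_us().

// Before the fix: division binds tighter than subtraction, so only the
// start time is divided by 1000 and the result mixes two units.
constexpr std::int64_t elapsed_ms_buggy(std::int64_t now_us, std::int64_t t_start_us) {
    return now_us - t_start_us / 1000;
}

// After the fix: subtract first, then convert the difference to milliseconds.
constexpr std::int64_t elapsed_ms_fixed(std::int64_t now_us, std::int64_t t_start_us) {
    return (now_us - t_start_us) / 1000;
}

// Example: processing started at 2 s and the current time is 5 s.
static_assert(elapsed_ms_fixed(5000000, 2000000) == 3000, "3 s elapsed is 3000 ms");
static_assert(elapsed_ms_buggy(5000000, 2000000) == 4998000, "mixed units give a meaningless value");
```

With the parentheses in place the elapsed time is computed in microseconds and only then scaled, which is why the corrected value stays small and meaningful while the old expression grows with the absolute clock value.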
Georgi Gerganov
7956bb4d7f
bench : cache the llama_context state at computed depth ( #16944 )
...
* bench : cache llama_context state at depth
* cont : handle failures to restore the old state
* cont : print information when the state is being reused
2025-11-07 21:23:11 +02:00
Sigbjørn Skjæret
9008027aa3
hparams : add n_embd_inp() to support extended embed ( #16928 )
...
* add n_embd_full to support extended embed
* don't change output
* rename to n_embd_inp
* restore n_embd where applicable
2025-11-07 19:27:58 +01:00
Georgi Gerganov
16bcc1259d
kv-cache : pad the cache size to 256 for performance ( #17046 )
...
* kv-cache : pad the size of the small SWA cache for performance
* context : pad the total context to 256
* cont : future-proof the swa pad
* server : adjust test params to new logic
2025-11-07 20:03:25 +02:00
Georgi Gerganov
8c0d6bb455
server : print the samplers chain for each request ( #17070 )
2025-11-07 12:24:47 +02:00
Georgi Gerganov
b7f9010d24
server : disable checkpoints with mtmd ( #17045 )
2025-11-06 12:09:29 +02:00
Xuan-Son Nguyen
4882f0ff78
clip: implement minicpm-v sinusoidal embd using GGML ( #17036 )
...
* clip: implement minicpm-v sinusoidal embd using GGML
* fix repeat op
2025-11-06 11:02:54 +01:00
Xuan-Son Nguyen
92bb84f775
mtmd: allow QwenVL to process larger image by default ( #17020 )
2025-11-05 14:26:49 +01:00
Georgi Gerganov
13b339bcd9
server : do not default to multiple slots with speculative decoding ( #17017 )
...
* server : do not default to multiple slots with speculative decoding
* cont : fix
2025-11-05 14:32:55 +02:00
Xuan-Son Nguyen
2f0c2db43e
mtmd: improve struct initialization ( #16981 )
2025-11-05 11:26:37 +01:00
손희준
fd2f84f468
docs: Clarify the endpoint that webui uses ( #17001 )
2025-11-05 11:20:28 +01:00
Georgi Gerganov
66d8eccd42
server : do context shift only while generating ( #17000 )
2025-11-04 19:21:36 +02:00
Aleksander Grygier
e7da30b584
fix: Viewing multiple PDF attachments ( #16974 )
2025-11-03 18:53:26 +01:00
Georgi Gerganov
48bd26501b
server : add props.model_alias ( #16943 )
...
* server : add props.model_alias
* webui : npm run format
2025-11-03 14:38:23 +01:00
Xuan-Son Nguyen
070ff4d535
mtmd: add --image-min/max-tokens ( #16921 )
2025-11-03 11:11:18 +01:00
Xuan-Son Nguyen
bf7b0c9725
mtmd: pad mask for qwen2.5vl ( #16954 )
...
* mtmd: pad mask for qwen2.5vl
* improve
2025-11-03 10:25:55 +01:00
Sascha Rogmann
bcfa87622a
feat(webui): improve LaTeX rendering with currency detection ( #16508 )
...
* webui : Revised LaTeX formula recognition
* webui : Further examples containing amounts
* webui : vitest for maskInlineLaTeX
* webui: Moved preprocessLaTeX to lib/utils
* webui: LaTeX in table-cells
* chore: update webui build output (use theirs)
* webui: backslash in LaTeX-preprocessing
* chore: update webui build output
* webui: look-behind backslash-check
* chore: update webui build output
* Apply suggestions from code review
Code maintenance (variable names, code formatting, string handling)
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
* webui: Moved constants to lib/constants.
* webui: package woff2 inside base64 data
* webui: LaTeX-line-break in display formula
* chore: update webui build output
* webui: Bugfix (font embedding)
* webui: Bugfix (font embedding)
* webui: vite embeds assets
* webui: don't suppress 404 (fonts)
* refactor: KaTeX integration with SCSS
Moves KaTeX styling to SCSS for better customization and font embedding.
This change includes:
- Adding `sass` as a dev dependency.
- Introducing a custom SCSS file to override KaTeX variables and disable TTF/WOFF fonts, relying solely on WOFF2 for embedding.
- Adjusting the Vite configuration to resolve `katex-fonts` alias and inject SCSS variables.
* fix: LaTeX processing within blockquotes
* webui: update webui build output
---------
Co-authored-by: Aleksander Grygier <aleksander.grygier@gmail.com>
2025-11-03 00:41:08 +01:00
Zhiyong Wang
6b9a52422b
model: add Janus Pro for image understanding ( #16906 )
...
* Add support for Janus Pro
* Update gguf-py/gguf/tensor_mapping.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update gguf-py/gguf/tensor_mapping.py
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Address reviewer suggestions
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Add JANUS_PRO constant
* Update clip model handling
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
* Update tools/mtmd/clip.cpp
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
* Refactor JANUS_PRO handling in clip.cpp
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
* Update tools/mtmd/clip.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* em whitespace
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Co-authored-by: Xuan-Son Nguyen <son@huggingface.co>
Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-11-02 22:08:04 +01:00
Georgi Gerganov
2f966b8ed8
clip : use FA ( #16837 )
...
* clip : use FA
* cont : add warning about unsupported ops
* implement "auto" mode for clip flash attn
* clip : print more detailed op support info during warmup
* cont : remove obsolete comment [no ci]
* improve debugging message
* trailing space
* metal : remove stray return
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-02 21:21:48 +01:00