Commit Graph

392 Commits

Author SHA1 Message Date
bluebread e20857ba59 mtmd: simplify DeepSeek-OCR dynamic resolution preprocessing 2025-12-03 07:51:12 +00:00
bluebread c914e05405 mtmd: adapt Pillow image resizing function 2025-12-03 05:18:39 +00:00
bluebread 95239f92b9 mtmd: simplify SAM patch embedding 2025-12-01 07:31:24 +00:00
bluebread c5f4c64fe4 mtmd : add --dsocr-mode CLI argument for DeepSeek-OCR resolution control & all native resolution modes work 2025-11-30 16:57:19 +00:00
bluebread 55430945ef Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr 2025-11-30 08:55:29 +00:00
Saba Fallah ed3b7f1056 Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
# Conflicts:
#	convert_hf_to_gguf.py
#	src/llama-model.cpp
#	src/models/deepseek2.cpp
2025-11-30 08:29:09 +01:00
Xuan-Son Nguyen ab49f094d2
server: move server-context to its own cpp|h (#17595)
* git mv

* add server-context.h

* add server-context.h

* clean up headers

* cont : cleanup

* also expose server_response_reader (to be used by CLI)

* fix windows build

* decouple server_routes and server_http

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-29 22:04:44 +01:00
Haiyue Wang 8c32d9d96d
server: explicitly set the function name in lambda (#17538)
As [1] explained, the real debug message will be like:
	"res    operator(): operator() : queue result stop"

Set the name explicitly, the message is easy for debugging:
	"res    operator(): recv : queue result stop"

The left "operator()" is generated by 'RES_DBG() ... __func__'

[1]: https://clang.llvm.org/extra/clang-tidy/checks/bugprone/lambda-function-name.html

Signed-off-by: Haiyue Wang <haiyuewa@163.com>
2025-11-29 18:43:29 +01:00
bluebread 841a4a88df mtmd: debug CLIP-L & first working DeepSeek-OCR model 2025-11-29 16:40:50 +00:00
Igor Smirnov 0874693b44
common : fix json schema with '\' in literals (#17307)
* Fix json schema with '\' in literals

* Add "literal string with escapes" test
2025-11-29 17:06:32 +01:00
bluebread ccb2f2385e mtmd: debug CLIP-L (vit_pre_ln) 2025-11-29 07:04:14 +00:00
bluebread a488b495f7 mtmd: SAM numerically works 2025-11-29 02:17:49 +00:00
o7si 3ce7a65c2f
server: fix: /metrics endpoint returning JSON-escaped Prometheus format (#17386)
* fix: /metrics endpoint returning JSON-escaped Prometheus format

* mod: remove string overload from ok() method
2025-11-28 19:14:00 +01:00
Fredrik Hultin ddf9f94389
server : add Anthropic Messages API support (#17570)
* server : add Anthropic Messages API support

* remove -@pytest.mark.slow from tool calling/jinja tests

* server : remove unused code and slow/skip on test_anthropic_vision_base64_with_multimodal_model in test_anthropic_api.py

* server : removed redundant n field logic in anthropic_params_from_json

* server : use single error object instead of error_array in streaming response handler for /v1/chat/completions and use unordered_set instead of set in to_json_anthropic_stream()

* server : refactor Anthropic API to use OAI conversion

* make sure basic test always go first

* clean up

* clean up api key check, add test

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2025-11-28 12:57:04 +01:00
Xuan-Son Nguyen e509411cf1
server: enable jinja by default, update docs (#17524)
* server: enable jinja by default, update docs

* fix tests
2025-11-27 01:02:50 +01:00
Han Qingzhe 1d594c295c
clip: (minicpmv) fix resampler kq_scale (#17516)
* debug:"solve minicpmv precision problem"

* “debug minicpmv”

* Apply suggestion from @ngxson

---------

Co-authored-by: Xuan-Son Nguyen <thichthat@gmail.com>
2025-11-26 21:44:07 +01:00
Pascal b1846f1c8e
webui: add rehype plugin to restore HTML in Markdown table cells (#17477)
* webui: add rehype plugin to restore HTML in Markdown table cells

The remark/rehype pipeline neutralizes inline HTML as literal text
(remarkLiteralHtml) so that XML/HTML snippets in LLM responses display
as-is instead of being rendered. This causes <br> and <ul> markup in
table cells to show as plain text.

This plugin traverses the HAST post-conversion, parses whitelisted HTML
patterns (<br>, <ul><li>) from text nodes, and replaces them with actual
HAST element nodes. For lists, adjacent siblings must be combined first
as the AST fragmentation breaks pattern matching.

Strict validation rejects malformed markup, keeping it as raw text.

* chore: update webui build output
2025-11-25 08:01:02 +01:00
Xuan-Son Nguyen b8372eecd9
server: split server.cpp code into server/common/task/queue (#17362)
* add server-task, server-common

* add server-queue

* rm redundant includes

* move enum stop_type to server-task

* server : headers cleanup

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-24 14:41:53 +01:00
bluebread 81533e494e mtmd: fix danling pointer 2025-11-24 09:02:03 +00:00
bluebread 40e7e6e706 mtmd: quick fix token order 2025-11-24 08:16:32 +00:00
Saba Fallah 206f8abc3c - dynamic resizing
- changes are concerning PR https://github.com/sfallah/llama.cpp/pull/4
2025-11-23 20:27:02 +01:00
Pascal 0c7220db56
webui: minor settings reorganization and add disable autoscroll option (#17452)
* webui: added a dedicated 'Display' settings section that groups visualization options

* webui: added a Display setting to toggle automatic chat scrolling

* chore: update webui build output
2025-11-23 18:42:00 +01:00
Saba Fallah 6dfda99c69
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr 2025-11-23 12:29:37 +01:00
bluebread 3f71188303 mtmd: correct token order 2025-11-23 09:22:00 +00:00
Saba Fallah 4cfa15fcd7 - image encoding debugged
- issues fixed mainly related wrong config like n_patches etc.
- configs need to be corrected in the converter
2025-11-22 16:57:34 +01:00
bluebread ee8a1488f9 mtmd: add native resolution support 2025-11-22 15:48:13 +00:00
Saba Fallah 3fcfc3ace9
Merge pull request #3 from bluebread/sf/deepseek-ocr
Fixed get_rel_pos & add_rel_pos_inplace operator
2025-11-22 09:33:15 +01:00
bluebread f8f66a151b Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr 2025-11-22 02:22:48 +00:00
bluebread effe66958e mtmd: minor changed 2025-11-22 02:09:37 +00:00
Saba Fallah 86f111f8b7 image encoding technically works but the output can't be checked singe image decoding fails 2025-11-21 20:42:14 +01:00
bluebread 7b8d735c90 mtmd: fixed the wrong scaler for get_rel_pos 2025-11-21 18:04:01 +00:00
bluebread 7e9fbeccc5 mtmd: fix get_rel_pos 2025-11-21 17:12:12 +00:00
bluebread 5e6cf3c6a8 Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into sf/deepseek-ocr 2025-11-21 15:36:45 +00:00
bluebread 8bce66d5f2 clip: fixed warnings 2025-11-21 15:28:37 +00:00
Saba Fallah 68b206b65c sam implementation without using CPU only ops 2025-11-21 15:29:39 +01:00
Aleksander Grygier 4c91f2633f
Improved file naming & structure for UI components (#17405)
* refactor: Component iles naming & structure

* chore: update webui build output

* refactor: Dialog titles + components namig

* chore: update webui build output

* refactor: Imports

* chore: update webui build output
2025-11-20 14:07:31 +01:00
Georgi Gerganov 196f5083ef
common : more accurate sampling timing (#17382)
* common : more accurate sampling timing

* eval-callback : minor fixes

* cont : add time_meas impl

* cont : fix log msg [no ci]

* cont : fix multiple definitions of time_meas

* llama-cli : exclude chat template init from time measurement

* cont : print percentage of unaccounted time

* cont : do not reset timings
2025-11-20 13:40:10 +02:00
Saba Fallah 88032f46b1 window partitioning using standard ggml ops 2025-11-20 10:07:54 +01:00
Aleksander Grygier 99c53d6558
webui: Add a "Continue" Action for Assistant Message (#16971)
* feat: Add "Continue" action for assistant messages

* feat: Continuation logic & prompt improvements

* chore: update webui build output

* feat: Improve logic for continuing the assistant message

* chore: update webui build output

* chore: Linting

* chore: update webui build output

* fix: Remove synthetic prompt logic, use the prefill feature by sending the conversation payload ending with assistant message

* chore: update webui build output

* feat: Enable "Continue" button based on config & non-reasoning model type

* chore: update webui build output

* chore: Update packages with `npm audit fix`

* fix: Remove redundant error

* chore: update webui build output

* chore: Update `.gitignore`

* fix: Add missing change

* feat: Add auto-resizing for Edit Assistant/User Message textareas

* chore: update webui build output
2025-11-19 14:39:50 +01:00
Saba Fallah 89afda8da9 visual_model warmup (technically) works 2025-11-18 10:26:32 +01:00
Saba Fallah 63a042f21e concat image_newline and image_seperator tokens 2025-11-18 09:43:11 +01:00
o7si 97cb3fd5ae
fix: resolve undefined variable 'svr' compilation error (#17348) 2025-11-18 10:10:47 +02:00
Saba Fallah 331cea8f8e corrected combining of image encoders' results 2025-11-18 05:59:37 +01:00
Xuan-Son Nguyen 0de8878c96
server: split HTTP into its own interface (#17216)
* server: split HTTP into its own interface

* move server-http and httplib to its own file

* add the remaining endpoints

* fix exception/error handling

* renaming

* missing header

* fix missing windows header

* fix error responses from http layer

* fix slot save/restore handler

* fix case where only one stream chunk is returned

* add NOMINMAX

* do not call sink.write on empty data

* use safe_json_to_str for SSE

* clean up

* add some comments

* improve usage of next()

* bring back the "server is listening on" message

* more generic handler

* add req.headers

* move the chat template print to init()

* add req.path

* cont : minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-17 22:05:44 +01:00
Saba Fallah 8b3d319c03 clip-vit: corrected cls_embd concat 2025-11-17 20:57:51 +01:00
Saba Fallah cec9a5c6e0 sam erroneous return corrected 2025-11-17 18:59:40 +01:00
Saba Fallah 790bbb97d8 sam warmup working 2025-11-17 15:27:00 +01:00
Saba Fallah 97e0907c5b loading LM
testing Vision model loading
2025-11-17 11:07:33 +01:00
Georgi Gerganov 5b2093becc
server : handle context overflow during decode (#17267)
* server : handle context overflow during decode

* server : minor refactor
2025-11-16 09:23:37 +02:00
Aleksander Grygier 22e1ce2f81
webui: Fix clickability around chat processing statistics UI (#17278)
* fix: Better pointer events handling in chat processing info elements

* chore: update webui build output
2025-11-15 22:41:41 +01:00