llama.cpp/common
Tarek Dakhran c5897995a7
mtmd : chat : Fix extra \n between text and media marker (#19595)
* mtmd : chat : Fix extra \n between text and media marker

Thanks to @tugot17 for detecting and reporting the issue.

For vision models (e.g. LFM2.5-VL-1.6B and Qwen/Qwen3-VL-4B-Instruct), `llama-mtmd-cli` produces output identical to the HF implementation.

However, `llama-server` doesn't. I traced it down to an extra newline
inserted after `<__media__>`.

This happens in `to_json_oaicompat`, which treats media markers as text
and joins all parts with a `\n` separator.

This PR introduces a new type, `media_marker`, and uses it for media markers.
Extra logic is added to prevent the insertion of newlines before and after
media markers.
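
The joining rule described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `common/chat.cpp`: the `content_part` struct and the `join_parts` helper are hypothetical names, and the sketch assumes only two part types, `text` and `media_marker`.

```cpp
#include <string>
#include <vector>

// Hypothetical part representation; the real structures in common/chat.h may differ.
struct content_part {
    std::string type; // assumed values: "text" or "media_marker"
    std::string text; // text content, or the marker string such as "<__media__>"
};

// Concatenate parts, inserting "\n" only between two adjacent text parts.
// No separator is emitted before or after a media marker.
static std::string join_parts(const std::vector<content_part> & parts) {
    std::string out;
    bool prev_is_text = false;
    for (const auto & part : parts) {
        const bool is_text = part.type == "text";
        if (is_text && prev_is_text) {
            out += "\n";
        }
        out += part.text;
        prev_is_text = is_text;
    }
    return out;
}
```

Under these assumptions, two consecutive text parts are still separated by a newline, while text next to a media marker is concatenated directly, so the marker introduces no extra tokens.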

With this change, the number of input tokens is identical to the HF
implementation and, as a result, the output is also identical.

I explored other ways to address the issue:
* completely remove `\n` between text parts in `to_json_oaicompat`
* merge text messages in server-common.cpp before sending them to `to_json_oaicompat`

Please propose alternative ways of fixing this issue.

* Refactor to use explicit per-type ifs

* Update common/chat.cpp

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>

* Update common_chat_templates_apply_legacy

---------

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
2026-02-19 12:18:57 +01:00
..
jinja Add Jinja support for "indent" string filter (#19529) 2026-02-19 00:25:52 +01:00
CMakeLists.txt build : cleanup library linking logic (#19665) 2026-02-17 08:36:45 +01:00
arg.cpp args : add -kvu to llama-parallel (#19577) 2026-02-12 21:52:41 +02:00
arg.h vendor : update cpp-httplib to 0.30.0 (#18660) 2026-01-08 13:53:54 +01:00
base64.hpp llava : expose as a shared library for downstream projects (#3613) 2023-11-07 00:36:23 +03:00
build-info.cpp.in cmake: Add ability to pass in LLAMA_BUILD_NUMBER/COMMIT (#14167) 2025-06-13 10:38:52 +02:00
chat-parser-xml-toolcall.cpp Fix Kimi-K2 tool-call parsing issues (#17376) 2025-12-08 14:32:04 +01:00
chat-parser-xml-toolcall.h Fix Kimi-K2 tool-call parsing issues (#17376) 2025-12-08 14:32:04 +01:00
chat-parser.cpp server : support preserving reasoning_content in assistant message (#18994) 2026-01-22 21:30:06 +01:00
chat-parser.h cli : fix reasoning responses in CLI (#18961) 2026-01-20 18:23:25 +01:00
chat-peg-parser.cpp common : add nemotron 3 parsing (#18077) 2025-12-16 04:05:23 -06:00
chat-peg-parser.h common : introduce composable PEG parser combinators for chat parsing (#17136) 2025-12-03 12:45:32 +02:00
chat.cpp mtmd : chat : Fix extra \n between text and media marker (#19595) 2026-02-19 12:18:57 +01:00
chat.h chat: fix case where template accepts type content only (#19419) 2026-02-09 22:14:12 +01:00
common.cpp common : make small string helpers as inline functions (#19693) 2026-02-18 08:03:01 +01:00
common.h common : make small string helpers as inline functions (#19693) 2026-02-18 08:03:01 +01:00
console.cpp cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
console.h cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
debug.cpp debug: make common_debug_print_tensor readable (#19331) 2026-02-04 17:55:31 +01:00
debug.h Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) 2026-01-14 20:29:35 +01:00
download.cpp build : remove LLAMA_HTTPLIB option (#19623) 2026-02-15 15:38:50 +01:00
download.h preset: allow named remote preset (#18728) 2026-01-10 15:12:29 +01:00
http.h common : clarify HTTPS build options in error message (#19103) 2026-01-27 06:16:00 +01:00
json-partial.cpp common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932) 2025-11-18 18:54:15 +01:00
json-partial.h cli : fix reasoning responses in CLI (#18961) 2026-01-20 18:23:25 +01:00
json-schema-to-grammar.cpp common : add nemotron 3 parsing (#18077) 2025-12-16 04:05:23 -06:00
json-schema-to-grammar.h common : add nemotron 3 parsing (#18077) 2025-12-16 04:05:23 -06:00
llguidance.cpp sampling : add support for backend sampling (#17004) 2026-01-04 22:22:16 +02:00
log.cpp cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
log.h cli: new CLI experience (#17824) 2025-12-10 15:28:59 +01:00
ngram-cache.cpp spec : add self‑speculative decoding (no draft model required) + refactor (#18471) 2026-01-28 19:42:42 +02:00
ngram-cache.h spec : add self‑speculative decoding (no draft model required) + refactor (#18471) 2026-01-28 19:42:42 +02:00
ngram-map.cpp llama : correct typos 'occured' and 'occurences' (#19414) 2026-02-11 07:05:31 +01:00
ngram-map.h llama : correct typos 'occured' and 'occurences' (#19414) 2026-02-11 07:05:31 +01:00
ngram-mod.cpp spec : add ngram-mod (#19164) 2026-01-30 18:21:48 +02:00
ngram-mod.h ngram-mod : fix build [no ci] (#19216) 2026-01-30 21:27:27 +02:00
peg-parser.cpp common : add nemotron 3 parsing (#18077) 2025-12-16 04:05:23 -06:00
peg-parser.h common : introduce composable PEG parser combinators for chat parsing (#17136) 2025-12-03 12:45:32 +02:00
preset.cpp preset: allow named remote preset (#18728) 2026-01-10 15:12:29 +01:00
preset.h common: support remote preset (#18520) 2026-01-08 22:35:40 +01:00
regex-partial.cpp common/grammar : replace problematic backtracking regex `[\s\S]*` (#18342) 2026-01-03 16:02:43 -06:00
regex-partial.h `common`: add partial regex support (#12808) 2025-05-14 19:50:57 +01:00
sampling.cpp llama : add adaptive-p sampler (#17927) 2026-01-15 19:16:29 +02:00
sampling.h sampling : add support for backend sampling (#17004) 2026-01-04 22:22:16 +02:00
speculative.cpp spec : remove check rate (#19377) 2026-02-09 15:30:50 +02:00
speculative.h common : add common_speculative_is_compat() (#19270) 2026-02-06 16:47:22 +02:00
unicode.cpp common : introduce composable PEG parser combinators for chat parsing (#17136) 2025-12-03 12:45:32 +02:00
unicode.h common : introduce composable PEG parser combinators for chat parsing (#17136) 2025-12-03 12:45:32 +02:00