Piotr Wilkin
3605e78569
Refactor into class-based approach
2026-02-14 00:17:43 +01:00
Piotr Wilkin
6415d0f03f
Add TODO
2026-02-13 14:42:26 +01:00
Piotr Wilkin
24cc1bcd6d
Clean algorithm for calculate_diff_split; fix buggy expectations
2026-02-13 03:17:20 +01:00
Piotr Wilkin
e772822011
Whitespace
2026-02-13 00:55:56 +01:00
Piotr Wilkin
28fcef67c0
-> Refactor autoparser analyzer structure
-> Fix content truncation
-> Fix errors in capability detection due to non-empty assistant message
-> Add missing debug prints for Jinja
2026-02-13 00:55:35 +01:00
Piotr Wilkin
822fd2bee9
Whoops
2026-02-12 17:22:59 +01:00
Piotr Wilkin
3096ecaa95
One more crazy spacing out
2026-02-11 23:44:52 +01:00
Piotr Wilkin
e40d4cd706
Get rid of some crazy formatting
2026-02-11 22:53:02 +01:00
Piotr Wilkin
56ca124850
Document helpers
2026-02-11 22:42:16 +01:00
Piotr Wilkin
d69ec41ee0
Post-merge adapt
2026-02-11 13:47:30 +01:00
Piotr Wilkin
bd549b3b37
Fix case with object inside object, refactor long methods.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
2081e9b056
Fix number partial parsing issue
2026-02-11 13:47:29 +01:00
Piotr Wilkin
b260de1d86
More edge cases
2026-02-11 13:47:29 +01:00
Piotr Wilkin
60717b3e5a
Fix pesky issue on optional trailing arguments in function calls for TAGGED format
2026-02-11 13:47:29 +01:00
Piotr Wilkin
c2f6fc3a17
Remove [[noreturn]] as it causes compilation problems on Mac.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
f71ae707ba
Fix minor regressions, add [[noreturn]] attrib
2026-02-11 13:47:29 +01:00
Piotr Wilkin
09b447a487
Fix incorrect coercion of strings to non-string types during parsing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
a01e15280a
Feeding the hungry editor checker god.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
384cafc98b
Fix error in argument processing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
3770566c45
Revert bad change, fix some templates and most tests
2026-02-11 13:47:29 +01:00
Piotr Wilkin
9ba9a94819
More robust reasoning detection
2026-02-11 13:47:29 +01:00
Piotr Wilkin
80b7e161ff
Fix reasoning detection
2026-02-11 13:47:29 +01:00
Piotr Wilkin
b0853baca7
Quick vibe-coded fix for proper object printing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
1662fa5bea
ANOTHER GIANT POST-FIXUP SQUISH
2026-02-11 13:47:29 +01:00
Piotr Wilkin
7e6f75a414
THE GIANT AUTOPARSER SQUISH
2026-02-11 13:47:29 +01:00
Piotr Wilkin
571805b348
Make call IDs nine-character
2026-02-11 13:47:29 +01:00
Piotr Wilkin
93f0cc05de
Fix sanitizer warnings
2026-02-11 13:47:29 +01:00
Piotr Wilkin
96316496d5
Fix bad typo
2026-02-11 13:47:29 +01:00
Piotr Wilkin
9a3ac05157
Add workaround for templates requiring non-null content
2026-02-11 13:47:29 +01:00
Adrien Gallouët
0c1f39a9ae
common : improve download error reporting ( #19491 )
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-02-11 09:27:55 +01:00
thecaptain789
8ee538ce73
llama : correct typos 'occured' and 'occurences' ( #19414 )
Co-authored-by: thecaptain789 <thecaptain789@users.noreply.github.com>
2026-02-11 07:05:31 +01:00
Xuan-Son Nguyen
98e57ca422
chat: fix case where template accepts type content only ( #19419 )
* chat: fix case where template accepts type content only
* rm stray log
* reuse render_message_to_json
2026-02-09 22:14:12 +01:00
Sascha Rogmann
292f6908cd
spec : remove check rate ( #19377 )
* spec: remove parameter spec-ngram-check-rate
* spec : renamed statistics vars
* spec : add n_call_begin, n_call_accept
* spec : don't enable key-map-stats
2026-02-09 15:30:50 +02:00
Georgi Gerganov
dfde5993ea
common : add common_speculative_is_compat() ( #19270 )
* llama : add llama_memory_can_rm_suffix()
* Revert "llama : add llama_memory_can_rm_suffix()"
This reverts commit d30e59b62a .
* spec : check if the target context is compatible for spec decoding
2026-02-06 16:47:22 +02:00
Xuan-Son Nguyen
e0c93af2a0
debug: make common_debug_print_tensor readable ( #19331 )
* debug: make common_debug_print_tensor readable
* editorconfig
2026-02-04 17:55:31 +01:00
Georgi Gerganov
d838c22bb3
spec : fix the check-rate logic of ngram-simple ( #19261 )
* spec : fix the check-rate logic of ngram-simple
* cont : refactor + fix checks
2026-02-04 10:39:53 +02:00
Georgi Gerganov
aeb827a3cc
spec : simplify time measurement using common_time_meas ( #19262 )
2026-02-03 08:20:15 +02:00
Sid Mohan
0dfcd3b607
jinja : add missing 'in' test to template engine ( #19004 ) ( #19239 )
* jinja : add missing 'in' test to template engine (#19004 )
The jinja template parser was missing the 'in' test from
global_builtins(), causing templates using reject("in", ...),
select("in", ...), or 'x is in(y)' to fail with
"selectattr: unknown test 'in'".
This broke tool-calling for Qwen3-Coder and any other model
whose chat template uses the 'in' test.
Added test_is_in supporting array, string, and object containment
checks, mirroring the existing 'in' operator logic in runtime.cpp.
Includes test cases for all three containment types plus
reject/select filter usage.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* reuse test_is_in in binary op
---------
Co-authored-by: Sid Mohan <sidmohan0@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-02-02 21:00:55 +01:00
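The containment check described in the commit above (0dfcd3b607) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `Array`, `Object`, `Haystack`, `is_in`, and `reject_in` are hypothetical names, not the engine's real types or its `test_is_in`, and the semantics (element match for arrays, substring match for strings, key match for objects) follow the usual Jinja/Python meaning of `in` rather than anything confirmed by the commit.

```cpp
#include <algorithm>
#include <iterator>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical container types; the real engine has its own value classes.
using Array    = std::vector<std::string>;
using Object   = std::map<std::string, std::string>;
using Haystack = std::variant<Array, std::string, Object>;

// Returns true when `needle` is an array element, a substring, or an object key.
bool is_in(const std::string& needle, const Haystack& haystack) {
    if (const auto* arr = std::get_if<Array>(&haystack)) {
        return std::find(arr->begin(), arr->end(), needle) != arr->end();
    }
    if (const auto* str = std::get_if<std::string>(&haystack)) {
        return str->find(needle) != std::string::npos;
    }
    const auto& obj = std::get<Object>(haystack);
    return obj.count(needle) > 0;
}

// reject("in", ...)-style filter: keep only the items that are NOT in `haystack`.
Array reject_in(const Array& items, const Haystack& haystack) {
    Array kept;
    std::copy_if(items.begin(), items.end(), std::back_inserter(kept),
                 [&](const std::string& item) { return !is_in(item, haystack); });
    return kept;
}
```

Routing both the test and the filter through one function mirrors the "reuse test_is_in in binary op" follow-up: a single containment routine keeps the operator and the test from drifting apart.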
Sascha Rogmann
b4d05a3d2f
spec : various improvements to ngram-map + docs ( #19253 )
* spec: ngram-map and reasoning chats
* spec: add t_begin and t_accept
* ngram-map : add internal hash map
* docs : update ngram-map, add ngram-mod
* docs : fix ngram-map-k
* docs : differences between implementations
2026-02-02 08:26:58 +02:00
Georgi Gerganov
4927795810
ngram-mod : fix build [no ci] ( #19216 )
2026-01-30 21:27:27 +02:00
Georgi Gerganov
dabaa2e77a
spec : add ngram-mod ( #19164 )
* spec : add ngram-mod
* cont : simplify + keep track of occupancy
* cont : cleanup
* cont : move initialization to common/speculative
* cont : cleanup
* cont : cleanup
* cont : fix
2026-01-30 18:21:48 +02:00
Marcello Seri
2e916f996a
jinja : add unordered_map include to value.h [no ci] ( #19205 )
On macOS Sequoia 15.7.3, x86_64, the build has recently started failing with
```
In file included from .../code/cpp/llama.cpp/common/jinja/string.cpp:2:
.../code/cpp/llama.cpp/common/./jinja/value.h:478:10: error: no template named 'unordered_map' in namespace 'std'
478 | std::unordered_map<value, value, value_hasher, value_equivalence> unordered;
| ~~~~~^
In file included from .../code/cpp/llama.cpp/common/jinja/caps.cpp:1:
.../code/cpp/llama.cpp/common/jinja/value.h:478:10: error: no template named 'unordered_map' in namespace 'std'
478 | std::unordered_map<value, value, value_hasher, value_equivalence> unordered;
| ~~~~~^
In file included from .../code/cpp/llama.cpp/common/jinja/value.cpp:1:
In file included from .../code/cpp/llama.cpp/common/jinja/runtime.h:4:
.../code/cpp/llama.cpp/common/jinja/value.h:478:10: error: no template named 'unordered_map' in namespace 'std'
478 | std::unordered_map<value, value, value_hasher, value_equivalence> unordered;
[...]
```
After a bit of digging to make sure all the appropriate flags were used, I noticed that the necessary header was not included. This fixes the build for me and should not negatively affect other builds that for some reason were already succeeding.
2026-01-30 16:09:44 +01:00
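The fix in the commit above (2e916f996a) amounts to including the header that declares `std::unordered_map` directly, rather than relying on another standard header to pull it in. A minimal sketch of the pattern follows. The `value`, `value_hasher`, and `value_equivalence` definitions are simplified stand-ins that only borrow the names from the error message, and `contains_key` is a hypothetical helper.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>  // included explicitly; transitive includes differ between standard libraries

// Simplified stand-in for the template engine's value type (assumption).
struct value {
    std::string repr;
};

// Hash functor: hashes the stand-in's string representation.
struct value_hasher {
    std::size_t operator()(const value& v) const {
        return std::hash<std::string>{}(v.repr);
    }
};

// Equality functor: two values are equivalent when their representations match.
struct value_equivalence {
    bool operator()(const value& a, const value& b) const {
        return a.repr == b.repr;
    }
};

// Same shape as the member named in the compiler error.
using value_map = std::unordered_map<value, value, value_hasher, value_equivalence>;

// Hypothetical helper showing the map in use.
bool contains_key(const value_map& m, const value& key) {
    return m.find(key) != m.end();
}
```

Whether a missing include fails the build depends on which other headers happen to include `<unordered_map>` internally, which is why the same code can compile on one toolchain and fail on another.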
Aldehir Rojas
7b7ae857f6
chat : add parsing for solar-open-100b ( #18540 )
* chat : add parsing for solar-open-100b
* add comments to rules
* cont : make assistant start optional
* cont : remove assistant start prefix altogether
---------
Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
2026-01-29 16:06:15 +01:00
Sigbjørn Skjæret
b45ef2702c
jinja : do not pass empty tools and add some none filters ( #19176 )
2026-01-29 14:06:54 +01:00
Georgi Gerganov
eed25bc6b0
arg : add -kvu to llama-batched-bench ( #19172 )
2026-01-29 08:50:47 +02:00
Sascha Rogmann
72d3b1898a
spec : add self‑speculative decoding (no draft model required) + refactor ( #18471 )
* server: introduce self-speculative decoding
* server: moved self-call into speculative.cpp
* can_speculate() includes self-speculation
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
* server: can_speculate() tests self-spec
* server: replace can_speculate() with slot.can_speculate()
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* common: use %zu format specifier for size_t in logging
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* server: can_speculate() requires a task instance
* common: ngram map, config self-speculative decoding
* common: add enum common_speculative_type
* common: add vector of speculative states
* common: add option --spec-draftless
* server: cleanup (remove slot.batch_spec, rename)
* common: moved self-spec impl to ngram-map
* common: cleanup (use common_speculative_state_draft)
* spec : refactor
* cont : naming
* spec: remove --spec-config
* doc: (draftless) speculative decoding
* common: print performance in spec decoding
* minor : cleanup
* common : better names
* minor : cleanup + fix build
* minor: comments
* CODEOWNERS: add common/ngram-map.* (#18471 )
* common : rename speculative.draftless_type -> speculative.type
* ngram-map : fix uninitialized values
* ngram-map : take into account the input can become shorter
* ngram-map : revert len check for now
* arg : change `--spec-draftless` -> `--spec-type`
* spec : add common_speculative_state::accept()
* spec : refactor + add common_speculative_begin()
* spec : fix begin() call with mtmd
* spec : additional refactor + remove common_speculative_params
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-28 19:42:42 +02:00
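One item in the commit above (72d3b1898a) switches logging to the `%zu` format specifier for `size_t`. A minimal sketch of that convention, using plain `std::snprintf` rather than the project's logging macros, might look as follows. `format_count` and the label passed to it are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a labelled count for a log line.
std::string format_count(const char* label, std::size_t n) {
    char buf[128];
    // %zu is the printf conversion for size_t; %d or %lu can mismatch
    // the argument width on some platforms.
    std::snprintf(buf, sizeof(buf), "%s = %zu", label, n);
    return std::string(buf);
}
```

For example, `format_count("n_draft", 42)` yields the string `n_draft = 42`.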
Sigbjørn Skjæret
60368e1d73
jinja : undefined should be treated as sequence/iterable (return string/array) by filters/tests ( #19147 )
* undefined is treated as iterable (string/array) by filters
`tojson` is not a supported `undefined` filter
* add tests
* add sequence and iterable tests
keep it DRY and fix some types
2026-01-28 14:40:29 +01:00
Georgi Gerganov
631cbfcc7a
cuda : fix "V is K view" check for non-unified KV cache ( #19145 )
2026-01-28 09:15:27 +02:00
Georgi Gerganov
c5c64f72ac
llama : disable Direct IO by default ( #19109 )
* llama : disable Direct IO by default
* cont : override mmap if supported
2026-01-28 09:11:13 +02:00
Sigbjørn Skjæret
2b4cbd2834
jinja : implement mixed type object keys ( #18955 )
* implement mixed type object keys
* add tests
* refactor
* minor fixes
* massive refactor
* add more tests
* forgotten tuples
* fix array/object is_hashable
* correct (albeit broken) jinja responses
verified with transformers
* improved hashing and equality
* refactor hash function
* more exhaustive test case
* clean up
* cont
* cont (2)
* missing cstring
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-01-27 19:50:42 +01:00