755 Commits

Author SHA1 Message Date
Piotr Wilkin 6415d0f03f Add TODO 2026-02-13 14:42:26 +01:00
Piotr Wilkin 24cc1bcd6d Clean algorithm for calculate_diff_split; fix buggy expectations 2026-02-13 03:17:20 +01:00
Piotr Wilkin e772822011 Whitespace 2026-02-13 00:55:56 +01:00
Piotr Wilkin 28fcef67c0 -> Refactor autoparser analyzer structure
-> Fix content truncation
-> Fix errors in capability detection due to non-empty assistant message
-> Add missing debug prints for Jinja
2026-02-13 00:55:35 +01:00
Piotr Wilkin 822fd2bee9 Whoops 2026-02-12 17:22:59 +01:00
Piotr Wilkin 3096ecaa95 One more crazy spacing out 2026-02-11 23:44:52 +01:00
Piotr Wilkin e40d4cd706 Get rid of some crazy formatting 2026-02-11 22:53:02 +01:00
Piotr Wilkin 56ca124850 Document helpers 2026-02-11 22:42:16 +01:00
Piotr Wilkin d69ec41ee0 Post-merge adapt 2026-02-11 13:47:30 +01:00
Piotr Wilkin bd549b3b37 Fix case with object inside object, refactor long methods. 2026-02-11 13:47:29 +01:00
Piotr Wilkin 2081e9b056 Fix number partial parsing issue 2026-02-11 13:47:29 +01:00
Piotr Wilkin b260de1d86 More edge cases 2026-02-11 13:47:29 +01:00
Piotr Wilkin 60717b3e5a Fix pesky issue on optional trailing arguments in function calls for TAGGED format 2026-02-11 13:47:29 +01:00
Piotr Wilkin c2f6fc3a17 Remove [[noreturn]] as it causes compilation problems on Mac. 2026-02-11 13:47:29 +01:00
Piotr Wilkin f71ae707ba Fix minor regressions, add [[noreturn]] attrib 2026-02-11 13:47:29 +01:00
Piotr Wilkin 09b447a487 Fix incorrect coercion of strings to non-string types during parsing 2026-02-11 13:47:29 +01:00
Piotr Wilkin a01e15280a Feeding the hungry editor checker god. 2026-02-11 13:47:29 +01:00
Piotr Wilkin 384cafc98b Fix error in argument processing 2026-02-11 13:47:29 +01:00
Piotr Wilkin 3770566c45 Revert bad change, fix some templates and most tests 2026-02-11 13:47:29 +01:00
Piotr Wilkin 9ba9a94819 More robust reasoning detection 2026-02-11 13:47:29 +01:00
Piotr Wilkin 80b7e161ff Fix reasoning detection 2026-02-11 13:47:29 +01:00
Piotr Wilkin b0853baca7 Quick vibe-coded fix for proper object printing 2026-02-11 13:47:29 +01:00
Piotr Wilkin 1662fa5bea ANOTHER GIANT POST-FIXUP SQUISH 2026-02-11 13:47:29 +01:00
Piotr Wilkin 7e6f75a414 THE GIANT AUTOPARSER SQUISH 2026-02-11 13:47:29 +01:00
Piotr Wilkin 571805b348 Make call IDs nine-character 2026-02-11 13:47:29 +01:00
Piotr Wilkin 93f0cc05de Fix sanitizer warnings 2026-02-11 13:47:29 +01:00
Piotr Wilkin 96316496d5 Fix bad typo 2026-02-11 13:47:29 +01:00
Piotr Wilkin 9a3ac05157 Add workaround for templates requiring non-null content 2026-02-11 13:47:29 +01:00
Adrien Gallouët 0c1f39a9ae
common : improve download error reporting (#19491)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-02-11 09:27:55 +01:00
thecaptain789 8ee538ce73
llama : correct typos 'occured' and 'occurences' (#19414)
Co-authored-by: thecaptain789 <thecaptain789@users.noreply.github.com>
2026-02-11 07:05:31 +01:00
Xuan-Son Nguyen 98e57ca422
chat: fix case where template accepts type content only (#19419)
* chat: fix case where template accepts type content only

* rm stray log

* reuse render_message_to_json
2026-02-09 22:14:12 +01:00
Sascha Rogmann 292f6908cd
spec : remove check rate (#19377)
* spec: remove parameter spec-ngram-check-rate

* spec : renamed statistics vars

* spec : add n_call_begin, n_call_accept

* spec : don't enable key-map-stats
2026-02-09 15:30:50 +02:00
Georgi Gerganov dfde5993ea
common : add common_speculative_is_compat() (#19270)
* llama : add llama_memory_can_rm_suffix()

* Revert "llama : add llama_memory_can_rm_suffix()"

This reverts commit d30e59b62a.

* spec : check if the target context is compatible for spec decoding
2026-02-06 16:47:22 +02:00
Xuan-Son Nguyen e0c93af2a0
debug: make common_debug_print_tensor readable (#19331)
* debug: make common_debug_print_tensor readable

* editorconfig
2026-02-04 17:55:31 +01:00
Georgi Gerganov d838c22bb3
spec : fix the check-rate logic of ngram-simple (#19261)
* spec : fix the check-rate logic of ngram-simple

* cont : refactor + fix checks
2026-02-04 10:39:53 +02:00
Georgi Gerganov aeb827a3cc
spec : simplify time measurement using common_time_meas (#19262) 2026-02-03 08:20:15 +02:00
Sid Mohan 0dfcd3b607
jinja : add missing 'in' test to template engine (#19004) (#19239)
* jinja : add missing 'in' test to template engine (#19004)

The jinja template parser was missing the 'in' test from
global_builtins(), causing templates using reject("in", ...),
select("in", ...), or 'x is in(y)' to fail with
"selectattr: unknown test 'in'".

This broke tool-calling for Qwen3-Coder and any other model
whose chat template uses the 'in' test.

Added test_is_in supporting array, string, and object containment
checks, mirroring the existing 'in' operator logic in runtime.cpp.

Includes test cases for all three containment types plus
reject/select filter usage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* reuse test_is_in in binary op

---------

Co-authored-by: Sid Mohan <sidmohan0@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-02-02 21:00:55 +01:00
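The three containment checks described in the `jinja : add missing 'in' test` commit above (array, string, and object) can be illustrated with a minimal sketch. Every name here (`is_in`, `reject_in`) is hypothetical, and plain standard-library containers stand in for the template engine's value types; this is not the `test_is_in` added in that commit, only an illustration of the semantics the message describes.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical helpers for illustration only; NOT the llama.cpp jinja code.

// Array containment: true if any element equals the needle.
inline bool is_in(const std::string & needle, const std::vector<std::string> & haystack) {
    return std::find(haystack.begin(), haystack.end(), needle) != haystack.end();
}

// String containment: true if the needle occurs as a substring.
inline bool is_in(const std::string & needle, const std::string & haystack) {
    return haystack.find(needle) != std::string::npos;
}

// Object containment: true if the needle is one of the keys (values are ignored).
inline bool is_in(const std::string & needle, const std::map<std::string, std::string> & haystack) {
    return haystack.count(needle) > 0;
}

// Rough analogue of a reject("in", excluded) filter: keep the items that are
// not contained in the excluded list.
inline std::vector<std::string> reject_in(const std::vector<std::string> & items,
                                          const std::vector<std::string> & excluded) {
    std::vector<std::string> kept;
    for (const auto & item : items) {
        if (!is_in(item, excluded)) {
            kept.push_back(item);
        }
    }
    return kept;
}
```

Sharing one containment routine between the test and the filters, as the follow-up bullet "reuse test_is_in in binary op" suggests for the operator, keeps the two code paths from drifting apart.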
Sascha Rogmann b4d05a3d2f
spec : various improvements to ngram-map + docs (#19253)
* spec: ngram-map and reasoning chats

* spec: add t_begin and t_accept

* ngram-map : add internal hash map

* docs : update ngram-map, add ngram-mod

* docs : fix ngram-map-k

* docs : differences between implementations
2026-02-02 08:26:58 +02:00
Georgi Gerganov 4927795810
ngram-mod : fix build [no ci] (#19216) 2026-01-30 21:27:27 +02:00
Georgi Gerganov dabaa2e77a
spec : add ngram-mod (#19164)
* spec : add ngram-mod

* cont : simplify + keep track of occupancy

* cont : cleanup

* cont : move initialization to common/speculative

* cont : cleanup

* cont : cleanup

* cont : fix
2026-01-30 18:21:48 +02:00
Marcello Seri 2e916f996a
jinja : add unordered_map include to value.h [no ci] (#19205)
On macOS Sequoia 15.7.3 (x86_64), the build has recently started failing with:
```
In file included from .../code/cpp/llama.cpp/common/jinja/string.cpp:2:
.../code/cpp/llama.cpp/common/./jinja/value.h:478:10: error: no template named 'unordered_map' in namespace 'std'
  478 |     std::unordered_map<value, value, value_hasher, value_equivalence> unordered;
      |     ~~~~~^
In file included from .../code/cpp/llama.cpp/common/jinja/caps.cpp:1:
.../code/cpp/llama.cpp/common/jinja/value.h:478:10: error: no template named 'unordered_map' in namespace 'std'
  478 |     std::unordered_map<value, value, value_hasher, value_equivalence> unordered;
      |     ~~~~~^
In file included from .../code/cpp/llama.cpp/common/jinja/value.cpp:1:
In file included from .../code/cpp/llama.cpp/common/jinja/runtime.h:4:
.../code/cpp/llama.cpp/common/jinja/value.h:478:10: error: no template named 'unordered_map' in namespace 'std'
  478 |     std::unordered_map<value, value, value_hasher, value_equivalence> unordered;
[...]
```

After a bit of digging to make sure all the appropriate flags were used, I noticed that the necessary header was not included. This fixes the build for me and should not negatively affect other builds that, for some reason, were already succeeding.
2026-01-30 16:09:44 +01:00
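The failure reported in the `jinja : add unordered_map include to value.h` commit above is the usual result of relying on a transitive include: `std::unordered_map` is only guaranteed to be declared once `<unordered_map>` itself is included. A minimal sketch of a self-sufficient header follows; the names `key_hasher`, `key_equal`, `object_store`, and `store_size` are hypothetical stand-ins, not the types from `value.h`.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>  // declares std::unordered_map; do not rely on another header pulling it in

// Hypothetical hasher and equality functors, standing in for the custom ones
// named in the error message.
struct key_hasher {
    std::size_t operator()(const std::string & key) const {
        return std::hash<std::string>{}(key);
    }
};

struct key_equal {
    bool operator()(const std::string & a, const std::string & b) const {
        return a == b;
    }
};

// A struct with an unordered_map member using custom hash and equality types.
struct object_store {
    std::unordered_map<std::string, int, key_hasher, key_equal> unordered;
};

// Number of distinct keys currently stored.
inline std::size_t store_size(const object_store & store) {
    return store.unordered.size();
}
```

Whether a missing include is noticed depends on which other standard headers happen to include it on a given platform, which would explain why the same code built on some toolchains and not on others.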
Aldehir Rojas 7b7ae857f6
chat : add parsing for solar-open-100b (#18540)
* chat : add parsing for solar-open-100b

* add comments to rules

* cont : make assistant start optional

* cont : remove assistant start prefix altogether

---------

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
2026-01-29 16:06:15 +01:00
Sigbjørn Skjæret b45ef2702c
jinja : do not pass empty tools and add some none filters (#19176) 2026-01-29 14:06:54 +01:00
Georgi Gerganov eed25bc6b0
arg : add -kvu to llama-batched-bench (#19172) 2026-01-29 08:50:47 +02:00
Sascha Rogmann 72d3b1898a
spec : add self‑speculative decoding (no draft model required) + refactor (#18471)
* server: introduce self-speculative decoding

* server: moved self-call into speculative.cpp

* can_speculate() includes self-speculation

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* server: can_speculate() tests self-spec

* server: replace can_speculate() with slot.can_speculate()

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* common: use %zu format specifier for size_t in logging

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* server: can_speculate() requires a task instance

* common: ngram map, config self-speculative decoding

* common: add enum common_speculative_type

* common: add vector of speculative states

* common: add option --spec-draftless

* server: cleanup (remove slot.batch_spec, rename)

* common: moved self-spec impl to ngram-map

* common: cleanup (use common_speculative_state_draft)

* spec : refactor

* cont : naming

* spec: remove --spec-config

* doc: (draftless) speculative decoding

* common: print performance in spec decoding

* minor : cleanup

* common : better names

* minor : cleanup + fix build

* minor: comments

* CODEOWNERS: add common/ngram-map.* (#18471)

* common : rename speculative.draftless_type -> speculative.type

* ngram-map : fix uninitialized values

* ngram-map : take into account the input can become shorter

* ngram-map : revert len check for now

* arg : change `--spec-draftless` -> `--spec-type`

* spec : add common_speculative_state::accept()

* spec : refactor + add common_speculative_begin()

* spec : fix begin() call with mtmd

* spec : additional refactor + remove common_speculative_params

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-28 19:42:42 +02:00
Sigbjørn Skjæret 60368e1d73
jinja : undefined should be treated as sequence/iterable (return string/array) by filters/tests (#19147)
* undefined is treated as iterable (string/array) by filters

`tojson` is not a supported `undefined` filter

* add tests

* add sequence and iterable tests

keep it DRY and fix some types
2026-01-28 14:40:29 +01:00
Georgi Gerganov 631cbfcc7a
cuda : fix "V is K view" check for non-unified KV cache (#19145) 2026-01-28 09:15:27 +02:00
Georgi Gerganov c5c64f72ac
llama : disable Direct IO by default (#19109)
* llama : disable Direct IO by default

* cont : override mmap if supported
2026-01-28 09:11:13 +02:00
Sigbjørn Skjæret 2b4cbd2834
jinja : implement mixed type object keys (#18955)
* implement mixed type object keys

* add tests

* refactor

* minor fixes

* massive refactor

* add more tests

* forgotten tuples

* fix array/object is_hashable

* correct (albeit broken) jinja responses

verified with transformers

* improved hashing and equality

* refactor hash function

* more exhaustive test case

* clean up

* cont

* cont (2)

* missing cstring

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-01-27 19:50:42 +01:00
Daniel Bevenius fc3cdf32ce
common : clarify HTTPS build options in error message (#19103)
* common : clarify HTTPS build options in error message

This commit updates the https error message to provide clearer
instructions for users who encounter the "HTTPS is not supported" error.

The motivation for this is that it might not be clear to users that only
one of these options is needed to enable HTTPS support.
The LLAMA_OPENSSL option is also added to the message to cover all
possible build configurations.

* clarify that OpenSSL is the default for HTTPS support
2026-01-27 06:16:00 +01:00