Piotr Wilkin
e501e1dec9
Basic universal PEG parser wrapper with tag-to-dict based extractor
2026-02-14 00:56:22 +01:00
Piotr Wilkin
0884aad1c5
Remove stupid LLM-generated method comment headers (yeah, we can see what the method name is, thank you very much)
2026-02-14 00:37:33 +01:00
Piotr Wilkin
61e18cad3f
Create basic content parser if no parser definition found
2026-02-14 00:26:17 +01:00
Piotr Wilkin
3605e78569
Refactor into class-based approach
2026-02-14 00:17:43 +01:00
Piotr Wilkin
6415d0f03f
Add TODO
2026-02-13 14:42:26 +01:00
Piotr Wilkin
24cc1bcd6d
Clean algorithm for calculate_diff_split; fix buggy expectations
2026-02-13 03:17:20 +01:00
Piotr Wilkin
e772822011
Whitespace
2026-02-13 00:55:56 +01:00
Piotr Wilkin
28fcef67c0
-> Refactor autoparser analyzer structure
-> Fix content truncation
-> Fix errors in capability detection due to non-empty assistant message
-> Add missing debug prints for Jinja
2026-02-13 00:55:35 +01:00
Piotr Wilkin
822fd2bee9
Whoops
2026-02-12 17:22:59 +01:00
Piotr Wilkin
3096ecaa95
One more crazy spacing out
2026-02-11 23:44:52 +01:00
Piotr Wilkin
e40d4cd706
Get rid of some crazy formatting
2026-02-11 22:53:02 +01:00
Piotr Wilkin
56ca124850
Document helpers
2026-02-11 22:42:16 +01:00
Piotr Wilkin
efc52dadc8
Add compilation guard to fix Windows compilation errors
2026-02-11 13:47:30 +01:00
Piotr Wilkin
d69ec41ee0
Post-merge adapt
2026-02-11 13:47:30 +01:00
Piotr Wilkin
e590f31f67
Revert obsolete server-context change
2026-02-11 13:47:30 +01:00
Piotr Wilkin
29ce31b1a3
Fix windows build
2026-02-11 13:47:30 +01:00
Piotr Wilkin
92acde0890
Regenerate documentation
2026-02-11 13:47:30 +01:00
Piotr Wilkin
bd549b3b37
Fix case with object inside object, refactor long methods.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
2081e9b056
Fix number partial parsing issue
2026-02-11 13:47:29 +01:00
Piotr Wilkin
b260de1d86
More edge cases
2026-02-11 13:47:29 +01:00
Piotr Wilkin
60717b3e5a
Fix pesky issue on optional trailing arguments in function calls for TAGGED format
2026-02-11 13:47:29 +01:00
Piotr Wilkin
c2f6fc3a17
Remove [[noreturn]] as it causes compilation problems on Mac.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
15f7aa1fbe
We don't like segfaults (or failing tests).
2026-02-11 13:47:29 +01:00
Piotr Wilkin
f71ae707ba
Fix minor regressions, add [[noreturn]] attrib
2026-02-11 13:47:29 +01:00
Piotr Wilkin
09b447a487
Fix incorrect coercion of strings to non-string types during parsing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
a01e15280a
Feeding the hungry editor checker god.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
384cafc98b
Fix error in argument processing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
3770566c45
Revert bad change, fix some templates and most tests
2026-02-11 13:47:29 +01:00
Piotr Wilkin
9ba9a94819
More robust reasoning detection
2026-02-11 13:47:29 +01:00
Piotr Wilkin
80b7e161ff
Fix reasoning detection
2026-02-11 13:47:29 +01:00
Piotr Wilkin
b0853baca7
Quick vibe-coded fix for proper object printing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
c7029f858d
Missed this.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
1662fa5bea
ANOTHER GIANT POST-FIXUP SQUISH
2026-02-11 13:47:29 +01:00
Piotr Wilkin
7e6f75a414
THE GIANT AUTOPARSER SQUISH
2026-02-11 13:47:29 +01:00
Piotr Wilkin
571805b348
Make call IDs nine-character
2026-02-11 13:47:29 +01:00
Piotr Wilkin
93f0cc05de
Fix sanitizer warnings
2026-02-11 13:47:29 +01:00
Piotr Wilkin
96316496d5
Fix bad typo
2026-02-11 13:47:29 +01:00
Piotr Wilkin
9a3ac05157
Add workaround for templates requiring non-null content
2026-02-11 13:47:29 +01:00
Johannes Gäßler
ada90bf2ba
docs: ban AI for issues and discussions [no CI] (#19512)
2026-02-11 12:49:40 +01:00
Adrien Gallouët
0c1f39a9ae
common : improve download error reporting (#19491)
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
2026-02-11 09:27:55 +01:00
Max Krasnyansky
73cd5e1b97
hexagon: Add ARGSORT, DIV, SQR, SQRT, SUM_ROWS, GEGLU (#19406)
* hexagon: add ARGSORT op
Co-authored-by: Yarden Tal <yardent@qti.qualcomm.com>
* hexagon: argsort reject tensors with huge rows for now
* Adding support for DIV,SQR,SQRT,SUM_ROWS ops in hexagon backend
* hexagon : Add GEGLU op
* hexagon: fix editor config check
* hexagon: rewrite and optimize binary ops ADD/SUB/MUL/DIV/ADD_ID to use DMA
---------
Co-authored-by: Yarden Tal <yardent@qti.qualcomm.com>
Co-authored-by: Manohara Hosakoppa Krishnamurthy <mhosakop@qti.qualcomm.com>
2026-02-10 23:21:12 -08:00
thecaptain789
8ee538ce73
llama : correct typos 'occured' and 'occurences' (#19414)
Co-authored-by: thecaptain789 <thecaptain789@users.noreply.github.com>
2026-02-11 07:05:31 +01:00
Georgi Gerganov
6d95707827
model : fix wavtokenizer embedding notions (#19479)
2026-02-11 07:52:20 +02:00
Georgi Gerganov
89181c0b6d
ggml : extend bin bcast for permuted src1 (#19484)
* tests : extend bin bcast for permuted src1
* cont : extend bin support
* cont : s0 is always 1
* tests : simplify
2026-02-11 07:52:00 +02:00
Georgi Gerganov
ceaa89b786
metal : consolidate unary ops (#19490)
2026-02-11 07:51:12 +02:00
Daniel Bevenius
2cce9fddb7
llama : refactor sampling_info to use buffer_view template (#19368)
* llama : refactor sampling_info to use buffer_view template
This commit updates the sampling_info struct in llama-context to use a
buffer_view template for the logits, probs, sampled tokens, and
candidates buffers.
The motivation for this is to simplify the code, improve type safety
and readability.
2026-02-11 05:38:13 +01:00
Oliver Simons
612db61886
CUDA : Update CCCL-tag for 3.2 to final release from RC (#19486)
CCCL 3.2 has been released since it was added to llama.cpp as part of
the backend-sampling PR, and it makes sense to update from RC to final
released version.
https://github.com/NVIDIA/cccl/releases/tag/v3.2.0
2026-02-10 22:31:19 +01:00
Nikhil Jain
57487a64c8
[WebGPU] Plug memory leaks and free resources on shutdown (#19315)
* Fix memory leaks in shader lib, backend, backend_context, buffer_context, and webgpu_buf_pool
* Free pools
* Cleanup
* More cleanup
* Run clang-format
* Fix arg-parser and tokenizer test errors that free an unallocated buffer
* Fix device lost callback to not print on device teardown
* Fix include and run clang-format
* remove unused unused
* Update binary ops
---------
Co-authored-by: Reese Levine <reeselevine1@gmail.com>
2026-02-10 08:04:00 -08:00
JJJYmmm
fc0fe40049
models : support qwen3.5 series (#19468)
* support qwen3.5 series
* remove deepstack for now, and some code clean
* code clean
* add FULL_ATTENTION_INTERVAL metadata
* code clean
* reorder v heads for linear attention to avoid expensive interleaved repeat
2026-02-10 18:00:26 +02:00
Xuan-Son Nguyen
9a96352729
test: fix IMROPE perf test case (#19465)
2026-02-10 14:37:50 +01:00