Piotr Wilkin (ilintar)
a4230e169e
Merge e384c6fefe into 0ccbfdef3e
2026-02-14 00:28:58 +00:00
Piotr Wilkin
e384c6fefe
Add "marker" PEG parser + sample in analyzer
2026-02-14 01:28:53 +01:00
Piotr Wilkin
e501e1dec9
Basic universal PEG parser wrapper with tag-to-dict based extractor
2026-02-14 00:56:22 +01:00
Piotr Wilkin
3605e78569
Refactor into class-based approach
2026-02-14 00:17:43 +01:00
ymcki
0e21991472
fix vulkan ggml_acc only works in 3d but not 4d ( #19426 )
...
* fix vulkan ggml_acc only works in 3d but not 4d
* removed clamp in test_acc_block
* use the correct stride and its test case
* cuda : fix "supports op" condition
* change src0 to src1 in ggml_vk_acc. Update acc.comp with jeffbolznv's suggestion except to keep the boundary check
* version without boundary check
* revert back to boundary check version
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-02-13 13:31:37 +01:00
Georgi Gerganov
490eb96b88
metal : support GGML_OP_SET ( #19548 )
2026-02-13 07:34:52 +02:00
Piotr Wilkin
24cc1bcd6d
Clean algorithm for calculate_diff_split; fix buggy expectations
2026-02-13 03:17:20 +01:00
Piotr Wilkin
28fcef67c0
-> Refactor autoparser analyzer structure
...
-> Fix content truncation
-> Fix errors in capability detection due to non-empty assistant message
-> Add missing debug prints for Jinja
2026-02-13 00:55:35 +01:00
Georgi Gerganov
3b3a948134
metal : update sum_rows kernel to support float4 ( #19524 )
2026-02-12 11:35:28 +02:00
Georgi Gerganov
914dde72ba
ggml : unary ops support non-cont src0 + metal F16 unary ops ( #19511 )
...
* ggml : unary ops support non-cont src0
* metal : support F16 unary ops + fix ELU
2026-02-11 18:58:43 +02:00
Piotr Wilkin
29ce31b1a3
Fix windows build
2026-02-11 13:47:30 +01:00
Piotr Wilkin
bd549b3b37
Fix case with object inside object, refactor long methods.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
2081e9b056
Fix number partial parsing issue
2026-02-11 13:47:29 +01:00
Piotr Wilkin
b260de1d86
More edge cases
2026-02-11 13:47:29 +01:00
Piotr Wilkin
60717b3e5a
Fix pesky issue on optional trailing arguments in function calls for TAGGED format
2026-02-11 13:47:29 +01:00
Piotr Wilkin
15f7aa1fbe
We don't like segfaults (or failing tests).
2026-02-11 13:47:29 +01:00
Piotr Wilkin
09b447a487
Fix incorrect coercion of strings to non-string types during parsing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
a01e15280a
Feeding the hungry editor checker god.
2026-02-11 13:47:29 +01:00
Piotr Wilkin
3770566c45
Revert bad change, fix some templates and most tests
2026-02-11 13:47:29 +01:00
Piotr Wilkin
b0853baca7
Quick vibe-coded fix for proper object printing
2026-02-11 13:47:29 +01:00
Piotr Wilkin
1662fa5bea
ANOTHER GIANT POST-FIXUP SQUISH
2026-02-11 13:47:29 +01:00
Piotr Wilkin
7e6f75a414
THE GIANT AUTOPARSER SQUISH
2026-02-11 13:47:29 +01:00
Georgi Gerganov
89181c0b6d
ggml : extend bin bcast for permuted src1 ( #19484 )
...
* tests : extend bin bcast for permuted src1
* cont : extend bin support
* cont : s0 is always 1
* tests : simplify
2026-02-11 07:52:00 +02:00
Georgi Gerganov
ceaa89b786
metal : consolidate unary ops ( #19490 )
2026-02-11 07:51:12 +02:00
Xuan-Son Nguyen
9a96352729
test: fix IMROPE perf test case ( #19465 )
2026-02-10 14:37:50 +01:00
Georgi Gerganov
a0d585537c
cuda : extend GGML_OP_PAD to work with non-cont src0 ( #19429 )
...
* cuda : extend GGML_OP_PAD to work with non-cont src0
* tests : add permuted pad
2026-02-10 08:07:16 +02:00
Hugo
1e8924fd65
cmake : add variable to skip installing tests ( #19370 )
...
When packaging downstream, there's usually little point in installing
tests. The default behaviour remains the same.
2026-02-09 07:12:02 +01:00
Jeff Bolz
db6adb3c88
tests: reduce number of FA test permutations ( #19381 )
...
Only test non-F16 for head size 64 and 72 (one a multiple of QK, one not).
2026-02-06 08:50:30 -06:00
Jeff Bolz
449ec2ab07
vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. ( #19281 )
...
Write out a 2-bit code per block and avoid loading the mask when it
matches these two common cases.
Apply this optimization when the mask is relatively large (i.e. prompt
processing).
2026-02-05 09:26:38 -06:00
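A minimal sketch of the block classification described in the commit above, written in Python rather than as a Vulkan shader. The code values, block shape, and function names are assumptions made for illustration; the commit message does not specify the actual encoding or block size.

```python
import math

# Hypothetical 2-bit codes; the real encoding is not given in the commit message.
MIXED = 0        # block must be loaded and applied as usual
ALL_NEG_INF = 1  # every entry is -inf: the block is fully masked out
ALL_ZERO = 2     # every entry is 0: the mask has no effect on this block


def classify_block(block):
    """Return the 2-bit code for one rectangular block of mask values."""
    values = [v for row in block for v in row]
    if all(v == -math.inf for v in values):
        return ALL_NEG_INF
    if all(v == 0.0 for v in values):
        return ALL_ZERO
    return MIXED


def preprocess_mask(mask, block_rows, block_cols):
    """Split a 2D mask into blocks and emit one code per block."""
    codes = []
    for r in range(0, len(mask), block_rows):
        row_codes = []
        for c in range(0, len(mask[0]), block_cols):
            block = [row[c:c + block_cols] for row in mask[r:r + block_rows]]
            row_codes.append(classify_block(block))
        codes.append(row_codes)
    return codes


# A 4x4 causal mask: 0 on and below the diagonal, -inf above it.
causal = [[0.0 if j <= i else -math.inf for j in range(4)] for i in range(4)]
codes = preprocess_mask(causal, 2, 2)
```

With a causal mask, only the blocks crossing the diagonal come out as MIXED, so a consumer of the codes could skip loading the mask for every other block. This is presumably why the commit applies the optimization only when the mask is relatively large.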
Georgi Gerganov
eaba92c3dc
tests : add non-cont, inplace rope tests ( #19296 )
...
* tests : add non-cont, inplace rope tests
* cont : exercise dim 3
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
* cont : more dim3 exercises
---------
Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
2026-02-04 12:45:21 +02:00
Sid Mohan
0dfcd3b607
jinja : add missing 'in' test to template engine ( #19004 ) ( #19239 )
...
* jinja : add missing 'in' test to template engine (#19004)
The jinja template parser was missing the 'in' test from
global_builtins(), causing templates using reject("in", ...),
select("in", ...), or 'x is in(y)' to fail with
"selectattr: unknown test 'in'".
This broke tool-calling for Qwen3-Coder and any other model
whose chat template uses the 'in' test.
Added test_is_in supporting array, string, and object containment
checks, mirroring the existing 'in' operator logic in runtime.cpp.
Includes test cases for all three containment types plus
reject/select filter usage.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* reuse test_is_in in binary op
---------
Co-authored-by: Sid Mohan <sidmohan0@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-02-02 21:00:55 +01:00
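The containment semantics described in the commit above can be sketched in Python. This is an illustrative stand-in, not the C++ code from the commit: the names is_in, reject_in, and select_in are hypothetical, and treating object containment as key membership is an assumption based on how Python and Jinja2 handle dictionaries.

```python
def is_in(value, container):
    """Containment check over the three container kinds named in the commit."""
    if isinstance(container, str):
        # String containment: substring check, only meaningful for string values.
        return isinstance(value, str) and value in container
    if isinstance(container, dict):
        # Object containment: assumed here to mean key membership.
        return value in container
    if isinstance(container, (list, tuple)):
        # Array containment: element equality.
        return value in container
    return False


def reject_in(items, container):
    """Rough equivalent of items | reject("in", container)."""
    return [x for x in items if not is_in(x, container)]


def select_in(items, container):
    """Rough equivalent of items | select("in", container)."""
    return [x for x in items if is_in(x, container)]


tools = ["search", "calc", "browse"]
disabled = ["calc"]
kept = reject_in(tools, disabled)
dropped = select_in(tools, disabled)
```

Sharing one containment helper between the test and the binary operator, as the follow-up bullet "reuse test_is_in in binary op" suggests, keeps the two code paths from drifting apart.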
Aman Gupta
9f682fb640
ggml-cpu: FA split across kv for faster TG ( #19209 )
...
* ggml-cpu: split across kv for faster TG
* simplify sinks application
* add ref impl
2026-02-03 01:19:55 +08:00
Christian Kastner
7a4ca3cbd9
docs : Minor cleanups ( #19252 )
...
* Update old URLs to github.com/ggml-org/
* Bump copyrights
2026-02-02 08:38:55 +02:00
Georgi Gerganov
c3b87cebff
tests : add GQA=20 FA test ( #19095 )
2026-01-30 13:52:57 +02:00
Aldehir Rojas
7b7ae857f6
chat : add parsing for solar-open-100b ( #18540 )
...
* chat : add parsing for solar-open-100b
* add comments to rules
* cont : make assistant start optional
* cont : remove assistant start prefix altogether
---------
Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
2026-01-29 16:06:15 +01:00
Sigbjørn Skjæret
b45ef2702c
jinja : do not pass empty tools and add some none filters ( #19176 )
2026-01-29 14:06:54 +01:00
Sigbjørn Skjæret
60368e1d73
jinja : undefined should be treated as sequence/iterable (return string/array) by filters/tests ( #19147 )
...
* undefined is treated as iterable (string/array) by filters
`tojson` is not a supported `undefined` filter
* add tests
* add sequence and iterable tests
keep it DRY and fix some types
2026-01-28 14:40:29 +01:00
Sigbjørn Skjæret
2b4cbd2834
jinja : implement mixed type object keys ( #18955 )
...
* implement mixed type object keys
* add tests
* refactor
* minor fixes
* massive refactor
* add more tests
* forgotten tuples
* fix array/object is_hashable
* correct (albeit broken) jinja responses
verified with transformers
* improved hashing and equality
* refactor hash function
* more exhaustive test case
* clean up
* cont
* cont (2)
* missing cstring
---------
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-01-27 19:50:42 +01:00
Johannes Gäßler
b0311c16d2
CUDA: fix padding of GQA to power of 2 in FA ( #19115 )
2026-01-26 23:24:58 +01:00
Johannes Gäßler
4e5b83b226
GGUF: check that tensor size is representable ( #19072 )
2026-01-24 21:57:51 +01:00
Xuan-Son Nguyen
51fa458a92
server : support preserving reasoning_content in assistant message ( #18994 )
...
* support reasoning_content input
* report template caps to webui
* add docs
* rm commented code
2026-01-22 21:30:06 +01:00
Georgi Gerganov
a5eaa1d6a3
mla : make the V tensor a view of K ( #18986 )
...
* mla : pass V as a view of K to the FA op
* cuda : adjust mla logic to new layout
* kv-cache : fix rope shift
* tests : remove comment
* cuda : fix reusable_cutoff
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2026-01-22 22:09:01 +02:00
Piotr Wilkin (ilintar)
c301172f66
jinja: support none|string ( #18995 )
...
* jinja: support none|string
* Update common/jinja/value.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update tests/test-jinja.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Add as_string()
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-21 19:24:37 +01:00
Jeff Bolz
33f890e579
vulkan: support flash attention GQA/split_k with small batches ( #18938 )
2026-01-21 17:43:43 +01:00
Xuan-Son Nguyen
2c1f199653
cli : fix reasoning responses in CLI ( #18961 )
...
* cli : fix reasoning responses in CLI
* fix build
* fix build (2)
2026-01-20 18:23:25 +01:00
Sigbjørn Skjæret
959ecf7f23
jinja : fix undefined keys and attributes and int/float as bool ( #18924 )
...
* fix undefined keys and attributes
* add falsy tests
* as_bool for integers and floats
* more falsy/truthy tests
* --typo
2026-01-19 20:29:43 +01:00
Sigbjørn Skjæret
4037093c66
ci : run test-jinja -py on high perf [no ci] ( #18916 )
2026-01-19 20:29:15 +01:00
Xuan-Son Nguyen
fe44d35574
tests : add test-jinja -py option for cross-checking ( #18906 )
...
* tests : add test-jinja -py option for cross-checking
* Update tests/test-jinja.cpp
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* fix + add source
* SandboxedEnvironment
* fix array.map case
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-18 08:14:27 +01:00
Sigbjørn Skjæret
d03c45c9c5
jinja : attribute support for join, map and sort ( #18883 )
...
* support negative array index and default value
* attribute support (int and str) for join, map and sort
* add tests
* update CODEOWNERS
* improve fixme sorting comment
2026-01-18 02:53:01 +01:00
Sigbjørn Skjæret
10c98cbdf6
jinja : add missing tojson filter for bool ( #18900 )
...
* add missing tojson for bool
* add more literal tests
2026-01-18 01:05:09 +01:00