Commit Graph

668 Commits

Author SHA1 Message Date
Piotr Wilkin (ilintar) a4230e169e
Merge e384c6fefe into 0ccbfdef3e 2026-02-14 00:28:58 +00:00
Piotr Wilkin e384c6fefe Add "marker" PEG parser + sample in analyzer 2026-02-14 01:28:53 +01:00
Piotr Wilkin e501e1dec9 Basic universal PEG parser wrapper with tag-to-dict based extractor 2026-02-14 00:56:22 +01:00
Piotr Wilkin 3605e78569 Refactor into class-based approach 2026-02-14 00:17:43 +01:00
ymcki 0e21991472
fix vulkan ggml_acc only works in 3d but not 4d (#19426)
* fix vulkan ggml_acc only works in 3d but not 4d

* removed clamp in test_acc_block

* use the correct stride and its test case

* cuda : fix "supports op" condition

* change src0 to src1 in ggml_vk_acc. Update acc.comp with jeffbolznv's suggestion except to keep the boundary check

* version without boundary check

* revert back to boundary check version

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2026-02-13 13:31:37 +01:00
Georgi Gerganov 490eb96b88
metal : support GGML_OP_SET (#19548) 2026-02-13 07:34:52 +02:00
Piotr Wilkin 24cc1bcd6d Clean algorithm for calculate_diff_split; fix buggy expectations 2026-02-13 03:17:20 +01:00
Piotr Wilkin 28fcef67c0 -> Refactor autoparser analyzer structure
-> Fix content truncation
-> Fix errors in capability detection due to non-empty assistant message
-> Add missing debug prints for Jinja
2026-02-13 00:55:35 +01:00
Georgi Gerganov 3b3a948134
metal : update sum_rows kernel to support float4 (#19524) 2026-02-12 11:35:28 +02:00
Georgi Gerganov 914dde72ba
ggml : unary ops support non-cont src0 + metal F16 unary ops (#19511)
* ggml : unary ops support non-cont src0

* metal : support F16 unary ops + fix ELU
2026-02-11 18:58:43 +02:00
Piotr Wilkin 29ce31b1a3 Fix windows build 2026-02-11 13:47:30 +01:00
Piotr Wilkin bd549b3b37 Fix case with object inside object, refactor long methods. 2026-02-11 13:47:29 +01:00
Piotr Wilkin 2081e9b056 Fix number partial parsing issue 2026-02-11 13:47:29 +01:00
Piotr Wilkin b260de1d86 More edge cases 2026-02-11 13:47:29 +01:00
Piotr Wilkin 60717b3e5a Fix pesky issue on optional trailing arguments in function calls for TAGGED format 2026-02-11 13:47:29 +01:00
Piotr Wilkin 15f7aa1fbe We don't like segfaults (or failing tests). 2026-02-11 13:47:29 +01:00
Piotr Wilkin 09b447a487 Fix incorrect coercion of strings to non-string types during parsing 2026-02-11 13:47:29 +01:00
Piotr Wilkin a01e15280a Feeding the hungry editor checker god. 2026-02-11 13:47:29 +01:00
Piotr Wilkin 3770566c45 Revert bad change, fix some templates and most tests 2026-02-11 13:47:29 +01:00
Piotr Wilkin b0853baca7 Quick vibe-coded fix for proper object printing 2026-02-11 13:47:29 +01:00
Piotr Wilkin 1662fa5bea ANOTHER GIANT POST-FIXUP SQUISH 2026-02-11 13:47:29 +01:00
Piotr Wilkin 7e6f75a414 THE GIANT AUTOPARSER SQUISH 2026-02-11 13:47:29 +01:00
Georgi Gerganov 89181c0b6d
ggml : extend bin bcast for permuted src1 (#19484)
* tests : extend bin bcast for permuted src1

* cont : extend bin support

* cont : s0 is always 1

* tests : simplify
2026-02-11 07:52:00 +02:00
Georgi Gerganov ceaa89b786
metal : consolidate unary ops (#19490) 2026-02-11 07:51:12 +02:00
Xuan-Son Nguyen 9a96352729
test: fix IMROPE perf test case (#19465) 2026-02-10 14:37:50 +01:00
Georgi Gerganov a0d585537c
cuda : extend GGML_OP_PAD to work with non-cont src0 (#19429)
* cuda : extend GGML_OP_PAD to work with non-cont src0

* tests : add permuted pad
2026-02-10 08:07:16 +02:00
Hugo 1e8924fd65
cmake : add variable to skip installing tests (#19370)
When packaging downstream, there's usually little point in installing
tests. The default behaviour remains the same.
2026-02-09 07:12:02 +01:00
Jeff Bolz db6adb3c88
tests: reduce number of FA test permutations (#19381)
Only test non-F16 for head size 64 and 72 (one a multiple of QK, one not).
2026-02-06 08:50:30 -06:00
Jeff Bolz 449ec2ab07
vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)
Write out a 2-bit code per block and avoid loading the mask when it
matches these two common cases.

Apply this optimization when the mask is relatively large (i.e. prompt
processing).
2026-02-05 09:26:38 -06:00
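
The mask preprocessing described in 449ec2ab07 can be illustrated with a small CPU-side sketch. This is not the Vulkan shader from the commit: the code values, the block size, and the packing of sixteen 2-bit codes per 32-bit word are all assumptions made for illustration, and the names `classify_block` and `preprocess_mask` are hypothetical.

```python
import math

# Hypothetical 2-bit code values; the commit does not state the real encoding.
MIXED = 0        # block must be loaded and applied as usual
ALL_NEG_INF = 1  # every mask entry is -inf, so the block contributes nothing
ALL_ZERO = 2     # every mask entry is 0, so the mask add can be skipped


def classify_block(block):
    """Return the 2-bit code for one non-empty block of mask values."""
    if all(v == -math.inf for v in block):
        return ALL_NEG_INF
    if all(v == 0.0 for v in block):
        return ALL_ZERO
    return MIXED


def preprocess_mask(mask, block_size):
    """Classify a flat mask per block and pack the codes, 16 per 32-bit word."""
    codes = [
        classify_block(mask[i:i + block_size])
        for i in range(0, len(mask), block_size)
    ]
    packed = []
    for i in range(0, len(codes), 16):
        word = 0
        for j, code in enumerate(codes[i:i + 16]):
            word |= code << (2 * j)  # code j occupies bits 2j and 2j+1
        packed.append(word)
    return codes, packed


if __name__ == "__main__":
    mask = [-math.inf] * 4 + [0.0] * 4 + [0.0, -math.inf, 0.0, 0.0]
    codes, packed = preprocess_mask(mask, block_size=4)
    print(codes, packed)
```

A consumer of such codes would only read the full mask for blocks marked `MIXED`. Per the commit message, this pays off when the mask is relatively large, as in prompt processing.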
Georgi Gerganov eaba92c3dc
tests : add non-cont, inplace rope tests (#19296)
* tests : add non-cont, inplace rope tests

* cont : exercise dim 3

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>

* cont : more dim3 exercises

---------

Co-authored-by: Jeff Bolz <jbolz@nvidia.com>
2026-02-04 12:45:21 +02:00
Sid Mohan 0dfcd3b607
jinja : add missing 'in' test to template engine (#19004) (#19239)
* jinja : add missing 'in' test to template engine (#19004)

The jinja template parser was missing the 'in' test from
global_builtins(), causing templates using reject("in", ...),
select("in", ...), or 'x is in(y)' to fail with
"selectattr: unknown test 'in'".

This broke tool-calling for Qwen3-Coder and any other model
whose chat template uses the 'in' test.

Added test_is_in supporting array, string, and object containment
checks, mirroring the existing 'in' operator logic in runtime.cpp.

Includes test cases for all three containment types plus
reject/select filter usage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* reuse test_is_in in binary op

---------

Co-authored-by: Sid Mohan <sidmohan0@users.noreply.github.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-02-02 21:00:55 +01:00
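
The containment semantics described in 0dfcd3b607 (array, string, and object checks, reused by `reject` and `select`) can be sketched in plain Python. This is a rough model of the behaviour, not the C++ `test_is_in` from the commit: the helper names `is_in`, `reject_in`, and `select_in` are hypothetical, and treating object containment as key membership is an assumption based on how Jinja2 behaves in Python.

```python
def is_in(needle, container):
    """Containment check covering the three container kinds in the commit."""
    if isinstance(container, str):
        # String containment is a substring check; non-strings never match.
        return isinstance(needle, str) and needle in container
    if isinstance(container, dict):
        # Assumption: object containment means "is a key of the object".
        return needle in container
    if isinstance(container, (list, tuple)):
        # Array containment compares elements by equality.
        return needle in container
    raise TypeError(f"'in' test is not defined for {type(container).__name__}")


def reject_in(items, container):
    """Rough model of `items | reject("in", container)`."""
    return [item for item in items if not is_in(item, container)]


def select_in(items, container):
    """Rough model of `items | select("in", container)`."""
    return [item for item in items if is_in(item, container)]


if __name__ == "__main__":
    print(reject_in(["a", "b", "c"], ["b"]))  # ['a', 'c']
    print(select_in(["a", "b", "c"], ["b"]))  # ['b']
```

Sharing one predicate between the test and the filters mirrors the follow-up bullet in the commit ("reuse test_is_in in binary op"), so the operator and the test cannot drift apart.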
Aman Gupta 9f682fb640
ggml-cpu: FA split across kv for faster TG (#19209)
* ggml-cpu: split across kv for faster TG

* simplify sinks application

* add ref impl
2026-02-03 01:19:55 +08:00
Christian Kastner 7a4ca3cbd9
docs : Minor cleanups (#19252)
* Update old URLs to github.com/ggml-org/

* Bump copyrights
2026-02-02 08:38:55 +02:00
Georgi Gerganov c3b87cebff
tests : add GQA=20 FA test (#19095) 2026-01-30 13:52:57 +02:00
Aldehir Rojas 7b7ae857f6
chat : add parsing for solar-open-100b (#18540)
* chat : add parsing for solar-open-100b

* add comments to rules

* cont : make assistant start optional

* cont : remove assistant start prefix altogether

---------

Co-authored-by: Piotr Wilkin (ilintar) <piotr.wilkin@syndatis.com>
2026-01-29 16:06:15 +01:00
Sigbjørn Skjæret b45ef2702c
jinja : do not pass empty tools and add some none filters (#19176) 2026-01-29 14:06:54 +01:00
Sigbjørn Skjæret 60368e1d73
jinja : undefined should be treated as sequence/iterable (return string/array) by filters/tests (#19147)
* undefined is treated as iterable (string/array) by filters

`tojson` is not a supported `undefined` filter

* add tests

* add sequence and iterable tests

keep it DRY and fix some types
2026-01-28 14:40:29 +01:00
Sigbjørn Skjæret 2b4cbd2834
jinja : implement mixed type object keys (#18955)
* implement mixed type object keys

* add tests

* refactor

* minor fixes

* massive refactor

* add more tests

* forgotten tuples

* fix array/object is_hashable

* correct (albeit broken) jinja responses

verified with transformers

* improved hashing and equality

* refactor hash function

* more exhaustive test case

* clean up

* cont

* cont (2)

* missing cstring

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
2026-01-27 19:50:42 +01:00
Johannes Gäßler b0311c16d2
CUDA: fix padding of GQA to power of 2 in FA (#19115) 2026-01-26 23:24:58 +01:00
Johannes Gäßler 4e5b83b226
GGUF: check that tensor size is representable (#19072) 2026-01-24 21:57:51 +01:00
Xuan-Son Nguyen 51fa458a92
server : support preserving reasoning_content in assistant message (#18994)
* support reasoning_content input

* report template caps to webui

* add docs

* rm commented code
2026-01-22 21:30:06 +01:00
Georgi Gerganov a5eaa1d6a3
mla : make the V tensor a view of K (#18986)
* mla : pass V as a view of K to the FA op

* cuda : adjust mla logic to new layout

* kv-cache : fix rope shift

* tests : remove comment

* cuda : fix reusable_cutoff

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>

---------

Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
2026-01-22 22:09:01 +02:00
Piotr Wilkin (ilintar) c301172f66
jinja: support none|string (#18995)
* jinja: support none|string

* Update common/jinja/value.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Update tests/test-jinja.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* Add as_string()

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-21 19:24:37 +01:00
Jeff Bolz 33f890e579
vulkan: support flash attention GQA/split_k with small batches (#18938) 2026-01-21 17:43:43 +01:00
Xuan-Son Nguyen 2c1f199653
cli : fix reasoning responses in CLI (#18961)
* cli : fix reasoning responses in CLI

* fix build

* fix build (2)
2026-01-20 18:23:25 +01:00
Sigbjørn Skjæret 959ecf7f23
jinja : fix undefined keys and attributes and int/float as bool (#18924)
* fix undefined keys and attributes

* add falsy tests

* as_bool for integers and floats

* more falsy/truthy tests

* --typo
2026-01-19 20:29:43 +01:00
Sigbjørn Skjæret 4037093c66
ci : run test-jinja -py on high perf [no ci] (#18916) 2026-01-19 20:29:15 +01:00
Xuan-Son Nguyen fe44d35574
tests : add test-jinja -py option for cross-checking (#18906)
* tests : add test-jinja -py option for cross-checking

* Update tests/test-jinja.cpp

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

* fix + add source

* SandboxedEnvironment

* fix array.map case

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
2026-01-18 08:14:27 +01:00
Sigbjørn Skjæret d03c45c9c5
jinja : attribute support for join, map and sort (#18883)
* support negative array index and default value

* attribute support (int and str) for join, map and sort

* add tests

* update CODEOWNERS

* improve fixme sorting comment
2026-01-18 02:53:01 +01:00
Sigbjørn Skjæret 10c98cbdf6
jinja : add missing tojson filter for bool (#18900)
* add missing tojson for bool

* add more literal tests
2026-01-18 01:05:09 +01:00