llama.cpp

hksdpc255 1920345c3b	2025-11-18 18:54:15 +01:00
common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932)

* Add files via upload
* fix unit test
* fix crashes for --reasoning-format=none
* Patch buggy official MiniMax-M2 chat template
* add upstream minja fix: https://github.com/ochafik/minja/pull/7
* Fix <think> token not generated
* add test copied from https://github.com/ggml-org/llama.cpp/pull/16946
* cleanup
* Hopes to fix the compilation error on CI
* Delete chat template patching since it’s fixed by upstream Minja
* Remove undeeded Minimax-M2 template patch
  https://github.com/ochafik/minja/pull/7#issuecomment-3480356100
* Add proper handling of optional parameters with test
  merged tests from: `23d4bb75c4`
* Fix making all tool parameters optional
* Move xml tool parser to separate file
* cleanup & add tests for GLM4.5
* add streaming tests & enhancement & cleanups
  Add streaming test for both GLM 4.5 and minimax-m2.
  Cleanup for preserved_tokens.
  Cleanup for grammar rule name.
  Enhance the parser's stability.
* cleanup & add support for Kimi-K2 Qwen3-Coder Apriel-1.5 Xiaomi-MiMo
* apply suggestions from reviewers
* fix a misuse for data.grammar_lazy
* fix grammar when tool have no argument
* Fix `no triggers set for lazy grammar!` for GLM4.5/4.6. Insert additional stops for Kimi-K2
* update chat.cpp
* fix grammar for GLM 4.5/4.6
* Try fix Jinja template for GLM
* Try fix GLM-4.6.jinja
* Update common/chat-parser-xml-toolcall.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* Update tests/test-chat.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
* improve chat template for GLM, rename Kimi-K2 template to Kimi-K2-Thinking
* Improve Kimi-K2 chat template
* Fix unit test
* Fix "Invalid tool call arguments passed" in a rare case.
  In a rare case, the model may emit a raw string that begins with a valid JSON string. This commit adds unit tests to cover that scenario and fixes the regression introduced during the Kimi-K2 adaptation.

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>

templates	common : Generalized XML-style tool-call parsing with streaming support (GLM 4.5/4.6 + MiniMax M2 + SeedOSS + Kimi-K2 + Qwen3-Coder + Apriel-1.5 + Xiaomi-MiMo) (#16932)	2025-11-18 18:54:15 +01:00
.editorconfig	gguf : new file format with flexible meta data (beta) (#2398)	2023-08-21 23:07:43 +03:00
ggml-vocab-aquila.gguf	Work on the BPE tokenizer (#3252)	2023-10-03 09:16:26 +02:00
ggml-vocab-baichuan.gguf	Add more tokenizer tests (#3742)	2023-10-24 09:17:17 +02:00
ggml-vocab-bert-bge.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-bert-bge.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-bert-bge.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-command-r.gguf	command-r : add BPE pre-tokenization (#7063)	2024-05-05 08:19:30 +03:00
ggml-vocab-command-r.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-command-r.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-deepseek-coder.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-deepseek-coder.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-deepseek-coder.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-deepseek-llm.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-deepseek-llm.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-deepseek-llm.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-falcon.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-falcon.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-falcon.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-gpt-2.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-gpt-2.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-gpt-2.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-gpt-neox.gguf	Add more tokenizer tests (#3742)	2023-10-24 09:17:17 +02:00
ggml-vocab-llama-bpe.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-llama-bpe.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-llama-bpe.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-llama-spm.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-llama-spm.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-llama-spm.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-mpt.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-mpt.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-mpt.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-nomic-bert-moe.gguf	tests : improve UGM tokenizer test coverage (#13773)	2025-05-25 16:22:29 +02:00
ggml-vocab-phi-3.gguf	Per token attributes (#7685)	2024-06-04 09:17:17 +02:00
ggml-vocab-phi-3.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-phi-3.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-qwen2.gguf	llama : add BPE pre-tokenization for Qwen2 (#7114)	2024-05-08 15:06:43 +03:00
ggml-vocab-qwen2.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-qwen2.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-refact.gguf	tests : add test-tokenizer-0.sh + fix some tokenizers (#7036)	2024-05-04 08:32:32 +03:00
ggml-vocab-refact.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-refact.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-starcoder.gguf	llama : fix BPE pre-tokenization (#6920)	2024-04-29 16:58:41 +03:00
ggml-vocab-starcoder.gguf.inp	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00
ggml-vocab-starcoder.gguf.out	convert : allow partial update to the chkhsh pre-tokenizer list (#13847)	2025-05-30 12:24:37 +02:00