This program verifies that a given GGUF model file can tokenize every valid character. Because llama.cpp currently raises an exception when a string cannot be tokenized [1], this tool helps verify that valid ASCII and UTF-8 input is always tokenized properly.

[1] https://github.com/ggerganov/llama.cpp/issues/2580

| File |
|---|
| CMakeLists.txt |
| tokenizer-verifier.cpp |
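
The check described above can be illustrated with a minimal sketch. This is not the code in `tokenizer-verifier.cpp`; it only shows one plausible shape for such a check. It walks every Unicode scalar value, encodes it as UTF-8, and records the code points for which a tokenizer throws. The names `encode_utf8`, `tokenize_fn`, and `find_untokenizable` are hypothetical, and the tokenizer is passed in as a callback so the sketch makes no assumption about the llama.cpp API.

```cpp
#include <cstdint>
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Encode one Unicode code point as UTF-8.
// Assumes cp is a valid scalar value (<= 0x10FFFF and not a surrogate).
std::string encode_utf8(uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical tokenizer signature (an assumption for this sketch):
// takes UTF-8 text, returns token ids, and throws when tokenization fails.
using tokenize_fn = std::function<std::vector<int>(const std::string &)>;

// Return every code point in [first, last] whose UTF-8 encoding makes the
// tokenizer throw. Surrogates (U+D800..U+DFFF) are skipped because they are
// not valid in UTF-8.
std::vector<uint32_t> find_untokenizable(const tokenize_fn & tokenize,
                                         uint32_t first = 0x0,
                                         uint32_t last  = 0x10FFFF) {
    std::vector<uint32_t> failures;
    for (uint32_t cp = first; cp <= last; ++cp) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            continue;
        }
        try {
            tokenize(encode_utf8(cp));
        } catch (const std::exception &) {
            failures.push_back(cp);
        }
    }
    return failures;
}
```

In a real verifier the callback would wrap the model's tokenizer, and an empty result would indicate that every single-character input in the range could be tokenized. Testing one code point at a time keeps failures easy to attribute, at the cost of not exercising multi-character sequences.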