This commit updates the debug example to not create the base_callback_data. The motivation for this is that when using `--save-logits`, which is used by the examples/model-conversion scripts, we often don't care about the tensor outputs and they just add noise to the output. With this change the output is quiet by default; we can always remove `--save-logits` to get the tensor outputs when debugging.
README.md
llama.cpp/examples/debug
This is a utility intended to help debug a model by registering a callback that logs GGML operations and tensor data. It can also store the generated logits or embeddings as well as the prompt and token ids for comparison with the original model.
Usage
llama-debug \
--hf-repo ggml-org/models \
--hf-file phi-2/ggml-model-q4_0.gguf \
--model phi-2-q4_0.gguf \
--prompt hello \
--save-logits \
--verbose
The tensor data is logged at debug level and requires the --verbose flag. The
reason for this is that, while useful, the output can be very large for a model
with many layers. You can filter the tensor names using the --tensor-filter option.
A recommended approach is to first run without --verbose and see if the
generated logits/embeddings are close to the original model. If they are not,
then it might be required to inspect tensor by tensor and in that case it is
useful to enable the --verbose flag along with --tensor-filter to focus on
specific tensors.
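To get a feel for how a set of --tensor-filter patterns narrows the output, the selection can be sketched in Python. This is only an illustration under assumptions: it assumes a pattern matches anywhere in the tensor name, that a tensor is kept if any of the given patterns matches, and that no patterns means no filtering. The tensor names below are hypothetical, and the regex dialect used by llama-debug may differ from Python's `re`.

```python
import re


def filter_tensor_names(names, patterns):
    """Return the names matched by at least one regex pattern.

    Assumption: an empty pattern list keeps every name, and a pattern
    may match anywhere in the name (search, not full match).
    """
    if not patterns:
        return list(names)
    compiled = [re.compile(p) for p in patterns]
    return [n for n in names if any(c.search(n) for c in compiled)]


if __name__ == "__main__":
    # Hypothetical tensor names, for illustration only.
    names = ["attn_norm-0", "Qcur-0", "ffn_out-0", "attn_norm-1"]
    print(filter_tensor_names(names, ["^attn_norm", "ffn_out"]))
```

Trying patterns this way before a long run can help keep the --verbose output focused on the tensors of interest.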
Options
This example supports all standard llama.cpp options and also accepts the
following options:
$ llama-debug --help
...
----- example-specific params -----
--save-logits save final logits to files for verification (default: false)
--logits-output-dir PATH directory for saving logits output files (default: data)
--tensor-filter REGEX filter tensor names for debug output (regex pattern, can be specified multiple times)
Output Files
When --save-logits is enabled, the following files are created in the output
directory:
- llamacpp-<model>[-embeddings].bin - Binary output (logits or embeddings)
- llamacpp-<model>[-embeddings].txt - Text output (logits or embeddings, one per line)
- llamacpp-<model>[-embeddings]-prompt.txt - Prompt text and token IDs
- llamacpp-<model>[-embeddings]-tokens.bin - Binary token IDs for programmatic comparison
These files can be compared against the original model's output to verify the converted model.
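As a rough sketch of such a comparison, the following Python script reads two binary output files and reports the largest absolute difference and the cosine similarity. It assumes that each .bin file is a flat array of native-endian 32-bit floats, which is not stated above and should be checked against debug.cpp. The helper names `read_f32` and `compare_logits` are made up for this example.

```python
import array
import math
import sys


def read_f32(path):
    """Read a file assumed to contain raw native-endian float32 values."""
    values = array.array("f")
    with open(path, "rb") as f:
        # Raises ValueError if the size is not a multiple of 4 bytes.
        values.frombytes(f.read())
    return values


def compare_logits(path_a, path_b):
    """Return (max absolute difference, cosine similarity) of two files."""
    a = read_f32(path_a)
    b = read_f32(path_b)
    if len(a) != len(b):
        raise ValueError(f"length mismatch: {len(a)} vs {len(b)}")
    max_abs_diff = max((abs(x - y) for x, y in zip(a, b)), default=0.0)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Cosine similarity is undefined when either vector is all zeros.
    cosine = dot / (norm_a * norm_b) if norm_a and norm_b else float("nan")
    return max_abs_diff, cosine


if __name__ == "__main__" and len(sys.argv) == 3:
    diff, cos = compare_logits(sys.argv[1], sys.argv[2])
    print(f"max abs diff: {diff:.6g}, cosine similarity: {cos:.6f}")
```

Saved under a name of your choice, the script takes the two .bin paths as arguments. What counts as "close enough" depends on the model and quantization, so no threshold is built in.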