llama.cpp/examples/debug
Marcel Petrick 92f7da00b4
chore : correct typos [no ci] (#20041)
2026-03-05 08:50:21 +01:00
CMakeLists.txt examples : add debug utility/example (#18464) 2026-01-07 10:42:19 +01:00
README.md chore : correct typos [no ci] (#20041) 2026-03-05 08:50:21 +01:00
debug.cpp Restore clip's cb() to its rightful glory - extract common debugging elements in llama (#17914) 2026-01-14 20:29:35 +01:00

README.md

llama.cpp/examples/debug

This utility helps debug a model by registering a callback that logs GGML operations and tensor data. It can also save the generated logits or embeddings, along with the prompt and token IDs, for comparison with the original model.

Usage

llama-debug \
  --hf-repo ggml-org/models \
  --hf-file phi-2/ggml-model-q4_0.gguf \
  --model phi-2-q4_0.gguf \
  --prompt hello \
  --save-logits \
  --verbose

Tensor data is logged at debug level and requires the --verbose flag. The reason is that, although this output is useful, a model with many layers can produce a lot of it. You can filter the output by tensor name using the --tensor-filter option.

A recommended approach is to first run without --verbose and check whether the generated logits/embeddings are close to those of the original model. If they are not, you may need to inspect the model tensor by tensor; in that case, enable the --verbose flag together with --tensor-filter to focus on specific tensors.
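
What counts as "close" is not defined by this README. As a rough sketch, the first step of the approach above could be scripted as follows. The function name `compare_logits`, the metrics, and the tolerance `atol` are illustrative assumptions rather than part of llama-debug, and a suitable tolerance depends on the model and its quantization.

```python
import math

def compare_logits(reference, candidate, atol=1e-2):
    """Summarize how far two equal-length logit vectors are from each other.

    `atol` is an arbitrary illustrative tolerance, not a value from llama.cpp.
    """
    if len(reference) != len(candidate):
        raise ValueError("logit vectors differ in length")
    if not reference:
        raise ValueError("logit vectors are empty")

    diffs = [abs(r - c) for r, c in zip(reference, candidate)]
    max_abs_diff = max(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # The index of the largest logit is the token greedy decoding would pick.
    top_ref = max(range(len(reference)), key=reference.__getitem__)
    top_new = max(range(len(candidate)), key=candidate.__getitem__)

    return {
        "max_abs_diff": max_abs_diff,
        "rmse": rmse,
        "same_top_token": top_ref == top_new,
        "close": max_abs_diff <= atol,
    }

if __name__ == "__main__":
    ref = [0.10, 2.50, -1.00]   # made-up values for demonstration
    new = [0.11, 2.49, -1.02]
    print(compare_logits(ref, new, atol=0.05))
```

If `close` is false or the top token differs, that would be the point at which to rerun with --verbose and a --tensor-filter pattern.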

Options

This example supports all standard llama.cpp options and also accepts the following example-specific options:

$ llama-debug --help
...

----- example-specific params -----

--save-logits                           save final logits to files for verification (default: false)
--logits-output-dir PATH                directory for saving logits output files (default: data)
--tensor-filter REGEX                   filter tensor names for debug output (regex pattern, can be specified multiple times)
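
To preview which tensors a set of --tensor-filter patterns might select, the sketch below mimics the filtering in Python. It rests on assumptions not stated in the help text: that a tensor is logged when any one of the patterns matches, that a pattern may match anywhere in the name, and that no pattern means no filtering. The tensor names are hypothetical, and Python's `re` syntax may differ in details from the regex flavor llama-debug uses.

```python
import re

def select_tensors(tensor_names, patterns):
    """Return names matched by at least one pattern (all names if no patterns)."""
    if not patterns:
        return list(tensor_names)
    compiled = [re.compile(p) for p in patterns]
    # Assumption: partial matches count, and patterns are combined with OR.
    return [n for n in tensor_names if any(c.search(n) for c in compiled)]

if __name__ == "__main__":
    # Hypothetical tensor names, for illustration only.
    names = ["attn_norm-0", "Qcur-0", "Kcur-0", "ffn_out-0", "Qcur-1", "result_norm"]
    print(select_tensors(names, [r"^Qcur-", r"norm"]))
    # prints ['attn_norm-0', 'Qcur-0', 'Qcur-1', 'result_norm']
```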

Output Files

When --save-logits is enabled, the following files are created in the output directory:

  • llamacpp-<model>[-embeddings].bin - Binary output (logits or embeddings)
  • llamacpp-<model>[-embeddings].txt - Text output (logits or embeddings, one per line)
  • llamacpp-<model>[-embeddings]-prompt.txt - Prompt text and token IDs
  • llamacpp-<model>[-embeddings]-tokens.bin - Binary token IDs for programmatic comparison

These files can be compared against the original model's output to verify the converted model.
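
A comparison of this kind could look like the sketch below. The binary layout is an assumption, since this README does not specify it: logits are taken to be raw little-endian 32-bit floats and token IDs raw little-endian 32-bit integers, with no header. The helper names `read_values` and `compare_runs` are illustrative, and the reference files are assumed to come from a separate run of the original model that writes the same layout.

```python
import struct

def read_values(path, fmt_char):
    """Read a headerless file of little-endian 4-byte values.

    fmt_char is 'f' for float32 or 'i' for int32 (assumed layout).
    """
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % 4 != 0:
        raise ValueError(f"{path}: size {len(data)} is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(data) // 4}{fmt_char}", data))

def compare_runs(logits_a, tokens_a, logits_b, tokens_b):
    """Compare two runs given their logits and token files.

    Returns (tokens_match, max_abs_diff). max_abs_diff is None when the
    token IDs differ, because the logits then come from different inputs.
    """
    if read_values(tokens_a, "i") != read_values(tokens_b, "i"):
        return False, None
    a = read_values(logits_a, "f")
    b = read_values(logits_b, "f")
    if len(a) != len(b):
        raise ValueError("logit files differ in length")
    return True, max((abs(x - y) for x, y in zip(a, b)), default=0.0)
```

Checking the token IDs first separates tokenizer mismatches from numerical differences in the model itself. A call would pass the llamacpp-<model>.bin and llamacpp-<model>-tokens.bin paths from the output directory together with the corresponding reference files.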