Update README.md
More information is available in <https://github.com/ggml-org/llama.cpp/pull/486
```
-m model.gguf -f some-text.txt [-o imatrix.gguf] [--output-format {gguf,dat}] [--no-ppl] \
    [--process-output] [--chunks 123] [--save-frequency 0] [--output-frequency 10] \
    [--in-file imatrix-prev-0.gguf --in-file imatrix-prev-1.gguf ...] [--parse-special] \
    [--activation-statistics] [--show-statistics] [...]
```
Here `-m | --model` with a model name and `-f | --file` with a file containing calibration data (such as `wiki.train.raw`) are mandatory.
The parameters in square brackets are optional and have the following meaning:
* `--chunks` maximum number of chunks to process. Default is `-1` for all available chunks.
* `--no-ppl` disables the calculation of perplexity for the processed chunks. Useful if you want to speed up processing and do not care about perplexity.
* `--show-statistics` displays the imatrix file's statistics.
* `--activation-statistics` enables the collection of activation statistics for each tensor. If set, the imatrix file size will double, but the reported statistics will be more accurate.

For faster computation, make sure to use GPU offloading via the `-ngl | --n-gpu-layers` argument.
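Multiple `--in-file` arguments, as shown in the synopsis above, combine earlier imatrix runs into one file. Conceptually the merge is a count-weighted average of the per-value statistics; the sketch below uses a hypothetical `merge` helper to illustrate the idea and is not llama.cpp's actual implementation:

```python
# Hypothetical sketch: combining two imatrix datasets by averaging their
# per-value mean squared activations, weighted by how many chunks produced each.
def merge(means_a, count_a, means_b, count_b):
    total = count_a + count_b
    return [(a * count_a + b * count_b) / total
            for a, b in zip(means_a, means_b)]

merged = merge([1.0, 2.0], 10, [3.0, 4.0], 30)  # -> [2.5, 3.5]
```

With this weighting, a run that processed 30 chunks contributes three times as much to the merged statistics as one that processed 10.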
Versions **b5942** and newer of `llama-imatrix` store data in GGUF format by default.
```bash
./llama-imatrix -m ggml-model-f16.gguf -f calibration-data.txt --chunks 5 --output-frequency 20 --save-frequency 50 --parse-special
```
```bash
# generate imatrix and enable activation-based statistics
./llama-imatrix -m ggml-model-f16.gguf -f calibration-data.txt --activation-statistics -ngl 99
```

```bash
# analyse imatrix file and display summary statistics instead of running inference
./llama-imatrix --in-file imatrix.gguf --show-statistics
```
## Statistics
Beginning with version <bwxyz>, `--show-statistics` has two modes. If `--activation-statistics` was used at imatrix creation time and `--output-format` was set to `gguf`, it reports precise statistics. Otherwise, it reports less accurate, albeit still useful, metrics based on average squared activations.
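To illustrate the gap between the two modes with a toy example (not llama.cpp code): from average squared activations alone, only magnitude-based metrics can be derived, while direct activation statistics also preserve sign-dependent values such as the mean, minimum, and maximum:

```python
activations = [0.5, -1.5, 2.0, -0.5]  # toy per-channel activations

# Legacy mode: only the mean of squared activations survives,
# so the signs of the activations are lost.
mean_sq = sum(a * a for a in activations) / len(activations)  # 1.6875

# --activation-statistics with GGUF output: direct statistics remain available.
direct = {
    "mean": sum(activations) / len(activations),  # 0.125
    "min": min(activations),                      # -1.5
    "max": max(activations),                      # 2.0
}
```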
#### Per tensor