llama.cpp/examples/model-conversion/scripts/utils
Daniel Bevenius 62cef26ac5
model-conversion : add qat-q4 quantization targets (#15588)
This commit adds two targets to the Makefile for quantizing
Quantization Aware Trained (QAT) models to Q4_0 format.

The motivation is that these targets set the data types of the token
embedding and output tensors to Q8_0 instead of the default Q6_K. This is
something that we wish to enforce for QAT Q4_0 models that are to be
uploaded to ggml-org on Hugging Face, to guarantee the best quality.
2025-08-26 16:12:29 +02:00
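As a rough illustration of what the commit message describes, the sketch below assembles a quantization command that requests Q4_0 while pinning the token embedding and output tensors to Q8_0. It is not the content of quantize.sh or the Makefile targets. It assumes the `llama-quantize` options `--token-embedding-type` and `--output-tensor-type`, and the binary location `build/bin/llama-quantize`, the function name `build_qat_q4_command`, and the file names are hypothetical.

```python
import shlex
from typing import List


def build_qat_q4_command(
    input_gguf: str,
    output_gguf: str,
    quantize_bin: str = "build/bin/llama-quantize",  # assumed build location
) -> List[str]:
    """Return an argv list for quantizing a QAT model to Q4_0.

    The token embedding and output tensors are requested as Q8_0,
    overriding the Q6_K default mentioned in the commit message.
    """
    return [
        quantize_bin,
        "--token-embedding-type", "Q8_0",  # assumed llama-quantize option
        "--output-tensor-type", "Q8_0",    # assumed llama-quantize option
        input_gguf,
        output_gguf,
        "Q4_0",
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_qat_q4_command("model-f16.gguf", "model-Q4_0.gguf")
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps the command safe to hand to `subprocess.run` without shell quoting concerns.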
check-nmse.py examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
create-collection-add-model.sh examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
hf-add-model-to-collection.py examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
hf-create-collection.py examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
hf-create-model.py model-conversion : add model card template for embeddings [no ci] (#15557) 2025-08-25 14:25:25 +02:00
hf-upload-gguf-model.py examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
inspect-converted-model.sh examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
inspect-org-model.py examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
perplexity-gen.sh examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
perplexity-run-simple.sh examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
perplexity-run.sh examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
quantize.sh model-conversion : add qat-q4 quantization targets (#15588) 2025-08-26 16:12:29 +02:00
run-embedding-server.sh examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
semantic_check.py examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00