llama.cpp/examples
Daniel Bevenius 62cef26ac5
model-conversion : add qat-q4 quantization targets (#15588)
This commit adds two Makefile targets for quantizing Quantization
Aware Trained (QAT) models to the Q4_0 format.

The motivation is that these targets set the token embedding and
output tensor data types to Q8_0 instead of the default Q6_K. This is
something we wish to enforce for QAT Q4_0 models that are to be
uploaded to ggml-org on Hugging Face, to guarantee the best quality.
2025-08-26 16:12:29 +02:00
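The Makefile targets themselves are not reproduced in this listing. As a rough sketch, the kind of `llama-quantize` invocation such a target would wrap might look like the following; the model file names are illustrative assumptions, while `--token-embedding-type` and `--output-tensor-type` are existing `llama-quantize` options for overriding the per-tensor quantization types:

```shell
# Hypothetical input/output paths, not taken from the commit itself.
MODEL_F16=model-f16.gguf
MODEL_Q4=model-qat-q4_0.gguf

# Force Q8_0 for the token embedding and output tensors (instead of
# the default Q6_K) while quantizing the rest of the model to Q4_0.
CMD="llama-quantize \
  --token-embedding-type q8_0 \
  --output-tensor-type q8_0 \
  $MODEL_F16 $MODEL_Q4 Q4_0"

# Printed rather than executed here, since running it requires a
# built llama.cpp and a converted GGUF model.
echo "$CMD"
```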
batched
batched.swift examples : remove references to `make` in examples [no ci] (#15457) 2025-08-21 06:12:28 +02:00
convert-llama2c-to-ggml
deprecation-warning
diffusion
embedding
eval-callback
gen-docs
gguf
gguf-hash
gritlm
jeopardy
llama.android
llama.swiftui
lookahead lookahead : add sample command to readme (#15447) 2025-08-20 13:30:46 +03:00
lookup
model-conversion model-conversion : add qat-q4 quantization targets (#15588) 2025-08-26 16:12:29 +02:00
parallel
passkey examples : remove references to `make` in examples [no ci] (#15457) 2025-08-21 06:12:28 +02:00
retrieval examples : remove references to `make` in examples [no ci] (#15457) 2025-08-21 06:12:28 +02:00
save-load-state
simple
simple-chat
simple-cmake-pkg
speculative
speculative-simple
sycl examples : remove references to `make` in examples [no ci] (#15457) 2025-08-21 06:12:28 +02:00
training
CMakeLists.txt examples : add model conversion tool/example (#15455) 2025-08-21 12:16:54 +02:00
Miku.sh
chat-13B.bat
chat-13B.sh
chat-persistent.sh
chat-vicuna.sh
chat.sh
convert_legacy_llama.py
json_schema_pydantic_example.py
json_schema_to_grammar.py
llama.vim llama : remove KV cache defragmentation logic (#15473) 2025-08-22 12:22:13 +03:00
llm.vim
pydantic_models_to_grammar.py
pydantic_models_to_grammar_examples.py
reason-act.sh
regex_to_grammar.py
server-llama2-13B.sh
server_embd.py
ts-type-to-grammar.sh