llama.cpp

mkoker edd4d9bca5 vulkan: add FA dequant for q4_1, q5_0, q5_1, iq4_nl (#21029) Add dequantize4() implementations for Q4_1, Q5_0, Q5_1, and IQ4_NL in the flash attention base shader. Register them in the shader generator and in pipeline creation, and enable them in the scalar/coopmat1 FA support check.	2026-04-07 13:41:29 +02:00
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	ggml: add Q1_0 1-bit quantization support (CPU) (#21273)	2026-04-06 20:55:21 +02:00
src	vulkan: add FA dequant for q4_1, q5_0, q5_1, iq4_nl (#21029)	2026-04-07 13:41:29 +02:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : bump version to 0.9.11 (ggml/1456)	2026-04-02 10:39:00 +03:00