llama.cpp/ggml
Alberto Cabrera Pérez cd8370b408
ggml-cpu: aarm64: q4_K repack gemm and gemv implementations (dotprod only) (#17494)
* Enabled q4_K_4x8 path

* Fixed generic Q4_K 8x4 implementation

* wip: dotprod gemm

* Working arm q4_K dotprod gemm

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Undo acc rename

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Q4_K arm dotprod gemm

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Fix: q4_qs reinterpret from uint to int

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Removed comments

* Fixed macro guards

* Fixed unused vars in generic implementation

* Fixed unused vars in 8x4 repack

* Fixed unused vars in generic implementation, unneeded comment

* Missing arch fallback for x86

* minor : style

---------

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2025-11-27 13:25:14 +02:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml : add ggml_top_k (#17365) 2025-11-25 15:31:43 +02:00
src ggml-cpu: aarm64: q4_K repack gemm and gemv implementations (dotprod only) (#17494) 2025-11-27 13:25:14 +02:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : remove dirty flag from version string (ggml/1391) 2025-11-24 15:26:31 +02:00