llama.cpp/ggml/src/ggml-cpu/arch
Alberto Cabrera Pérez 091a46cb8d
ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm) (#18860)
* Boilerplate for q5_Kx8 REPACK on ARM and fallback

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>

* Implements make_block_q5_Kx8 by extending make_block_q4_Kx8

* q5_K repack gemm and gemv generics

* Gemm and Gemv ARM implementations (i8mm)

* Improved qh manipulation looking at non-repack vec_dot implementation

* Full unroll

* Apply Q5_K Gemv vand and vshl optimizations to gemm. Improve comments.

* Fix wrong fallback definitions of Q5_K

* Fixed comments. Reverted unnecessary formatting

* Fixed typo in generic definitions

* Switching AND + Shift with Shift Insert. Better op interleaving.

* Vectorize + unroll the block scales

* Apply gemm optimizations to gemv

* Improve bias calculation

---------

Signed-off-by: Alberto Cabrera <alberto.cabrera@liquid.ai>
2026-01-23 09:55:08 +02:00
Directory | Last commit | Date
arm | ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm) (#18860) | 2026-01-23 09:55:08 +02:00
loongarch | ggml : LoongArch fixes (#16958) | 2025-11-03 08:40:02 +02:00
powerpc | ggml-cpu: add mxfp4 VSX intrinsics for Power9+ (ppc64le) hardware (#15385) | 2025-08-19 11:54:31 +03:00
riscv | ggml: replace hwcap with riscv_hwprobe for RVV detection (#17567) | 2025-11-29 14:56:31 +02:00
s390 | ggml: add s390x cpu-feats (#16774) | 2025-11-02 08:48:23 +08:00
wasm | ggml-cpu : deduplicate scalar implementations (#14897) | 2025-07-28 17:40:24 +02:00
x86 | ggml : add missing AVX512 feature checks (#17270) | 2025-11-17 12:12:00 +01:00