llama.cpp/ggml
Srihari-mcw baad94885d
ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373)
* Initial Q2_K Block Interleaving Implementation

* Addressed review comments and clean up of the code

* Post rebase fixes

* Initial CI/CD fixes

* Update declarations in arch-fallback.h

* Changes for GEMV Q2_K in arch-fallback.h

* Enable repacking only on AVX-512 machines

* Update comments in repack.cpp

* Address q2k comments

---------

Co-authored-by: Manogna-Sree <elisetti.manognasree@multicorewareinc.com>
2025-08-01 09:20:33 +03:00
cmake cmake : Fix BLAS link interface (ggml/1316) 2025-07-30 17:33:11 +03:00
include ggml: Add initial WebGPU backend (#14521) 2025-07-16 18:18:51 +03:00
src ggml : Q2k interleaving implementation - x86/x64 SIMD (#14373) 2025-08-01 09:20:33 +03:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt HIP: add GGML_HIP_MMQ_MFMA option to allow disabling the MFMA path. (#14930) 2025-07-29 17:44:30 +02:00