llama.cpp

Max Krasnyansky 609ea50026 hexagon: Q4_0 and MXFP4 repack fixes (#20527)	2026-03-14 11:09:08 -07:00

* hexagon: fix tail corruption with row sizes that are not a multiple of 256
* hexagon: use different stride for repacking partial blocks
* hex-mm: update repack and kernels to avoid shuffles for full 256-element blocks

  The previous commit changed the repacking to use even:odd (0:1, 2:3, ...) packing instead of the original (0:128, 1:129, ...) packing in order to fix tail corruption. Since the mm kernels already deal with partial tails, we can use even:odd packing only for the last block. This avoids the performance penalty of having to shuffle to zip the elements in the common case.

* hex-mm: update rmpy x8 for better optimizations
* hex-mm: tighten supported MUL_MAT checks to avoid spurious failures
* hex-mm: use vzero to init accumulators
* hex-mm: properly call partial rmpy_x8
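
The two pairing schemes named in the commit message can be illustrated with a small sketch. This is not the Hexagon backend code: the function names, the choice of low versus high nibble, and the flat list-of-ints representation are all assumptions made for illustration. Only the 256-element block size, the (0:128, 1:129, ...) and (0:1, 2:3, ...) pairings, and the rule "even:odd only for the last partial block" come from the commit message.

```python
BLOCK = 256  # elements per full repack block, as stated in the commit message


def pack_split_halves(q):
    """Pair element i with element i + len/2, i.e. (0:128, 1:129, ...) for a full block.

    Assumption: the first element of each pair goes in the low nibble.
    """
    half = len(q) // 2
    return [(q[i] & 0xF) | ((q[i + half] & 0xF) << 4) for i in range(half)]


def pack_even_odd(q):
    """Pair element 2i with element 2i + 1, i.e. (0:1, 2:3, ...).

    Assumption: the even element goes in the low nibble; len(q) is even.
    """
    return [(q[i] & 0xF) | ((q[i + 1] & 0xF) << 4) for i in range(0, len(q), 2)]


def repack_row(q):
    """Hypothetical row repack: split-halves for full blocks, even:odd for the tail."""
    out = []
    for start in range(0, len(q), BLOCK):
        blk = q[start:start + BLOCK]
        if len(blk) == BLOCK:
            out += pack_split_halves(blk)  # common case, no zip/shuffle needed
        else:
            out += pack_even_odd(blk)      # partial last block only
    return out


if __name__ == "__main__":
    row = [i % 16 for i in range(BLOCK + 64)]  # one full block plus a 64-element tail
    packed = repack_row(row)
    print(len(packed))  # two 4-bit values per byte
```

In this sketch, a 64-element tail packed even:odd occupies 32 contiguous bytes, whereas pairing across two halves of a 256-element block would need the missing elements padded out. Whether that is the exact cause of the tail corruption the commit fixes is not stated in the source.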
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	ggml : add OpenVINO backend (#15307)	2026-03-14 07:56:55 +02:00
src	hexagon: Q4_0 and MXFP4 repack fixes (#20527)	2026-03-14 11:09:08 -07:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : add OpenVINO backend (#15307)	2026-03-14 07:56:55 +02:00