llama.cpp

Tim Burke b8e8d291d1	ggml: refactor x86 AVX2 and ARM NEON MXFP dequant — shared traits and helpers	2026-03-15 21:37:02 -04:00
Add mxfp_dequant_traits_t to ggml-common.h as single source of truth for MXFP IEEE-754 reconstruction parameters. Define static const instances for all 4 formats (E4M3, E5M2, E2M3, E3M2), ready for CUDA/Metal/Vulkan reuse. Extract shared dequant and FP6 unpack helpers on both architectures, replacing duplicated inline code and macros. Net -215 lines.
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	ggml: MXFP flash attention with SoA layout (CPU scalar reference)	2026-03-15 17:33:19 -04:00
src	ggml: refactor x86 AVX2 and ARM NEON MXFP dequant — shared traits and helpers	2026-03-15 21:37:02 -04:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : add OpenVINO backend (#15307)	2026-03-14 07:56:55 +02:00