llama.cpp/ggml
Tim Burke b8e8d291d1 ggml: refactor x86 AVX2 and ARM NEON MXFP dequant — shared traits and helpers
Add mxfp_dequant_traits_t to ggml-common.h as single source of truth for
MXFP IEEE-754 reconstruction parameters. Define static const instances for
all 4 formats (E4M3, E5M2, E2M3, E3M2), ready for CUDA/Metal/Vulkan reuse.

Extract shared dequant and FP6 unpack helpers on both architectures,
replacing duplicated inline code and macros. Net -215 lines.
2026-03-15 21:37:02 -04:00
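The commit message above describes a per-format traits table used to rebuild IEEE-754 values from MXFP elements. A minimal scalar sketch of that idea follows; it is not the code from this commit. The struct `mxfp_traits_sketch_t`, its fields, the `SKETCH_*` instances, and `mxfp_sketch_to_float` are hypothetical names chosen for illustration, and the actual `mxfp_dequant_traits_t` in ggml-common.h may hold different parameters. The exponent/mantissa widths and biases are the standard ones for these four element formats. The sketch ignores NaN/Inf encodings and does not apply the shared per-block scale.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical traits layout (an assumption, not the real mxfp_dequant_traits_t). */
typedef struct {
    int exp_bits;  /* width of the exponent field */
    int man_bits;  /* width of the mantissa field */
    int bias;      /* exponent bias */
} mxfp_traits_sketch_t;

/* One static const instance per element format. */
static const mxfp_traits_sketch_t SKETCH_E4M3 = { 4, 3,  7 };
static const mxfp_traits_sketch_t SKETCH_E5M2 = { 5, 2, 15 };
static const mxfp_traits_sketch_t SKETCH_E2M3 = { 2, 3,  1 };
static const mxfp_traits_sketch_t SKETCH_E3M2 = { 3, 2,  3 };

/* Exact 2^e for the small exponents that occur here (avoids needing libm). */
static float sketch_pow2(int e) {
    float r = 1.0f;
    while (e > 0) { r *= 2.0f; e--; }
    while (e < 0) { r *= 0.5f; e++; }
    return r;
}

/* Decode one element (stored in the low bits of a byte) to float.
 * Special values (NaN/Inf) and the block scale are deliberately not handled. */
static float mxfp_sketch_to_float(uint8_t bits, const mxfp_traits_sketch_t *t) {
    assert(1 + t->exp_bits + t->man_bits <= 8);

    const int man  = bits & ((1 << t->man_bits) - 1);
    const int exp  = (bits >> t->man_bits) & ((1 << t->exp_bits) - 1);
    const int sign = (bits >> (t->man_bits + t->exp_bits)) & 1;

    float mag;
    if (exp == 0) {
        /* Subnormal: no implicit leading one, exponent fixed at 1 - bias. */
        mag = (float) man * sketch_pow2(1 - t->bias - t->man_bits);
    } else {
        /* Normal: restore the implicit leading one, then scale. */
        mag = (float) (man | (1 << t->man_bits))
            * sketch_pow2(exp - t->bias - t->man_bits);
    }
    return sign ? -mag : mag;
}
```

With one decoder parameterized this way, a SIMD or GPU backend would only need to read the same three numbers per format rather than carry its own copy of the constants, which appears to be the motivation the commit message gives for a single source of truth.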
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml: MXFP flash attention with SoA layout (CPU scalar reference) 2026-03-15 17:33:19 -04:00
src ggml: refactor x86 AVX2 and ARM NEON MXFP dequant — shared traits and helpers 2026-03-15 21:37:02 -04:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : add OpenVINO backend (#15307) 2026-03-14 07:56:55 +02:00