llama.cpp/ggml
Tim Burke a51ff77fae ggml: address PR review — fix buffer overflows, add assertions, normalize MXFP6 naming
Fix potential buffer overflows flagged in PR #20609 review:
- set_rows: replace fixed float tmp[1024] with std::vector for large n_embd_k_gqa
- tiled FA: size q_mxfp_buf with ggml_row_size guard instead of fixed 1024
- one_chunk FA: pre-allocate k/v dequant buffers from mxfp.{k,v}_soa_elems
  instead of hard-coded float[4096] stack arrays
- kv-cache: assert n_embd_k_gqa % qk == 0 before integer division
- test init: assert soa_bytes % block_size == 0

Normalize MXFP6 function naming to match MXFP8 convention (short form
without element format suffix): mxfp6_e2m3 → mxfp6 in all function
identifiers across 14 files. Format-specific items (type enums, traits,
lookup tables, constants) retain their _e2m3 suffix.
2026-03-15 18:57:50 -04:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml: MXFP flash attention with SoA layout (CPU scalar reference) 2026-03-15 17:33:19 -04:00
src ggml: address PR review — fix buffer overflows, add assertions, normalize MXFP6 naming 2026-03-15 18:57:50 -04:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : add OpenVINO backend (#15307) 2026-03-14 07:56:55 +02:00