llama.cpp


Progeny Alpha d2fabedf09	2026-03-13 23:34:59 -04:00

vulkan: fix chunked inter kernel state layout for PR #20443

PR #20443 removed redundant state transposes from the graph and updated the autoregressive shader to use col*S_V+i (coalesced) instead of i*S_V+col (strided). The chunked inter kernel was not updated, causing uncoalesced state reads and a ~8% PP regression.

Fix state_in load and final_out write to match the new layout. h_snapshots (h_out/h_in) are internal scratch and keep their existing layout since inter and output kernels agree.

PP-512: 202 → 218 t/s. 16/16 tests pass.
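
The layout change described in the commit message can be illustrated with a small index sketch. It reads the two expressions from the message as col*S_V+i (coalesced) and i*S_V+col (strided), and it assumes that i is the index that varies across adjacent GPU threads. The function names are hypothetical and do not come from the llama.cpp sources; this is not the shader code itself.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative sketch only. Hypothetical helper names; the two index
// expressions are the ones quoted in the commit message.
// Assumption: `i` varies across adjacent threads, `col` is fixed per
// access, and S_V is the state dimension.

// Old layout: adjacent values of i land S_V elements apart (strided).
constexpr std::size_t strided_index(std::size_t i, std::size_t col, std::size_t S_V) {
    return i * S_V + col;
}

// New layout: adjacent values of i land on adjacent elements (coalesced).
constexpr std::size_t coalesced_index(std::size_t i, std::size_t col, std::size_t S_V) {
    return col * S_V + i;
}

// Distance in elements between the addresses touched by threads i and i + 1.
constexpr std::size_t strided_step(std::size_t col, std::size_t S_V) {
    return strided_index(1, col, S_V) - strided_index(0, col, S_V);
}

constexpr std::size_t coalesced_step(std::size_t col, std::size_t S_V) {
    return coalesced_index(1, col, S_V) - coalesced_index(0, col, S_V);
}

// With S_V = 64, neighbouring threads are 64 elements apart in the old
// layout and 1 element apart in the new one.
static_assert(strided_step(0, 64) == 64, "strided layout steps by S_V");
static_assert(coalesced_step(0, 64) == 1, "coalesced layout steps by 1");
```

Under that assumption, a kernel that still reads with the strided expression after the data has been stored in the coalesced layout would touch memory S_V elements apart per thread, which is consistent with the uncoalesced reads the commit message reports for the chunked inter kernel.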
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	llama : enable chunked fused GDN path (#20340)	2026-03-11 22:46:40 +02:00
src	vulkan: fix chunked inter kernel state layout for PR #20443	2026-03-13 23:34:59 -04:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : fix typo gmml (#20512)	2026-03-13 14:36:13 +01:00