llama.cpp

Jeff Bolz 66d7c14359 vulkan: split mul_mat into multiple dispatches to avoid overflow	2026-02-11 10:21:41 -06:00

The batch dimensions can be greater than the max workgroup count limit, in which case we need to split into multiple dispatches and pass the base index through a push constant. Fall back for the less common p021 and nc variants.

cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	ggml-virtgpu: make the code thread safe (#19204)	2026-02-04 10:46:18 +08:00
src	vulkan: split mul_mat into multiple dispatches to avoid overflow	2026-02-11 10:21:41 -06:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	Bump cmake max version (needed for Windows on Snapdragon builds) (#19188)	2026-02-01 14:13:38 -08:00
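
The latest commit message describes splitting a batched mul_mat into several dispatches when the batch dimension exceeds the device's maximum workgroup count, with each dispatch receiving a base index. The following is a minimal sketch of that chunking idea only, not the code from the commit: `DispatchRange` and `split_batches` are hypothetical names introduced here, and the actual Vulkan dispatch and push-constant plumbing is left out.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper type (not from the ggml sources): one dispatch that
// covers `count` batch indices starting at `base`.
struct DispatchRange {
    uint32_t base;   // offset the shader would add to its workgroup ID,
                     // e.g. delivered through a push constant
    uint32_t count;  // number of workgroups along the batch dimension
};

// Split `total` batch indices into ranges of at most `max_count` each.
// `max_count` stands in for the device's workgroup count limit in one
// dimension (assumption: the caller queries it from the device).
std::vector<DispatchRange> split_batches(uint32_t total, uint32_t max_count) {
    std::vector<DispatchRange> ranges;
    if (max_count == 0) {
        return ranges;  // nothing can be dispatched with a zero limit
    }
    uint32_t base = 0;
    while (base < total) {
        uint32_t remaining = total - base;
        uint32_t count = remaining < max_count ? remaining : max_count;
        ranges.push_back({base, count});
        base += count;  // never exceeds `total`, so this cannot overflow
    }
    return ranges;
}
```

In this sketch a caller would issue one dispatch per returned range and pass `base` to the shader, so that the shader's effective batch index is `base` plus its workgroup ID. When `total` fits within the limit, a single range is returned and the behavior matches an unsplit dispatch.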