llama.cpp

Jeff Bolz 1384abf8b8 vulkan: handle mat_mul with A matrix > 4GB (#16176)	2025-09-27 20:36:34 -05:00

* vulkan: handle mat_mul with A matrix > 4GB

This change splits mat_mul operations with a huge A matrix into chunks in the M dimension. This works well for stable-diffusion use cases where the im2col matrix has a very large M.

Fix the order of setting the stride in mul_mm_cm2: setting the dimension clobbers the stride, so the stride should be set afterwards.

* build fixes

cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	llama: print memory breakdown on exit (#15860)	2025-09-24 16:53:48 +02:00
src	vulkan: handle mat_mul with A matrix > 4GB (#16176)	2025-09-27 20:36:34 -05:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	common : use cpp-httplib as a cURL alternative for downloads (#16185)	2025-09-26 14:12:19 +03:00