llama.cpp/ggml
Jeff Bolz 1384abf8b8
vulkan: handle mat_mul with A matrix > 4GB (#16176)
* vulkan: handle mat_mul with A matrix > 4GB

This change splits mat_mul operations with huge A matrix into chunks in the M
dimension. This works well for stable-diffusion use cases where the im2col
matrix has very large M.

Fix the order of setting the stride in mul_mm_cm2 - setting the dimension
clobbers the stride, so stride should be set after.

* build fixes
2025-09-27 20:36:34 -05:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include llama: print memory breakdown on exit (#15860) 2025-09-24 16:53:48 +02:00
src vulkan: handle mat_mul with A matrix > 4GB (#16176) 2025-09-27 20:36:34 -05:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt common : use cpp-httplib as a cURL alternative for downloads (#16185) 2025-09-26 14:12:19 +03:00