llama.cpp/ggml
Progeny Alpha 088cb0cbe8 vulkan: harden chunked GDN dispatch and fix minor issues
- Raise GDN_CHUNK_THRESHOLD from 2 to CHUNK_SIZE (64). The chunked path
  now activates only when there is at least one full chunk. Below that,
  the autoregressive path is faster and the overhead of the three
  dispatches isn't justified.
- Add maxStorageBufferRange guard on scratch allocation. Falls back to
  autoregressive if the scratch buffers would exceed device limits.
- Fix inaccurate shared memory stride comment in cm1 output kernel.
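
The path selection described in the first two items can be sketched as below. This is only an illustration under stated assumptions, not the backend's actual code: `GdnPath`, `choose_gdn_path`, and its parameters are hypothetical names, and the exact comparisons (`<` against the threshold, `>` against the device limit) are inferred from the wording of the commit message. Only `CHUNK_SIZE`, `GDN_CHUNK_THRESHOLD`, and `maxStorageBufferRange` come from the message itself.

```cpp
#include <cstdint>

// Values taken from the commit message: the threshold now equals one full chunk.
constexpr uint32_t CHUNK_SIZE          = 64;
constexpr uint32_t GDN_CHUNK_THRESHOLD = CHUNK_SIZE;

// Hypothetical enum naming the two GDN paths mentioned in the commit.
enum class GdnPath { Autoregressive, Chunked };

// Hypothetical helper: decide which path to use.
//   n_tokens                 - number of tokens in the batch
//   scratch_bytes            - size of the scratch buffer the chunked path would need
//   max_storage_buffer_range - device limit (maxStorageBufferRange)
GdnPath choose_gdn_path(uint32_t n_tokens,
                        uint64_t scratch_bytes,
                        uint64_t max_storage_buffer_range) {
    // Fewer tokens than one full chunk: the extra dispatches are not worth it.
    if (n_tokens < GDN_CHUNK_THRESHOLD) {
        return GdnPath::Autoregressive;
    }
    // Scratch buffer would exceed the device limit: fall back.
    if (scratch_bytes > max_storage_buffer_range) {
        return GdnPath::Autoregressive;
    }
    return GdnPath::Chunked;
}
```

With these assumptions, a 63-token batch stays on the autoregressive path, a 64-token batch takes the chunked path, and any batch whose scratch requirement exceeds the device limit falls back regardless of length.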

16/16 tests pass.
2026-03-15 00:38:44 -04:00
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include llama : enable chunked fused GDN path (#20340) 2026-03-11 22:46:40 +02:00
src vulkan: harden chunked GDN dispatch and fix minor issues 2026-03-15 00:38:44 -04:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : fix typo gmml (#20512) 2026-03-13 14:36:13 +01:00