llama.cpp/ggml
Gaurav Garg d1fd632ab8 Use the new kernel only for nblocks_stream_k_raw > 4 * ntiles_dst to make sure we have enough concurrency on GPUs 2026-03-30 12:22:43 +05:30
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include llama: fix llama-model-saver (#20503) 2026-03-25 12:53:16 +02:00
src Use the new kernel only for nblocks_stream_k_raw > 4 * ntiles_dst to make sure we have enough concurrency on GPUs 2026-03-30 12:22:43 +05:30
.gitignore
CMakeLists.txt ggml : bump version to 0.9.8 (ggml/1442) 2026-03-18 15:17:28 +02:00