llama.cpp

mirror of https://github.com/ggerganov/llama.cpp.git

Gaurav Garg d1fd632ab8 Use the new kernel only for nblocks_stream_k_raw > 4 * ntiles_dst to make sure we have enough concurrency on GPUs		2026-03-30 12:22:43 +05:30
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	llama: fix llama-model-saver (#20503)	2026-03-25 12:53:16 +02:00
src	Use the new kernel only for nblocks_stream_k_raw > 4 * ntiles_dst to make sure we have enough concurrency on GPUs	2026-03-30 12:22:43 +05:30
.gitignore	…
CMakeLists.txt	ggml : bump version to 0.9.8 (ggml/1442)	2026-03-18 15:17:28 +02:00