llama.cpp

Jeff Bolz 879d673759 vulkan: Implement top-k (#17418)	2025-11-26 16:45:43 +01:00

* vulkan: Implement top-k

  Each pass launches workgroups that each sort 2^N elements (where N is usually 7-10) and discards all but the top K. Repeat until only K are left. And there's a fast path when K==1 to just find the max value rather than sorting.

* fix pipeline selection
* vulkan: Add N-ary search algorithm for topk
* microoptimizations
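
The multi-pass reduction described in the commit message can be illustrated on the CPU. The following is only a minimal sketch, not the Vulkan shader code: the function name `top_k_multipass` and the parameter `block_log2` are hypothetical, each loop iteration over a block stands in for one workgroup, `std::partial_sort` is swapped in for whatever sort the shader uses, and the function returns values rather than indices to keep it short. It also assumes K is smaller than the block size so that every pass shrinks the data.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical CPU illustration of the multi-pass top-k idea.
// Each block of 2^block_log2 elements plays the role of one workgroup.
std::vector<float> top_k_multipass(std::vector<float> data, std::size_t k,
                                   std::size_t block_log2 = 8) {
    const std::size_t block = std::size_t{1} << block_log2;
    if (k == 0 || data.empty()) {
        return {};
    }
    // Assumption of this sketch: K < block size, so each pass makes progress.
    if (k >= block) {
        throw std::invalid_argument("k must be smaller than the block size");
    }
    // Fast path: for K == 1 just find the maximum instead of sorting.
    if (k == 1) {
        return { *std::max_element(data.begin(), data.end()) };
    }
    // Repeat passes until only K candidates are left.
    while (data.size() > k) {
        std::vector<float> next;
        for (std::size_t begin = 0; begin < data.size(); begin += block) {
            const std::size_t end  = std::min(begin + block, data.size());
            const std::size_t keep = std::min(k, end - begin);
            auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            auto mid   = first + static_cast<std::ptrdiff_t>(keep);
            auto last  = data.begin() + static_cast<std::ptrdiff_t>(end);
            // Order the largest `keep` elements of this block first,
            // then discard the rest of the block.
            std::partial_sort(first, mid, last, std::greater<float>());
            next.insert(next.end(), first, mid);
        }
        data = std::move(next);
    }
    // Return the survivors in descending order.
    std::sort(data.begin(), data.end(), std::greater<float>());
    return data;
}
```

Calling `top_k_multipass(values, 3, 7)` would process blocks of 128 elements per pass. Because the first block of any pass holds more than K elements whenever more than K candidates remain, the candidate count strictly decreases and the loop terminates.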
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094)	2025-08-07 13:45:41 +02:00
include	ggml : add ggml_top_k (#17365)	2025-11-25 15:31:43 +02:00
src	vulkan: Implement top-k (#17418)	2025-11-26 16:45:43 +01:00
.gitignore	vulkan : cmake integration (#8119)	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml : remove dirty flag from version string (ggml/1391)	2025-11-24 15:26:31 +02:00