llama.cpp/ggml
Jeff Bolz 879d673759
vulkan: Implement top-k (#17418)
* vulkan: Implement top-k

Each pass launches workgroups that each sort 2^N elements (where N is usually 7-10)
and discards all but the top K. Repeat until only K are left. And there's a fast
path when K==1 to just find the max value rather than sorting.

* fix pipeline selection

* vulkan: Add N-ary search algorithm for topk

* microoptimizations
2025-11-26 16:45:43 +01:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml : add ggml_top_k (#17365) 2025-11-25 15:31:43 +02:00
src vulkan: Implement top-k (#17418) 2025-11-26 16:45:43 +01:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt ggml : remove dirty flag from version string (ggml/1391) 2025-11-24 15:26:31 +02:00