llama.cpp/ggml
Jeff Bolz e262bf9798 vulkan/cuda: fix topk_moe with exp_probs_b
I updated test_topk_moe to more closely match llm_graph_context::build_moe_ffn
and added coverage for exp_probs_b and some other missing combinations. This
exposed a bug in both CUDA and Vulkan backends where they were assuming the
input to argsort and the input to get_rows are the same. I'd like to optimize
this graph in another change, but for now just get it functional.

CUDA also had a bug where it got n_experts from the wrong place, leading to
GGML_ASSERT failures in some of the new tests.
2025-12-15 20:03:42 -06:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml-cpu : fix RISC-V Q4_0 repack select and RVV feature reporting (#17951) 2025-12-12 16:26:03 +02:00
src vulkan/cuda: fix topk_moe with exp_probs_b 2025-12-15 20:03:42 -06:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt cmake : set `CMAKE_RUNTIME_OUTPUT_DIRECTORY` for non standalone build (ggml/1394) 2025-12-14 08:33:51 +02:00