llama.cpp/ggml
Aman Gupta 55a1c5a5fd CUDA: add softmax broadcast (#14475)
* CUDA: add softmax broadcast

* Pass by const ref

* Review: Use blockDims for indexing, remove designated initializers

* Add TODO for noncontiguous input/output
2025-07-02 15:48:33 +03:00
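
The broadcast added by this commit lets the softmax mask be smaller than the input along the batch dimensions and be reused across rows rather than materialized as a full-size copy. Below is a minimal CUDA sketch of that idea, not the actual llama.cpp kernel: the kernel name, the modulo-based broadcast of mask rows, and the omission of the scale/ALiBi parameters are simplifying assumptions. One block handles one row, and threads stride over the row using blockDim.x, in the spirit of the "use blockDims for indexing" review note.

#include <cuda_runtime.h>
#include <math.h>

#define BLOCK_SIZE 256

// Hypothetical sketch of a row-wise softmax with a broadcast mask.
// Launch with one block per input row and BLOCK_SIZE threads per block.
__global__ void soft_max_bcast(const float * x, const float * mask,
                               float * dst, int ncols, int mask_rows) {
    const int row      = blockIdx.x;
    const int mask_row = row % mask_rows;          // broadcast the mask over rows
    const float * xr = x    + (size_t) row      * ncols;
    const float * mr = mask + (size_t) mask_row * ncols;
    float       * dr = dst  + (size_t) row      * ncols;

    __shared__ float buf[BLOCK_SIZE];

    // 1) row maximum of x + mask, for numerical stability
    float vmax = -INFINITY;
    for (int i = threadIdx.x; i < ncols; i += blockDim.x) {
        vmax = fmaxf(vmax, xr[i] + mr[i]);
    }
    buf[threadIdx.x] = vmax;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) buf[threadIdx.x] = fmaxf(buf[threadIdx.x], buf[threadIdx.x + s]);
        __syncthreads();
    }
    vmax = buf[0];
    __syncthreads();

    // 2) exponentiate and accumulate the row sum
    float vsum = 0.0f;
    for (int i = threadIdx.x; i < ncols; i += blockDim.x) {
        const float e = expf(xr[i] + mr[i] - vmax);
        dr[i] = e;
        vsum += e;
    }
    buf[threadIdx.x] = vsum;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) buf[threadIdx.x] += buf[threadIdx.x + s];
        __syncthreads();
    }
    vsum = buf[0];
    __syncthreads();

    // 3) normalize
    for (int i = threadIdx.x; i < ncols; i += blockDim.x) {
        dr[i] /= vsum;
    }
}

A host-side launch would look like soft_max_bcast<<<nrows, BLOCK_SIZE>>>(x, mask, dst, ncols, mask_rows); the real kernel additionally handles the scale factor and the non-contiguous layouts noted in the commit's TODO.
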
cmake            ggml-cpu : rework weak alias on apple targets (#14146)                2025-06-16 13:54:15 +08:00
include          ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (#14435)  2025-07-02 15:48:33 +03:00
src              CUDA: add softmax broadcast (#14475)                                  2025-07-02 15:48:33 +03:00
.gitignore       vulkan : cmake integration (#8119)                                    2024-07-13 18:12:39 +02:00
CMakeLists.txt   ggml-cpu: enable IBM NNPA Vector Intrinsics (#14317)                  2025-06-25 23:49:04 +02:00