llama.cpp/ggml
hipudding cb15cdb020 CANN: add SOFTPLUS unary op support
Implement GGML_UNARY_OP_SOFTPLUS using aclnnSoftplus with beta=1.0
and threshold=20.0. This enables hybrid models like Qwen3.5 to run
entirely on the CANN backend without graph splitting, which fixes
graph cache instability caused by the backend scheduler fragmenting
the computation graph when SOFTPLUS falls back to CPU.
2026-03-28 07:16:07 +00:00
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include llama: fix llama-model-saver (#20503) 2026-03-25 12:53:16 +02:00
src CANN: add SOFTPLUS unary op support 2026-03-28 07:16:07 +00:00
.gitignore
CMakeLists.txt ggml : bump version to 0.9.8 (ggml/1442) 2026-03-18 15:17:28 +02:00