llama.cpp/ggml
hipudding cb15cdb020 CANN: add SOFTPLUS unary op support
Implement GGML_UNARY_OP_SOFTPLUS using aclnnSoftplus with beta=1.0
and threshold=20.0. This enables hybrid models like Qwen3.5 to run
entirely on the CANN backend without graph splitting, which fixes
graph cache instability caused by the backend scheduler fragmenting
the computation graph when SOFTPLUS falls back to CPU.
2026-03-28 07:16:07 +00:00
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include llama: fix llama-model-saver (#20503) 2026-03-25 12:53:16 +02:00
src CANN: add SOFTPLUS unary op support 2026-03-28 07:16:07 +00:00
.gitignore
CMakeLists.txt ggml : bump version to 0.9.8 (ggml/1442) 2026-03-18 15:17:28 +02:00