llama.cpp/ggml/src/ggml-cann
Chenguang Li 2241453252
CANN: add support for ACL Graph (#15065)
* feat(cann): add optional support for ACL Graph execution

This commit adds support for executing ggml computational graphs using
Huawei's ACL graph mode via the USE_CANN_GRAPH flag. The support can be
enabled at compile time using the CMake option:

    -DUSE_CANN_GRAPH=ON

By default, ACL graph execution is **disabled**, and the fallback path
uses node-by-node execution.
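
As a rough illustration of the compile-time toggle described above, the sketch below gates the two execution paths behind a preprocessor macro. It assumes the CMake option is forwarded to the compiler as a definition of the same name. `Node`, `compute_graph`, and the `trace` vector are hypothetical names used only for this example and are not the backend's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node; the real backend works on ggml tensors.
struct Node {
    std::string op;
};

// Runs the graph and returns the name of the path taken.
// Each executed step is appended to `trace` so the behaviour can be inspected.
std::string compute_graph(const std::vector<Node> & nodes, std::vector<std::string> & trace) {
#ifdef USE_CANN_GRAPH
    // Graph mode (assumed shape): the whole graph is submitted as one unit.
    trace.push_back("graph:" + std::to_string(nodes.size()));
    return "acl_graph";
#else
    // Default path: execute the nodes one by one.
    const std::size_t before = trace.size();
    for (const Node & n : nodes) {
        trace.push_back("node:" + n.op);
    }
    assert(trace.size() == before + nodes.size());
    return "node_by_node";
#endif
}
```

Built without the definition, `compute_graph` takes the node-by-node branch, which matches the default described above.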

Key additions:
- CMake option USE_CANN_GRAPH to toggle graph mode
- Graph capture and execution logic using ACL graph mode
- Tensor property matching to determine whether graph update is required
- Safe fallback and logging if the environment variable LLAMA_SET_ROWS
  is unset or invalid
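
The property-matching and fallback items above could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the backend's implementation: `TensorView`, `TensorProps`, `snapshot`, `props_match`, `graph_update_required`, and `set_rows_enabled` are hypothetical names. The set of compared properties and the rule for what counts as a valid LLAMA_SET_ROWS value are also assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical, simplified view of a tensor (ne = extents, nb = byte strides).
struct TensorView {
    const void *                data;
    std::array<std::int64_t, 4> ne;
    std::array<std::size_t, 4>  nb;
    int                         op;
};

// Properties recorded for each node when a graph is captured.
struct TensorProps {
    const void *                data;
    std::array<std::int64_t, 4> ne;
    std::array<std::size_t, 4>  nb;
    int                         op;
};

TensorProps snapshot(const TensorView & t) {
    return {t.data, t.ne, t.nb, t.op};
}

bool props_match(const TensorProps & p, const TensorView & t) {
    return p.data == t.data && p.op == t.op && p.ne == t.ne && p.nb == t.nb;
}

// A captured graph can be replayed only if every node still has the same
// properties; any difference means the graph has to be captured again.
bool graph_update_required(const std::vector<TensorProps> & captured,
                           const std::vector<TensorView> &  current) {
    if (captured.size() != current.size()) {
        return true;
    }
    for (std::size_t i = 0; i < current.size(); ++i) {
        if (!props_match(captured[i], current[i])) {
            return true;
        }
    }
    return false;
}

// Assumed convention: the variable counts as valid only when it parses
// completely as a non-zero integer; otherwise the caller falls back.
bool set_rows_enabled(const char * value) {
    if (value == nullptr || *value == '\0') {
        return false;  // unset or empty
    }
    char * end = nullptr;
    const long v = std::strtol(value, &end, 10);
    assert(end != nullptr);  // strtol always sets the end pointer
    return *end == '\0' && v != 0;
}
```

A caller would pass `std::getenv("LLAMA_SET_ROWS")` to `set_rows_enabled`, then log and use node-by-node execution when it returns false. Comparing data pointers as well as shapes and strides is a conservative choice here: a replayed graph would otherwise read from or write to stale buffers.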

This prepares the backend for performance improvements in repetitive graph
execution scenarios on Ascend devices.

Signed-off-by: noemotiovon <757486878@qq.com>

* Fix review comments

Signed-off-by: noemotiovon <757486878@qq.com>

* rename USE_CANN_GRAPH to USE_ACL_GRAPH

Signed-off-by: noemotiovon <757486878@qq.com>

* fix typo

Signed-off-by: noemotiovon <757486878@qq.com>

---------

Signed-off-by: noemotiovon <757486878@qq.com>
2025-08-06 14:12:42 +08:00
CMakeLists.txt  CANN: add support for ACL Graph (#15065)                                          2025-08-06 14:12:42 +08:00
Doxyfile        CANN: Add the basic supports of Flash Attention kernel (#13627)                   2025-05-26 10:20:18 +08:00
acl_tensor.cpp  CANN: Implement GLU ops (#14884)                                                  2025-07-26 17:56:18 +08:00
acl_tensor.h    CANN: Add the basic supports of Flash Attention kernel (#13627)                   2025-05-26 10:20:18 +08:00
aclnn_ops.cpp   CANN: Improve loading efficiency after converting weights to NZ format. (#14985)  2025-07-31 19:47:20 +08:00
aclnn_ops.h     CANN: Add ggml_set_rows (#14943)                                                  2025-07-29 22:36:43 +08:00
common.h        CANN: add support for ACL Graph (#15065)                                          2025-08-06 14:12:42 +08:00
ggml-cann.cpp   CANN: add support for ACL Graph (#15065)                                          2025-08-06 14:12:42 +08:00