llama.cpp/ggml/src/ggml-cann
hipudding c0389dba43
CANN: Disable acl_graph for prefill stage (#15933)
Since the prefill length is not fixed, graphs constructed for the
prefill stage cannot be reused. For this reason, ACL graph
execution is disabled by default during prefill.
2025-09-11 15:59:37 +08:00
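The reasoning in the commit message above can be illustrated with a minimal sketch. It assumes a graph cache keyed only by the number of tokens in a batch; `CapturedGraph`, `GraphCache`, and the `is_prefill` flag are hypothetical names for illustration and are not taken from the CANN backend. Decode steps always see the same shape, so one captured graph is replayed many times, while prefill batches differ in length from prompt to prompt and are run eagerly instead.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for a captured device graph (not the real ACL graph type).
struct CapturedGraph {
    int64_t n_tokens;  // batch length the graph was captured for
};

// Hypothetical cache keyed by token count, the only shape dimension modelled here.
class GraphCache {
public:
    // Executes one batch. Returns true only when a cached graph was replayed.
    bool run(int64_t n_tokens, bool is_prefill) {
        assert(n_tokens > 0);
        if (is_prefill) {
            // Prefill length varies per prompt, so a captured graph would
            // rarely match again. Skip capture and execute eagerly.
            ++eager_runs_;
            return false;
        }
        auto it = graphs_.find(n_tokens);
        if (it == graphs_.end()) {
            // First time this shape is seen: capture and store a graph.
            graphs_.emplace(n_tokens, CapturedGraph{n_tokens});
            ++captures_;
            return false;
        }
        // Same shape as before: replay the stored graph.
        ++replays_;
        return true;
    }

    int eager_runs() const { return eager_runs_; }
    int captures() const { return captures_; }
    int replays() const { return replays_; }

private:
    std::map<int64_t, CapturedGraph> graphs_;
    int eager_runs_ = 0;
    int captures_ = 0;
    int replays_ = 0;
};
```

In this sketch, two prefill calls with different prompt lengths followed by several single-token decode calls produce two eager runs, one capture, and a replay for every decode step after the first.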
File            Last commit                                                      Date
CMakeLists.txt  CANN: add support for ACL Graph (#15065)                         2025-08-06 14:12:42 +08:00
Doxyfile        CANN: Add the basic supports of Flash Attention kernel (#13627)  2025-05-26 10:20:18 +08:00
acl_tensor.cpp  CANN: Implement GLU ops (#14884)                                 2025-07-26 17:56:18 +08:00
acl_tensor.h    CANN: Add the basic supports of Flash Attention kernel (#13627)  2025-05-26 10:20:18 +08:00
aclnn_ops.cpp   CANN: Add ROPE sin/cos cache for reuse (#15912)                  2025-09-10 18:42:00 +08:00
aclnn_ops.h     CANN: Add ggml_set_rows (#14943)                                 2025-07-29 22:36:43 +08:00
common.h        CANN: Add ROPE sin/cos cache for reuse (#15912)                  2025-09-10 18:42:00 +08:00
ggml-cann.cpp   CANN: Disable acl_graph for prefill stage (#15933)               2025-09-11 15:59:37 +08:00