Implement ggml_backend_cann_graph_optimize function for CANN backend, ported from Vulkan backend (PR #15489 and #15850). Key changes: - Add graph optimization to reorder nodes based on dependency analysis - Group non-dependent nodes together for potential parallel execution - Preserve fusion patterns (RMS_NORM+MUL, MUL_MAT+ADD, ADD+RMS_NORM) - Add GGML_CANN_DISABLE_GRAPH_OPTIMIZE env var to disable optimization This is the first step toward multi-stream parallel execution on Ascend NPU. |
||
|---|---|---|
| .. | ||
| cmake | ||
| include | ||
| src | ||
| .gitignore | ||
| CMakeLists.txt | ||