Implement ggml_backend_cann_graph_optimize function for CANN backend, ported from Vulkan backend (PR #15489 and #15850).

Key changes:
- Add graph optimization to reorder nodes based on dependency analysis
- Group non-dependent nodes together for potential parallel execution
- Preserve fusion patterns (RMS_NORM+MUL, MUL_MAT+ADD, ADD+RMS_NORM)
- Add GGML_CANN_DISABLE_GRAPH_OPTIMIZE env var to disable optimization

This is the first step toward multi-stream parallel execution on Ascend NPU.
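The reordering idea listed above can be illustrated with a minimal, self-contained sketch. It is not the actual CANN or Vulkan code: `Node`, `is_fusable_pair`, `reorder_nodes`, and `graph_optimize_disabled` are hypothetical names, the graph is reduced to op names plus source indices, only two-node fusion pairs are kept adjacent, and treating any set value of `GGML_CANN_DISABLE_GRAPH_OPTIMIZE` as "disabled" is an assumption about how the variable is read.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph node (not the real ggml_tensor).
struct Node {
    std::string op;         // e.g. "RMS_NORM", "MUL", "MUL_MAT", "ADD"
    std::vector<int> srcs;  // indices of earlier nodes this node reads from
};

// The three fusion patterns named in the commit message, as op pairs.
static bool is_fusable_pair(const std::string& a, const std::string& b) {
    return (a == "RMS_NORM" && b == "MUL") ||
           (a == "MUL_MAT"  && b == "ADD") ||
           (a == "ADD"      && b == "RMS_NORM");
}

// Assumption: the variable disables the pass whenever it is set at all.
static bool graph_optimize_disabled() {
    return std::getenv("GGML_CANN_DISABLE_GRAPH_OPTIMIZE") != nullptr;
}

// Returns a new execution order for a topologically sorted graph `g`.
// Nodes are emitted in groups; every node in a group depends only on
// earlier groups, except the second half of a fused pair, which stays
// directly behind the node it consumes.
static std::vector<int> reorder_nodes(const std::vector<Node>& g) {
    const int n = static_cast<int>(g.size());
    std::vector<bool> done(n, false);  // emitted by an earlier group
    std::vector<bool> used(n, false);  // emitted by any group so far
    std::vector<int> order;
    order.reserve(n);

    // All sources of j, other than `allowed`, come from earlier groups.
    auto ready = [&](int j, int allowed) {
        for (int s : g[j].srcs) {
            if (s != allowed && !done[s]) return false;
        }
        return true;
    };
    // Node j+1 is unscheduled, matches a fusion pattern, and consumes j.
    auto fuses_with_next = [&](int j) {
        if (j + 1 >= n || used[j + 1]) return false;
        if (!is_fusable_pair(g[j].op, g[j + 1].op)) return false;
        for (int s : g[j + 1].srcs) {
            if (s == j) return true;
        }
        return false;
    };

    for (int i = 0; i < n; ++i) {
        if (used[i]) continue;
        std::vector<int> group;
        for (int j = i; j < n; ++j) {
            if (used[j] || !ready(j, -1)) continue;
            const bool fuse = fuses_with_next(j);
            // Moving j without its partner would split a fusion pair.
            if (fuse && !ready(j + 1, j)) continue;
            group.push_back(j);
            used[j] = true;
            if (fuse) {
                group.push_back(j + 1);
                used[j + 1] = true;
            }
        }
        for (int k : group) {
            done[k] = true;
            order.push_back(k);
        }
    }
    return order;
}
```

A caller would skip the pass and keep the original order when `graph_optimize_disabled()` returns true. For a six-node graph made of two independent MUL_MAT+ADD pairs with an RMS_NORM+MUL pair between them that consumes the first ADD, the sketch moves the second MUL_MAT+ADD pair up next to the first, so the independent work sits together while every fused pair stays adjacent.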
| Name |
|---|
| cann_multi_stream_implementation.md |