llama.cpp/skills
hipudding c1792d58b5 ggml: cann: add graph_optimize for multi-stream parallel preparation
Implement the ggml_backend_cann_graph_optimize function for the CANN backend,
ported from the Vulkan backend (PRs #15489 and #15850).

Key changes:
- Add graph optimization to reorder nodes based on dependency analysis
- Group non-dependent nodes together for potential parallel execution
- Preserve fusion patterns (RMS_NORM+MUL, MUL_MAT+ADD, ADD+RMS_NORM)
- Add GGML_CANN_DISABLE_GRAPH_OPTIMIZE env var to disable optimization
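
The reordering described above can be illustrated with a minimal sketch. It is not the code from this commit: the `Node` type, the function names, and the lookahead window of 8 are hypothetical stand-ins for the real ggml graph structures, and fusion-pattern preservation is left out. Treating the environment variable as "set to any value disables the pass" is also an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for a graph node: srcs holds the indices of the
// earlier nodes whose outputs this node reads.
struct Node {
    std::vector<int> srcs;
};

// Reorder a topologically sorted node list so that nodes with no dependency
// on each other become adjacent. Returns the new order as indices into `nodes`.
// `lookahead` (an assumed tuning knob) limits how far ahead the scan goes.
inline std::vector<int> reorder_independent(const std::vector<Node> & nodes, int lookahead = 8) {
    const int n = (int) nodes.size();
    std::vector<bool> done(n, false);  // true once a node was emitted in an earlier group
    std::vector<int>  order;
    order.reserve(n);

    int first = 0;
    while (true) {
        while (first < n && done[first]) {
            ++first;
        }
        if (first == n) {
            break;
        }
        // The first pending node always starts a new group: every node before it is done.
        std::vector<int> group = { first };
        for (int j = first + 1; j < n && j <= first + lookahead; ++j) {
            if (done[j]) {
                continue;
            }
            bool ready = true;
            for (int s : nodes[j].srcs) {
                assert(s >= 0 && s < j);  // input must already be in topological order
                if (!done[s]) {           // source is still pending or sits in the current group
                    ready = false;
                    break;
                }
            }
            if (ready) {
                group.push_back(j);
            }
        }
        // Only now mark the group as done, so members never depend on each other.
        for (int i : group) {
            done[i] = true;
            order.push_back(i);
        }
    }
    return order;
}

// Wrapper that honors the opt-out variable named in the commit message.
inline std::vector<int> graph_optimize_sketch(const std::vector<Node> & nodes) {
    if (std::getenv("GGML_CANN_DISABLE_GRAPH_OPTIMIZE") != nullptr) {
        std::vector<int> identity;
        for (int i = 0; i < (int) nodes.size(); ++i) {
            identity.push_back(i);  // keep the original order untouched
        }
        return identity;
    }
    return reorder_independent(nodes);
}
```

For two independent chains 0 -> 1 and 2 -> 3 that both feed node 4, the sketch turns the order 0, 1, 2, 3, 4 into 0, 2, 1, 3, 4, so each adjacent pair (0, 2) and (1, 3) could in principle be issued on separate streams.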

This is the first step toward multi-stream parallel execution on Ascend NPU.
2026-02-04 08:37:57 +00:00
cann_multi_stream_implementation.md ggml: cann: add graph_optimize for multi-stream parallel preparation 2026-02-04 08:37:57 +00:00