llama.cpp/ggml
hipudding 4951a4ff7a cann: optimize multi-stream execution
- Use parse_bool() for GGML_CANN_MULTI_STREAM environment variable
  parsing, consistent with other env var handling
- Only synchronize dependent streams instead of all streams when
  a node has multiple dependencies, reducing sync overhead
- Performance improvement: ~9% faster prompt processing on a 0.5B model
  (1838 t/s vs 1688 t/s with ACL graph disabled)
2026-02-04 08:38:17 +00:00
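
The two changes described in the commit message above can be illustrated with a minimal, self-contained sketch. This is not the code from the CANN backend. The `Node` struct and the `env_flag()` and `streams_to_sync()` helpers are hypothetical names made up for this example, and the set of strings that `parse_bool()` accepts here is an assumption. The sketch also assumes that work queued on a single stream runs in order, so a dependency on the node's own stream needs no explicit synchronization.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>
#include <unordered_set>
#include <vector>

// Stand-in for a parse_bool() helper. The accepted spellings are an
// assumption for this sketch, not taken from the backend source.
static bool parse_bool(const std::string & value) {
    static const std::unordered_set<std::string> truthy = {
        "on", "1", "yes", "y", "enable", "true"
    };
    return truthy.count(value) > 0;
}

// Hypothetical helper: reads a boolean environment variable.
// An unset variable is treated as false.
static bool env_flag(const char * name) {
    const char * raw = std::getenv(name);
    return raw != nullptr && parse_bool(raw);
}

// Hypothetical graph node: the stream it is scheduled on and the
// indices of the nodes whose outputs it reads.
struct Node {
    int              stream;
    std::vector<int> deps;
};

// Returns the streams that must be synchronized before node `node_idx`
// may run: only the streams of its dependencies, deduplicated, and
// excluding the node's own stream (assumed to execute in order).
static std::vector<int> streams_to_sync(const std::vector<Node> & graph, int node_idx) {
    const Node &     node = graph[node_idx];
    std::vector<int> result;
    for (int dep : node.deps) {
        const int s = graph[dep].stream;
        const bool seen = std::find(result.begin(), result.end(), s) != result.end();
        if (s != node.stream && !seen) {
            result.push_back(s);
        }
    }
    return result;
}
```

With four streams in use, a node on stream 0 that depends on one node from stream 0 and one from stream 1 would wait on stream 1 only, instead of on all four. A caller could gate this path with `env_flag("GGML_CANN_MULTI_STREAM")`.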
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include ggml-virtgpu: make the code thread safe (#19204) 2026-02-04 10:46:18 +08:00
src cann: optimize multi-stream execution 2026-02-04 08:38:17 +00:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt Bump cmake max version (needed for Windows on Snapdragon builds) (#19188) 2026-02-01 14:13:38 -08:00