llama.cpp/ggml
Matthew Michel 9de9672adb
sycl: use async memory allocation to fix crashes during graph recording (#16644)
* sycl: use async memory allocation to fix graph recording failures

GGML_SYCL_DISABLE_GRAPHS=0 causes crashes because:
  - Host waits are currently unsupported in graph recording mode.
  - SYCL malloc / free calls are unsupported in graph recording mode.

The following changes are made to fix SYCL graph functionality:
  - When graphs are enabled, use the SYCL async memory extension for temp
    buffers which is supported with SYCL graphs.
  - For compiler versions that do not support this extension, skip
    graphs with the affected op.
  - Switch from USM shared to device memory as the async extension
    currently just supports device allocations.

* Address reviewer feedback

* Use global async variable to decide path in sycl_ext_[malloc_device|free]
2025-10-23 09:05:15 +08:00
..
cmake ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094) 2025-08-07 13:45:41 +02:00
include Add experimental ggml-hexagon backend for the Hexagon NPU (#16547) 2025-10-22 13:47:09 -07:00
src sycl: use async memory allocation to fix crashes during graph recording (#16644) 2025-10-23 09:05:15 +08:00
.gitignore vulkan : cmake integration (#8119) 2024-07-13 18:12:39 +02:00
CMakeLists.txt Add experimental ggml-hexagon backend for the Hexagon NPU (#16547) 2025-10-22 13:47:09 -07:00