llama.cpp

History

Chenguang Li c1c354e44c CANN: Refactor ND to NZ workspace to be per-device (#15763 ) * CANN:Refactor ND to NZ workspace to be per-device in Ascend backend - Replaced the previous single global ND→NZ workspace with a per-device cache using unordered_map keyed by device ID. - Functions `release_nz_workspace`, `relloc_nz_workspace`, and `get_nz_workspace` now manage workspace independently for each device, preventing memory conflicts in multi-device / pipeline parallel scenarios. - This change fixes potential precision issues caused by workspace overwrites when multiple devices perform ND→NZ conversions concurrently. Co-authored-by: hipudding <huafengchun@gmail.com> * refactor Signed-off-by: noemotiovon <757486878@qq.com> * rename Signed-off-by: noemotiovon <757486878@qq.com> * fix review comments Signed-off-by: noemotiovon <757486878@qq.com> --------- Signed-off-by: noemotiovon <757486878@qq.com> Co-authored-by: hipudding <huafengchun@gmail.com>		2025-09-04 20:20:14 +08:00
..
cmake	ggml: Skip backend library linking code when GGML_BACKEND_DL=ON (#15094 )	2025-08-07 13:45:41 +02:00
include	ggml: add ops for WAN video model (cuda && cpu) (#15669 )	2025-09-04 10:38:49 +02:00
src	CANN: Refactor ND to NZ workspace to be per-device (#15763 )	2025-09-04 20:20:14 +08:00
.gitignore	vulkan : cmake integration (#8119 )	2024-07-13 18:12:39 +02:00
CMakeLists.txt	ggml-cpu : optimize RVV kernels (#15720 )	2025-09-03 16:16:21 +08:00