llama.cpp/ggml/src
Oleksandr Kuvshynov 88d23ad515
vulkan: handle device dedup on MacOS + Vega II Duo cards (#19058)
Deduplication here relied on Vulkan returning a unique UUID for each
physical GPU, which is currently not always the case.
On a 2019 Mac Pro running macOS with 2 Vega II Duo cards (4 GPUs in total),
MoltenVK assigns the same UUID to pairs of GPUs unless they
are connected with Infinity Fabric.

For more details, see KhronosGroup/MoltenVK#2683.

The proper fix belongs in MoltenVK, but until that happens,
llama.cpp would recognize only 2 of the 4 GPUs in such a configuration.

The deduplication logic is changed to filter out a GPU only when it has
the same UUID as another device but a different driver.
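
The rule described above can be illustrated with a minimal, self-contained sketch. It is not the code from ggml-vulkan: `DeviceInfo`, `driver_id`, and `dedup_devices` are hypothetical names, the Vulkan API is not used, and keeping the first device seen among cross-driver duplicates is a simplifying assumption rather than the backend's actual policy.

```cpp
#include <array>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the per-device properties a backend might read.
struct DeviceInfo {
    std::array<std::uint8_t, 16> uuid;  // device UUID as reported by the driver
    int driver_id;                      // identifies the driver that exposed the device
    std::string name;
};

// Drop a device only if an already-kept device has the same UUID but a different driver.
// Devices that share a UUID under the same driver are all kept.
std::vector<DeviceInfo> dedup_devices(const std::vector<DeviceInfo> & devices) {
    std::vector<DeviceInfo> kept;
    for (const DeviceInfo & dev : devices) {
        bool duplicate = false;
        for (const DeviceInfo & other : kept) {
            if (other.uuid == dev.uuid && other.driver_id != dev.driver_id) {
                duplicate = true;  // same GPU exposed through a second driver
                break;
            }
        }
        if (!duplicate) {
            kept.push_back(dev);
        }
    }
    return kept;
}
```

Under this rule, four GPUs that report two distinct UUIDs through a single driver all survive, while one GPU exposed by two different drivers is still collapsed into a single entry.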
2026-01-28 12:35:54 +01:00
ggml-blas ggml : add ggml_build_forward_select (#18550) 2026-01-19 20:03:19 +02:00
ggml-cann ggml : add ggml_build_forward_select (#18550) 2026-01-19 20:03:19 +02:00
ggml-cpu ggml-cpu: arm64: Q4_K scale unroll and vectorization (#19108) 2026-01-28 09:15:56 +02:00
ggml-cuda cuda : fix "V is K view" check for non-unified KV cache (#19145) 2026-01-28 09:15:27 +02:00
ggml-hexagon ggml-hexagon: flash-attn opt (#19025) 2026-01-23 22:02:07 -08:00
ggml-hip HIP: fix AMDGPU_TARGETS, update documentation (#16803) 2025-10-27 21:39:49 +01:00
ggml-metal metal : fix recommendedMaxWorkingSetSize availability on legacy iOS/macOS (#19088) 2026-01-25 20:07:19 +02:00
ggml-musa CUDA: faster tile FA, add oob checks, more HSs (#16492) 2025-10-11 20:54:32 +02:00
ggml-opencl opencl: add flattened q6_K mv (#19054) 2026-01-26 19:36:24 -08:00
ggml-rpc rpc : use unordered_map::reserve and emplace (#18513) 2026-01-02 12:09:36 +02:00
ggml-sycl [SYCL] use malloc to support both iGPU and dGPU in same time (#18992) 2026-01-23 20:54:10 +08:00
ggml-virtgpu ggml: new backend for Virglrenderer API Remoting acceleration (v2) (#18718) 2026-01-28 17:49:40 +08:00
ggml-vulkan vulkan: handle device dedup on MacOS + Vega II Duo cards (#19058) 2026-01-28 12:35:54 +01:00
ggml-webgpu ggml webgpu: Split shared state (webgpu_context) into global state and per-thread state (#18976) 2026-01-27 20:53:36 -08:00
ggml-zdnn ggml-zdnn : mark zDNN buffers as non-host (#18967) 2026-01-22 01:16:21 +01:00
ggml-zendnn ggml-zendnn : update ZenDNN git tag to main branch (#19133) 2026-01-28 06:21:36 +08:00
CMakeLists.txt ggml: new backend for Virglrenderer API Remoting acceleration (v2) (#18718) 2026-01-28 17:49:40 +08:00
ggml-alloc.c llama: automatically set parameters not set by the user in such a way that maximizes GPU utilization (#16653) 2025-12-15 09:24:59 +01:00
ggml-backend-impl.h llama: use host memory if device reports 0 memory (#18587) 2026-01-09 05:34:56 +08:00
ggml-backend-reg.cpp ggml: new backend for Virglrenderer API Remoting acceleration (v2) (#18718) 2026-01-28 17:49:40 +08:00
ggml-backend.cpp ggml : add ggml_build_forward_select (#18550) 2026-01-19 20:03:19 +02:00
ggml-common.h llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
ggml-impl.h ggml : add ggml_build_forward_select (#18550) 2026-01-19 20:03:19 +02:00
ggml-opt.cpp finetune: SGD optimizer, more CLI args (#13873) 2025-08-14 12:03:57 +02:00
ggml-quants.c ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (#15928) 2025-09-23 10:25:20 +02:00
ggml-quants.h llama : add gpt-oss (#15091) 2025-08-05 22:10:36 +03:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h remove CMAKE_WINDOWS_EXPORT_ALL_SYMBOLS (#10797) 2024-12-12 19:02:49 +01:00
ggml.c ggml : add ggml_build_forward_select (#18550) 2026-01-19 20:03:19 +02:00
ggml.cpp ggml : Print backtrace on uncaught C++ exceptions (ggml/1232) 2025-06-01 13:43:57 +03:00
gguf.cpp GGUF: check that tensor size is representable (#19072) 2026-01-24 21:57:51 +01:00