### Testing

This document provides instructions for building and testing the GGML-VirtGPU backend on macOS with containers.

#### Prerequisites

The testing setup requires:

- a macOS host system
- a container runtime with the `libkrun` provider (podman machine)
- access to the development patchsets for virglrenderer

#### Required Patchsets

The backend requires patches that are currently under review:

- **Virglrenderer APIR upstream PR**: https://gitlab.freedesktop.org/virgl/virglrenderer/-/merge_requests/1590 (for reference)
- **MacOS Virglrenderer for krunkit**: https://gitlab.freedesktop.org/kpouget/virglrenderer/-/tree/main-macos

#### Build Instructions

##### 1. Build ggml-virtgpu-backend (Host-side, macOS)

```bash
# Build the backend that runs natively on macOS
mkdir llama.cpp
cd llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git src
cd src

LLAMA_MAC_BUILD=$PWD/build/ggml-virtgpu-backend

cmake -S . -B $LLAMA_MAC_BUILD \
      -DGGML_NATIVE=OFF \
      -DLLAMA_CURL=ON \
      -DGGML_REMOTINGBACKEND=ONLY \
      -DGGML_METAL=ON

TARGETS="ggml-metal"
cmake --build $LLAMA_MAC_BUILD --parallel 8 --target $TARGETS

# Build additional tools for native benchmarking
EXTRA_TARGETS="llama-run llama-bench"
cmake --build $LLAMA_MAC_BUILD --parallel 8 --target $EXTRA_TARGETS
```

##### 2. Build virglrenderer (Host-side, macOS)

```bash
# Build virglrenderer with APIR support
mkdir virglrenderer
cd virglrenderer
git clone https://gitlab.freedesktop.org/kpouget/virglrenderer -b main-macos src
cd src

VIRGL_BUILD_DIR=$PWD/build

# -Dvenus=true (together with the VIRGL_ROUTE_VENUS_TO_APIR=1 runtime variable)
# routes the APIR requests via the Venus backend, for easier testing without a
# patched hypervisor
meson setup $VIRGL_BUILD_DIR \
      -Dvenus=true \
      -Dapir=true
ninja -C $VIRGL_BUILD_DIR
```

##### 3. Build ggml-virtgpu (Guest-side, Linux)

Option A: build directly inside a Linux container:

```bash
# Inside a Linux container
mkdir llama.cpp
cd llama.cpp
git clone https://github.com/ggerganov/llama.cpp.git src
cd src

LLAMA_LINUX_BUILD=$PWD/build-virtgpu

cmake -S . -B $LLAMA_LINUX_BUILD \
      -DGGML_VIRTGPU=ON
cmake --build $LLAMA_LINUX_BUILD --parallel 8
```

Option B: build a container image with the frontend:

```bash
cat << EOF > remoting.containerfile
FROM quay.io/fedora/fedora:43

USER 0
WORKDIR /app/remoting

ARG LLAMA_CPP_REPO="https://github.com/ggerganov/llama.cpp.git"
ARG LLAMA_CPP_VERSION="master"
ARG LLAMA_CPP_CMAKE_FLAGS="-DGGML_VIRTGPU=ON"
ARG LLAMA_CPP_CMAKE_BUILD_FLAGS="--parallel 4"

# make is needed by cmake's default Makefiles generator
RUN dnf install -y git cmake gcc gcc-c++ make libcurl-devel libdrm-devel

RUN git clone "\${LLAMA_CPP_REPO}" src \\
 && git -C src fetch origin \${LLAMA_CPP_VERSION} \\
 && git -C src reset --hard FETCH_HEAD

RUN mkdir -p build \\
 && cd src \\
 && set -o pipefail \\
 && cmake -S . -B ../build \${LLAMA_CPP_CMAKE_FLAGS} \\
 && cmake --build ../build/ \${LLAMA_CPP_CMAKE_BUILD_FLAGS}

ENTRYPOINT ["/app/remoting/build/bin/llama-server"]
EOF

mkdir -p empty_dir
podman build -f remoting.containerfile ./empty_dir -t localhost/remoting-frontend
```

#### Environment Setup

##### Set krunkit Environment Variables

```bash
# Define the base directories (⚠️ adapt these paths to your system)
VIRGL_BUILD_DIR=$HOME/remoting/virglrenderer/src/build
LLAMA_MAC_BUILD=$HOME/remoting/llama.cpp/src/build/ggml-virtgpu-backend

# For krunkit to load the custom virglrenderer library
export DYLD_LIBRARY_PATH=$VIRGL_BUILD_DIR/src

# For virglrenderer to load the ggml-virtgpu-backend library
export VIRGL_APIR_BACKEND_LIBRARY="$LLAMA_MAC_BUILD/bin/libggml-virtgpu-backend.dylib"

# For the llama.cpp virtgpu backend to load the ggml-metal backend
export APIR_LLAMA_CPP_GGML_LIBRARY_PATH="$LLAMA_MAC_BUILD/bin/libggml-metal.dylib"
export APIR_LLAMA_CPP_GGML_LIBRARY_REG=ggml_backend_metal_reg
```
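The virglrenderer build step above refers to the `VIRGL_ROUTE_VENUS_TO_APIR=1` runtime variable, but no step in this walkthrough sets it. If you rely on the Venus routing path (i.e. your hypervisor does not carry the APIR patch), it presumably has to be exported alongside the other variables; this is an assumption drawn from that build comment, not a step from the original instructions:

```bash
# Assumption (based on the virglrenderer build-step comment): route the APIR
# requests through the Venus backend when running without a patched hypervisor.
export VIRGL_ROUTE_VENUS_TO_APIR=1
```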
##### Launch Container Environment

```bash
# Set the container machine provider to libkrun
export CONTAINERS_MACHINE_PROVIDER=libkrun
podman machine start
```

##### Verify Environment

Confirm that krunkit is using the custom virglrenderer library:

```bash
lsof -c krunkit | grep virglrenderer

# Expected output:
# krunkit 50574 user txt REG 1,14 2273912 10849442 ($VIRGL_BUILD_DIR/src)/libvirglrenderer.1.dylib
```

#### Running Tests

##### Launch Test Container

```bash
# Optional model caching
mkdir -p models
PODMAN_CACHE_ARGS="-v models:/models --user root:root --cgroupns host --security-opt label=disable -w /models"

# The image entrypoint is llama-server, so override it to get an interactive shell
podman run $PODMAN_CACHE_ARGS -it --rm --device /dev/dri --entrypoint bash localhost/remoting-frontend
```

##### Test llama.cpp in Container

```bash
# Run the performance benchmark
/app/remoting/build/bin/llama-bench -m ./llama3.2
```

Expected output (performance may vary):

```
| model                  |      size |  params | backend          | ngl |  test |           t/s |
| ---------------------- | --------: | ------: | ---------------- | --: | ----: | ------------: |
| llama 3B Q4_K - Medium |  1.87 GiB |  3.21 B | RemotingFrontend |  99 | pp512 | 991.30 ± 0.66 |
| llama 3B Q4_K - Medium |  1.87 GiB |  3.21 B | RemotingFrontend |  99 | tg128 |  85.71 ± 0.11 |
```

#### Troubleshooting

##### SSH Environment Variable Issues

⚠️ **Warning**: Setting `DYLD_LIBRARY_PATH` from an SSH session does not work on macOS: System Integrity Protection strips `DYLD_*` variables from the environment of processes spawned by protected binaries such as `sshd`. Here is a workaround:

**Workaround 1: Replace system library**

```bash
VIRGL_BUILD_DIR=$HOME/remoting/virglrenderer/src/build # ⚠️ adapt to your system
BREW_VIRGL_DIR=/opt/homebrew/Cellar/virglrenderer/0.10.4d/lib
VIRGL_LIB=libvirglrenderer.1.dylib

cd $BREW_VIRGL_DIR
mv $VIRGL_LIB ${VIRGL_LIB}.orig
ln -s $VIRGL_BUILD_DIR/src/$VIRGL_LIB .
```
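To undo the workaround once testing is done, reverse the steps above (a sketch; same paths and variables as in the workaround):

```bash
# Revert Workaround 1: drop the symlink and restore the original Homebrew library
BREW_VIRGL_DIR=/opt/homebrew/Cellar/virglrenderer/0.10.4d/lib
VIRGL_LIB=libvirglrenderer.1.dylib

cd $BREW_VIRGL_DIR
rm $VIRGL_LIB                    # remove the symlink created above
mv ${VIRGL_LIB}.orig $VIRGL_LIB  # put the original library back
```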