llama.cpp RPC with RDMA Transport
Background
The RPC backend in llama.cpp enables distributed inference by offloading computation to remote hosts over TCP. While TCP works well for most networks, the per-message overhead of the kernel network stack becomes a bottleneck for token generation, where each token requires a full round-trip between nodes.
RDMA (Remote Direct Memory Access) bypasses the kernel network stack entirely, allowing NICs to read and write memory directly. RoCEv2 (RDMA over Converged Ethernet v2) brings RDMA to standard Ethernet networks using commodity NICs from vendors like Mellanox/NVIDIA and Broadcom.
When built with -DGGML_RPC_RDMA=ON, the RPC backend auto-negotiates RDMA transport during connection setup. If both client and server have RDMA-capable NICs, the connection upgrades transparently. If either side lacks RDMA, it falls back to TCP silently.
OS
| OS | Status | Notes |
|---|---|---|
| Linux | Supported | Requires rdma-core / libibverbs-dev |
| Windows | N/A | RDMA code not compiled; TCP-only RPC works normally |
| macOS | N/A | RDMA code not compiled; TCP-only RPC works normally |
Hardware
RDMA transport requires RoCEv2-capable NICs on both nodes. Tested hardware:
| NIC | Link Speed | Status |
|---|---|---|
| Mellanox ConnectX-4 Lx (MT27710) 25GbE | 25 Gbps | Supported |
| Mellanox ConnectX-6 Lx (MT2894) 25GbE | 25 Gbps | Supported |
Other RoCEv2-capable NICs (ConnectX-5/7, Broadcom NetXtreme-E, etc.) should work but are untested. Mixed NIC generations across nodes are supported.
Performance
Two-node cluster: AMD Radeon 8060S (gfx1151) iGPUs, ConnectX-4 Lx / ConnectX-6 Lx 25GbE, RoCEv2. Model: Qwen3-Coder-Next 80B Q8_K_XL, layer split (-sm layer -ts 1/1) across both nodes, ROCm backend.
| Metric | TCP (t/s) | RDMA (t/s) | Improvement |
|---|---|---|---|
| Prompt processing (pp2048) | 651.48 | 678.42 | +4.1% |
| Token generation (tg256) | 30.19 | 32.16 | +6.5% |
Token generation benefits most because each token requires a round-trip between nodes. Results will vary with hardware, model size, and split configuration.
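The Improvement column is the relative throughput gain of RDMA over TCP. As a quick sanity check, the percentages can be reproduced from the two throughput columns; `pct_gain` is a helper name made up for this example, not part of llama.cpp.

```shell
#!/bin/sh
# Relative gain of RDMA over TCP in percent, rounded to one decimal place.
# pct_gain is a hypothetical helper used only for this illustration.
pct_gain() {
    awk -v tcp="$1" -v rdma="$2" \
        'BEGIN { printf "%+.1f%%\n", (rdma / tcp - 1) * 100 }'
}

pct_gain 651.48 678.42   # pp2048 row
pct_gain 30.19 32.16     # tg256 row
```

The same helper can be used to compare your own llama-bench runs, passing the TCP figure first and the RDMA figure second.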
Building
Build with RDMA support by adding -DGGML_RPC_RDMA=ON:
cmake -B build \
    -DCMAKE_BUILD_TYPE=Release \
    -DGGML_RPC=ON \
    -DGGML_RPC_RDMA=ON
cmake --build build --target rpc-server llama-bench -j$(nproc)
Dependencies
Requires libibverbs-dev (part of rdma-core):
# Ubuntu / Debian
sudo apt install libibverbs-dev rdma-core
# Fedora / RHEL
sudo dnf install libibverbs-devel rdma-core-devel
This is an optional dependency. Without -DGGML_RPC_RDMA=ON, the build produces a standard TCP-only binary with no RDMA code or libibverbs linkage.
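One way to confirm which variant was built is to inspect the binary's dynamic dependencies. The sketch below assumes a Linux host with `ldd`, that libibverbs is linked dynamically, and that the binary ends up at `build/bin/rpc-server` (a guess based on the `cmake -B build` command above). `has_ibverbs` is a helper name invented for this example.

```shell
#!/bin/sh
# Report whether ldd-style output on stdin mentions libibverbs.
# has_ibverbs is a hypothetical helper used only for this illustration.
has_ibverbs() {
    if grep -q 'libibverbs\.so'; then
        echo "RDMA-capable build (libibverbs linked)"
    else
        echo "TCP-only build (no libibverbs linkage)"
    fi
}

# Assumed binary location; adjust to your build directory.
BIN=build/bin/rpc-server
if [ -x "$BIN" ]; then
    ldd "$BIN" | has_ibverbs
fi
```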
Usage
Server
bin/rpc-server -H 0.0.0.0 -p 50052 -c
Client
bin/llama-bench --rpc 192.168.1.45:50052 \
    -m model.gguf -p 2048 -n 256
The connection starts as TCP and upgrades to RDMA automatically during the handshake. The server log confirms the upgrade:
Accepted client connection
RDMA probed: dev=mlx5_0 gid=5 qpn=328 inline=316
RDMA activated: qpn=328->488 mtu=1024 rx_depth=24
When RDMA is not available (no hardware or connecting to a stock rpc-server), the connection works over TCP with no user action required.
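Because the fallback is silent, it can help to check a captured server log for the activation line. This is a minimal sketch that relies only on the "RDMA activated" text shown in the sample log above; `transport_from_log` is a helper name made up for this example, and the exact log wording may differ between versions.

```shell
#!/bin/sh
# Classify the transport from rpc-server log text read on stdin.
# Prints "rdma" if an "RDMA activated" line is present, otherwise "tcp".
# transport_from_log is a hypothetical helper used only for this illustration.
transport_from_log() {
    if grep -q 'RDMA activated'; then
        echo rdma
    else
        echo tcp
    fi
}

# Demo using the sample log lines from this document.
transport_from_log <<'EOF'
Accepted client connection
RDMA probed: dev=mlx5_0 gid=5 qpn=328 inline=316
RDMA activated: qpn=328->488 mtu=1024 rx_depth=24
EOF
```

In practice the input would be a saved server log, for example one captured with `bin/rpc-server ... 2>&1 | tee server.log`. A "tcp" result only means that no activation line was found, so an empty log or a log with no client connection yet is reported the same way.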
Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| GGML_RDMA_DEV | No | auto-detect | RDMA device name (e.g. mlx5_0). When set, only this device is considered. When unset, all devices are scanned for a GID matching the TCP socket's local IP. |
| GGML_RDMA_GID | No | auto-detect | GID table index. When unset, the first IPv4-mapped RoCEv2 GID is used. |
These variables are only needed when auto-detection fails, typically in complex network topologies such as Linux bridges where the IP may not appear in the expected GID slot.
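As an illustration, both overrides can be exported in the shell before launching the RPC process. The values below are placeholders taken from the sample log earlier in this document (dev=mlx5_0, gid=5); substitute the device and GID index that match your own setup.

```shell
#!/bin/sh
# Pin the RDMA device and GID index instead of relying on auto-detection.
# mlx5_0 and 5 are illustrative values only, not recommended defaults.
export GGML_RDMA_DEV=mlx5_0
export GGML_RDMA_GID=5
echo "RDMA override: dev=$GGML_RDMA_DEV gid=$GGML_RDMA_GID"
```

Start rpc-server or the client from the same shell afterwards so that the process inherits the variables. Which side needs the override is assumed here to be whichever one fails auto-detection.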
Troubleshooting
Verify RDMA devices
ibv_devices # list RDMA devices
ibv_devinfo # show device details and port state
Check GID table
cat /sys/class/infiniband/mlx5_0/ports/1/gids/0
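The command above prints only GID index 0; replace the trailing 0 with another index to inspect other slots. GID entries with an fe80:: prefix are link-local. Look for entries with the ::ffff: prefix instead, which sysfs prints in expanded form as 0000:0000:0000:0000:0000:ffff: followed by the IPv4 address in hex. These are the IPv4-mapped GIDs used for RoCEv2.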
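To read the whole table rather than a single slot, a loop over the gids directory can decode each IPv4-mapped entry. This is a minimal sketch that assumes sysfs prints GIDs as eight colon-separated 16-bit hex groups; `gid_to_ipv4` is a helper name invented here, and mlx5_0 / port 1 are the example values from the command above.

```shell
#!/bin/sh
# Decode an IPv4-mapped GID, as printed by sysfs, into dotted-quad form.
# Returns 1 and prints nothing for GIDs that are not IPv4-mapped.
# gid_to_ipv4 is a hypothetical helper used only for this illustration.
gid_to_ipv4() {
    case "$1" in
        0000:0000:0000:0000:0000:ffff:????:????)
            hex=${1#0000:0000:0000:0000:0000:ffff:}   # last two groups
            hi=${hex%:*}                              # first 16 bits
            lo=${hex#*:}                              # last 16 bits
            printf '%d.%d.%d.%d\n' \
                "0x${hi%??}" "0x${hi#??}" "0x${lo%??}" "0x${lo#??}"
            ;;
        *)
            return 1
            ;;
    esac
}

# Scan every GID slot of one port (device name and port are assumptions).
for f in /sys/class/infiniband/mlx5_0/ports/1/gids/*; do
    [ -r "$f" ] || continue
    if ip=$(gid_to_ipv4 "$(cat "$f")"); then
        echo "gid ${f##*/}: $ip"
    fi
done
```

For example, the GID 0000:0000:0000:0000:0000:ffff:c0a8:012d decodes to 192.168.1.45. The RoCE version of each slot is listed separately under gid_attrs/types/ in the same port directory, which helps when choosing an index for GGML_RDMA_GID.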
RDMA not activating
- Ensure both nodes have rdma-core installed and RDMA devices visible in ibv_devices
- If using a Linux bridge, set GGML_RDMA_DEV and GGML_RDMA_GID explicitly
- Check that RoCEv2 is enabled on the NIC port
- Enable debug logging with GGML_RPC_DEBUG=1 to see probe/activate messages
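The first item in the checklist can be scripted as a quick preflight on each node. The sketch below only counts entries under /sys/class/infiniband (the same sysfs tree used in the GID check above) and does not verify port state or RoCEv2 mode; `count_rdma_devices` is a helper name made up for this example.

```shell
#!/bin/sh
# Count RDMA devices registered under a sysfs root (default: /sys).
# count_rdma_devices is a hypothetical helper used only for this illustration.
count_rdma_devices() {
    root=${1:-/sys}
    n=0
    for d in "$root"/class/infiniband/*; do
        # An unmatched glob stays literal, so skip anything that does not exist.
        [ -e "$d" ] && n=$((n + 1))
    done
    echo "$n"
}

n=$(count_rdma_devices)
if [ "$n" -eq 0 ]; then
    echo "no RDMA devices found; RPC will stay on TCP"
else
    echo "$n RDMA device(s) found; run ibv_devinfo to check port state"
fi
```

Run it on both the client and the server, since RDMA is only used when both sides have a usable device.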