RDMA
How to use cross-node RDMA (Remote Direct Memory Access) inside VKS
Overview
RDMA stands for Remote Direct Memory Access — a high-performance networking technique. It lets one machine directly read or write the memory of another without involving the remote CPU, OS interrupts, or kernel. The result is a dramatic drop in network latency and CPU load, ideal for low-latency, high-throughput workloads.
Key properties
- Low latency: bypasses the OS kernel, shortening the data path.
- High bandwidth: can drive the link close to the hardware's line rate.
- Low CPU usage: the NIC moves the data itself, so the CPU only posts transfer requests and stays free for other tasks.
- Zero copy: data goes directly from one application's buffer to another's, with no intermediate copies.
Implementations
There are three mainstream RDMA implementations:
- InfiniBand (IB)
  - Network protocol designed for HPC.
  - Extremely low latency and high bandwidth.
  - Requires dedicated hardware (switches, NICs).
- RoCE (RDMA over Converged Ethernet)
  - RDMA over standard Ethernet.
  - Reuses Ethernet infrastructure but requires DCB (Data Center Bridging) capable switches.
  - Comes in two versions: RoCEv1, which is confined to a single L2 segment, and RoCEv2, which runs over UDP/IP and can be routed at L3.
- iWARP (Internet Wide Area RDMA Protocol)
  - RDMA over TCP/IP.
  - Runs on standard IP networks, but performance can lag behind InfiniBand or RoCE.
Using RDMA in VKS
VKS supports cross-node RDMA. To use it, request the RDMA device resources in your container's resources section. The currently supported resource names are:
rdma/rdma_shared_device_a: 1
rdma/rdma_shared_device_b: 1
Example
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: raycluster-kuberay
spec:
  rayVersion: '2.40.0'
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: registry.hd-01.alayanew.com:8443/vc-app_market/ray-ml-vllm:0.7.1
            resources:
              requests:
                memory: "1600G"
                cpu: "144"
                nvidia.com/gpu-h800: 8
                rdma/rdma_shared_device_a: 1
                rdma/rdma_shared_device_b: 1
              limits:
                memory: "1600G"
                cpu: "144"
                nvidia.com/gpu-h800: 8
                rdma/rdma_shared_device_a: 1
                rdma/rdma_shared_device_b: 1
  workerGroupSpecs:
    - replicas: {{ .Values.raycluster.workerGroupSpecs.replicas }}
      groupName: workergroup
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: registry.hd-01.alayanew.com:8443/vc-app_market/ray-ml-vllm:0.7.1
              resources:
                requests:
                  memory: "1600G"
                  cpu: "144"
                  nvidia.com/gpu-h800: 8
                  rdma/rdma_shared_device_a: 1
                  rdma/rdma_shared_device_b: 1
                limits:
                  memory: "1600G"
                  cpu: "144"
                  nvidia.com/gpu-h800: 8
                  rdma/rdma_shared_device_a: 1
                  rdma/rdma_shared_device_b: 1
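The same resource request can also be tried on a plain Pod before wiring it into a larger workload. The manifest below is only a sketch under stated assumptions, not a VKS-verified configuration. The Pod name, the image, and the command are placeholders. The /dev/infiniband path and the IPC_LOCK capability are assumptions based on how RDMA device plugins commonly behave, and they may not match or be needed on VKS.

```yaml
# Minimal sketch: a single Pod requesting both shared RDMA devices.
apiVersion: v1
kind: Pod
metadata:
  name: rdma-smoke-test            # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: rdma-test
      image: ubuntu:22.04          # placeholder; substitute an image from your own registry
      # Assumption: the RDMA device plugin exposes devices under /dev/infiniband.
      command: ["sh", "-c", "ls -l /dev/infiniband"]
      securityContext:
        capabilities:
          add: ["IPC_LOCK"]        # assumption: often required so RDMA libraries can pin memory
      resources:
        # For extended resources, requests and limits must be equal when both are set.
        requests:
          rdma/rdma_shared_device_a: 1
          rdma/rdma_shared_device_b: 1
        limits:
          rdma/rdma_shared_device_a: 1
          rdma/rdma_shared_device_b: 1
```

If the Pod stays Pending, the scheduler events usually indicate whether any node can satisfy the rdma/ resource requests.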
