
Automating RDMA and SR-IOV Configuration in Kubernetes with sriov-network-operator and Kube-OVN

This article explains how the integration of sriov-network-operator and Kube-OVN automates the complex configuration and persistence of RDMA and SR‑IOV in Kubernetes, enabling high‑availability, multi‑tenant networking for AI distributed training workloads.

Cloud Native Technology Community

In distributed AI training, Remote Direct Memory Access (RDMA) has become the preferred way to speed up network data transfer between training tasks. RDMA is provided by RDMA-capable smart NICs. In Kubernetes, the NIC must be virtualized with SR-IOV or macvlan so that each Pod gets its own virtual interface for RDMA; with SR-IOV, that interface is a virtual function (VF).

Working with the Kube-OVN community, Inspur Cloud Sea identified the pain points of SR-IOV configuration, adopted the sriov-network-operator project, and optimized it to automate RDMA configuration of NICs. Combined with Kube-OVN, the result is a complete, production-grade RDMA solution.

01 Challenges of RDMA and SR‑IOV configuration

Configuring RDMA and SR-IOV involves many parameters, and the procedure differs from one NIC vendor to another. An administrator has to set the maximum and desired VF counts, MTU, VLAN, and IOMMU options, and load vendor-specific kernel modules (for example ice, iavf, and irdma for Intel, or the OFED driver stack for Mellanox). In addition, VFs are lost when a node reboots unless the configuration is persisted, and the device plugin must be restarted manually before Kubernetes recognizes newly created VFs. Together these steps make management cumbersome.
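To make the manual burden concrete, here is a minimal sketch of just one of those steps: setting the VF count on a physical function through the Linux sysfs files sriov_totalvfs and sriov_numvfs. The function name set_vf_count and the sysfs_root parameter are illustrative assumptions, not part of any tool mentioned in this article. On a real node the code has to run as root, and the value it writes does not survive a reboot.

```python
from pathlib import Path


def set_vf_count(pf_name: str, desired: int,
                 sysfs_root: Path = Path("/sys/class/net")) -> int:
    """Set the number of VFs on physical function `pf_name` via sysfs.

    `sysfs_root` is a hypothetical parameter so the logic can be exercised
    against a fake directory tree instead of a real node.
    Returns the VF count that is in effect afterwards.
    """
    device = sysfs_root / pf_name / "device"

    # sriov_totalvfs is the hardware/firmware maximum for this PF.
    total = int((device / "sriov_totalvfs").read_text().strip())
    if not 0 <= desired <= total:
        raise ValueError(f"{pf_name} supports at most {total} VFs, got {desired}")

    numvfs = device / "sriov_numvfs"
    current = int(numvfs.read_text().strip())
    if current == desired:
        return current

    if current != 0:
        # The kernel refuses to change a non-zero VF count directly,
        # so the existing VFs are removed first.
        numvfs.write_text("0")
    if desired != 0:
        numvfs.write_text(str(desired))
    return desired
```

This covers only VF creation. MTU, VLAN, IOMMU boot parameters, driver loading, persistence across reboots, and the device-plugin restart would each need similar per-node handling, which is the gap the operator described in the next section is meant to close.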

02 Automated configuration with sriov-network-operator

Inspur Cloud Sea adopted sriov-network-operator to address these problems. Its declarative configuration makes SR-IOV setup dynamic, automated, and highly available, which reduces manual effort, improves flexibility, and suits cloud-native environments.

The operator stores the desired settings (NIC name, type, VF count, node labels) in a global SR-IOV template, which is a Kubernetes resource persisted in etcd, and derives a node-specific template from it. A distributed SR-IOV configurator runs as a daemon on each node. It performs pre-setup (enabling IOMMU, loading the vfio-pci module), watches for resource changes, generates and executes configuration scripts, handles pod eviction and node reboot, and updates the device-plugin metadata.
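As an illustration of what such a global template can look like, the sketch below builds a SriovNetworkNodePolicy manifest, the policy resource used by the upstream sriov-network-operator. The field names follow the upstream CRD (sriovnetwork.openshift.io/v1); the Kube-OVN fork may differ. The helper name build_node_policy, the default namespace, and all values in the usage example are assumptions made for this example.

```python
import json


def build_node_policy(name, resource_name, pf_names, num_vfs, *,
                      is_rdma=True, mtu=None,
                      namespace="sriov-network-operator"):
    """Return a SriovNetworkNodePolicy manifest as a plain dict.

    The namespace default is an assumption; use the namespace the
    operator is actually installed in.
    """
    if num_vfs < 1:
        raise ValueError("num_vfs must be at least 1")

    spec = {
        # Name under which the device plugin advertises the VFs.
        "resourceName": resource_name,
        # Only nodes carrying this label are configured.
        "nodeSelector": {
            "feature.node.kubernetes.io/network-sriov.capable": "true",
        },
        "numVfs": num_vfs,
        # Select the physical function(s) by interface name.
        "nicSelector": {"pfNames": list(pf_names)},
        # RDMA needs the kernel network driver, not vfio-pci.
        "deviceType": "netdevice",
        "isRdma": is_rdma,
    }
    if mtu is not None:
        spec["mtu"] = mtu

    return {
        "apiVersion": "sriovnetwork.openshift.io/v1",
        "kind": "SriovNetworkNodePolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    # Hypothetical values: 8 RDMA-capable VFs on interface ens1f0.
    policy = build_node_policy("rdma-policy", "rdma_vf", ["ens1f0"], 8, mtu=9000)
    print(json.dumps(policy, indent=2))
```

The printed JSON can be passed to kubectl apply -f -, since kubectl accepts JSON as well as YAML. Once the policy exists, the per-node daemon is responsible for turning it into actual VF configuration.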

The enhancements include support for Kube-OVN OVS offload, automatic scheduling of pods to nodes labeled feature.node.kubernetes.io/network-sriov.capable=true, loading of the Intel iavf module, and forced pod eviction so that a node does not stay unavailable for a long time.

03 Kube‑OVN + SR‑IOV solution for coexistence of RDMA and standard Kubernetes networking

The solution uses separate NICs for RDMA traffic and standard Kubernetes traffic, with Kube-OVN acting as a global IPAM to simplify IP address management, provide multi-tenant isolation, and offer a unified networking experience. This architecture is intended for large-scale AI compute environments, where it improves performance, security, and scalability.
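One way this wiring can look is sketched below, under several assumptions: the secondary RDMA interface is attached through a Multus NetworkAttachmentDefinition that uses the sriov CNI plugin, and the ipam section delegates address allocation to Kube-OVN, following the pattern in the Kube-OVN multi-NIC documentation. The helper names, the resource name openshift.io/rdma_vf, and the network name are hypothetical placeholders. The article does not state the exact manifests used.

```python
import json


def build_sriov_nad(name, namespace, resource_name):
    """Return a NetworkAttachmentDefinition that uses Kube-OVN as IPAM.

    `resource_name` is the device-plugin resource backing the VFs; its
    prefix depends on how the operator is configured.
    """
    cni_config = {
        "cniVersion": "0.3.1",
        "name": name,
        "type": "sriov",
        "ipam": {
            "type": "kube-ovn",
            "server_socket": "/run/openvswitch/kube-ovn-daemon.sock",
            # Kube-OVN identifies the attachment as <name>.<namespace>.
            "provider": f"{name}.{namespace}",
        },
    }
    return {
        "apiVersion": "k8s.cni.cncf.io/v1",
        "kind": "NetworkAttachmentDefinition",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {"k8s.v1.cni.cncf.io/resourceName": resource_name},
        },
        # Multus expects the CNI configuration as a JSON string.
        "spec": {"config": json.dumps(cni_config)},
    }


def pod_network_annotation(name, namespace):
    """Annotation a Pod carries to request the secondary interface."""
    return {"k8s.v1.cni.cncf.io/networks": f"{namespace}/{name}"}


if __name__ == "__main__":
    nad = build_sriov_nad("rdma-net", "default", "openshift.io/rdma_vf")
    print(json.dumps(nad, indent=2))
    print(json.dumps(pod_network_annotation("rdma-net", "default")))
```

In this pattern a Kube-OVN Subnet whose provider matches rdma-net.default would also be needed so that Kube-OVN knows which address range to allocate from. The Pod's primary interface stays on the standard Kube-OVN network, and the Pod requests the VF through its resource limits.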

Reference links: https://github.com/kubeovn/kube-ovn and https://github.com/kubeovn/SR-IOV-network-operator

Written by

Cloud Native Technology Community

The Cloud Native Technology Community, part of the CNBPA Cloud Native Technology Practice Alliance, focuses on evangelizing cutting‑edge cloud‑native technologies and practical implementations. It shares in‑depth content, case studies, and event/meetup information on containers, Kubernetes, DevOps, Service Mesh, and other cloud‑native tech, along with updates from the CNBPA alliance.
