Overlay Networking and Inter‑Node Traffic Forwarding in Virtualized Cloud Environments
This article explains how virtual machine traffic is forwarded across compute nodes in an IaaS cloud using overlay protocols such as VXLAN, covering L2/L3 inter‑node forwarding, control‑plane designs, and forwarding‑plane implementations based on the Linux kernel with OVS, DPDK, NIC offload, and vDPA.
1. Overlay
In IaaS clouds, the virtual machine network is built from a compute‑node vSwitch combined with physical VLANs, but the 4096‑VLAN‑ID limit and live‑migration requirements motivate adding a virtual overlay network on top of the underlay. Common overlay protocols include GRE, Geneve, and VXLAN; VXLAN is the most widely supported and is used as the default throughout this article.
Overlay traffic is invisible to the underlay; only the outer headers (e.g., the compute‑node MAC/IP) are visible, making traffic analysis or mirroring on the physical network difficult without costly decapsulation.
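As a concrete illustration of the encapsulation step, the VXLAN header defined in RFC 7348 is just 8 bytes: a flags byte with the "I" bit set, a 24‑bit VNI, and reserved fields, prepended to the original L2 frame before the outer UDP/IP/Ethernet headers. A minimal sketch in Python (function names are illustrative):

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" bit from RFC 7348: VNI field is valid

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header that precedes the original L2 frame.

    Layout (RFC 7348): 1 flags byte, 3 reserved bytes,
    24-bit VNI, 1 reserved byte.
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    return struct.pack("!B3xI", VXLAN_FLAG_VNI_VALID, vni << 8)

def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    # The underlay then wraps this in outer Ethernet/IP/UDP (dst port 4789).
    return vxlan_header(vni) + inner_frame
```

Because only these outer headers are visible on the wire, any analysis tool on the physical network sees node‑to‑node UDP flows, not the inner VM conversation.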
2. Inter‑Node L2 Forwarding
Node‑to‑node L2 forwarding must identify traffic destined for other nodes, perform ARP proxying, FDB lookup, and VLAN‑to‑VNI mapping, and then encapsulate the original frame in a VXLAN header before sending it over the underlay.
When a VM sends an ARP request, it writes the packet to the virtqueue, the virtio backend delivers it to the vSwitch, and the vSwitch answers as an ARP proxy based on the VLAN tag and target IP. For data frames whose destination MAC has no match in the local L2 table, the vSwitch consults the outer FDB, VXLAN‑encapsulates the frame, and sends it to the remote compute node, where it is decapsulated, re‑tagged with the local VLAN, and forwarded to the appropriate local port.
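The lookup chain above can be sketched as follows, with hypothetical table contents (the port names, MAC addresses, and VTEP IPs are made up; real vSwitches populate these tables via learning or the control plane):

```python
# Hypothetical tables; a real vSwitch learns these or gets them from the
# control plane.
LOCAL_PORTS = {            # (vlan, dst_mac) -> local vSwitch port
    (100, "fa:16:3e:00:00:01"): "vnet0",
}
VLAN_TO_VNI = {100: 5001}  # per-node VLAN <-> VNI mapping
OUTER_FDB = {              # (vni, dst_mac) -> remote VTEP (node) IP
    (5001, "fa:16:3e:00:00:02"): "192.168.10.12",
}

def forward_l2(vlan: int, dst_mac: str):
    """Return ("local", port), ("vxlan", vni, vtep), or ("flood", vni)."""
    port = LOCAL_PORTS.get((vlan, dst_mac))
    if port is not None:
        return ("local", port)          # destination VM is on this node
    vni = VLAN_TO_VNI[vlan]
    vtep = OUTER_FDB.get((vni, dst_mac))
    if vtep is not None:
        return ("vxlan", vni, vtep)     # encapsulate and send to remote node
    return ("flood", vni)               # unknown unicast: replicate to peers
```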
3. Inter‑Node L3 Forwarding
Node‑to‑node L3 forwarding adds routing logic on the source node: packets are VLAN‑tagged, the vSwitch checks whether the destination MAC is the gateway MAC, and a Layer‑3 routing table determines the next hop. After routing, the packet is VXLAN‑encapsulated and sent to the remote node, where it is decapsulated, tagged with the local VLAN, and delivered via the L2 forwarding path.
After the gateway‑MAC rewrite, cross‑subnet traffic becomes an ordinary L2 forwarding problem, unlike intra‑subnet traffic, whose destination MAC the VM resolves directly via ARP.
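The gateway‑MAC check and route lookup can be sketched as below; the gateway MAC, prefixes, and VNIs are invented for illustration, and longest‑prefix matching is elided:

```python
import ipaddress

GATEWAY_MAC = "fa:16:3e:ff:ff:01"          # hypothetical gateway MAC
ROUTES = [                                  # (prefix, next-hop VNI, next-hop MAC)
    (ipaddress.ip_network("10.0.2.0/24"), 5002, "fa:16:3e:00:00:02"),
]

def route(dst_mac: str, dst_ip: str):
    """Cross-subnet traffic targets the gateway MAC; rewrite, then hand to L2."""
    if dst_mac != GATEWAY_MAC:
        return ("l2", dst_mac)              # intra-subnet: plain L2 forwarding
    ip = ipaddress.ip_address(dst_ip)
    for prefix, vni, nh_mac in ROUTES:      # longest-prefix match elided
        if ip in prefix:
            return ("routed", vni, nh_mac)  # new dst MAC, then VXLAN encap
    return ("drop",)
```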
4. External and Internet‑Facing Communication
Communication between the virtual network and external networks (e.g., SNAT or floating IP) requires the VM’s overlay identity to be mapped to a unique underlay address so that physical routers can reach the VM.
In SNAT mode, outbound traffic is encapsulated and sent to a centralized gateway, which replaces the source IP (and port) with an underlay address; the reply follows the reverse path. A floating IP maps a routable public IP one‑to‑one to the VM’s virtual IP, allowing inbound traffic from the Internet.
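A toy model of the SNAT translation table (the addresses and the naive port allocator are assumptions for illustration, not a real gateway implementation):

```python
import itertools

SNAT_IP = "203.0.113.10"              # assumed shared underlay/public address
_port_alloc = itertools.count(20000)  # naive sequential port allocator
_nat_table = {}                       # (vm_ip, vm_port) -> (SNAT_IP, nat_port)
_reverse = {}                         # reverse mapping for the reply path

def snat(vm_ip: str, vm_port: int):
    """Translate an outbound (vm_ip, vm_port); reuse an existing mapping."""
    key = (vm_ip, vm_port)
    if key not in _nat_table:
        nat_port = next(_port_alloc)
        _nat_table[key] = (SNAT_IP, nat_port)
        _reverse[(SNAT_IP, nat_port)] = key
    return _nat_table[key]

def un_snat(nat_ip: str, nat_port: int):
    """Map a reply back to the original VM address, or None if unknown."""
    return _reverse.get((nat_ip, nat_port))
```

The essential point is the state on the gateway: replies can only be delivered because the forward translation left a reverse‑path entry behind.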
5. Control Plane
The control plane manages virtual network resources such as subnets, IP pools, routing policies, security groups, and QoS, translating high‑level intents into forwarding rules for the data plane.
5.1 Centralized Management
Typical implementations (e.g., OpenStack Neutron) use a server‑agent model with RPC over RabbitMQ. Changes are broadcast to all compute nodes, which update their FDB tables accordingly, though asynchronous broadcasts can cause temporary inconsistency.
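A sketch of how a compute‑node agent might apply such a broadcast; the message shape here is only loosely modeled on Neutron's l2population mechanism and is not its actual API:

```python
FDB = {}  # (vni, mac) -> vtep_ip, the node's outer forwarding table

def apply_fdb_update(msg: dict):
    """Apply one broadcast update.

    Must be idempotent: RPC broadcasts may be redelivered or arrive late,
    which is also why nodes can be temporarily inconsistent.
    """
    key = (msg["vni"], msg["mac"])
    if msg["op"] == "add":
        FDB[key] = msg["vtep"]
    elif msg["op"] == "remove":
        FDB.pop(key, None)  # removing a missing entry is a no-op
```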
5.2 EVPN Self‑Balancing
EVPN runs a BGP full mesh between compute nodes to synchronize VNI and MAC reachability, letting forwarding state converge without a central database. Large clusters typically introduce Route Reflectors to cut the session count from O(n²) to O(n), at the cost of extra configuration.
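The session‑count arithmetic behind the route‑reflector trade‑off is simple: a full mesh of n speakers needs n(n−1)/2 sessions, while with a small set of reflectors each client only peers with the reflectors.

```python
def full_mesh_sessions(n: int) -> int:
    """Every node peers with every other node: n*(n-1)/2 sessions."""
    return n * (n - 1) // 2

def rr_sessions(n: int, reflectors: int = 2) -> int:
    """Each client peers with every reflector; reflectors mesh among themselves."""
    clients = n - reflectors
    return clients * reflectors + reflectors * (reflectors - 1) // 2
```

For 100 nodes, a full mesh needs 4,950 sessions; two route reflectors reduce this to 197.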
5.3 Control‑Forwarding Co‑operation
Traditional “planned forwarding” pushes all decisions to the control plane, which can be inflexible. A co‑operative model lets the forwarding plane dynamically adjust paths and feed back to the control plane, reducing latency and improving scalability.
6. Forwarding Plane
6.1 Linux Kernel + OVS
OVS consists of ovsdb‑server, ovs‑vswitchd, and a kernel datapath. The datapath caches match/action flow rules installed by vswitchd over netlink; VXLAN encapsulation and decapsulation are handled by the kernel’s VXLAN module.
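A toy model of this fast‑path/slow‑path split (the real datapath matches on a full flow key and uses megaflow wildcarding, both simplified away here; the upcall action is hypothetical):

```python
# Sketch of the kernel datapath's flow cache: exact-match lookup, with a
# miss "upcalled" to userspace vswitchd, which installs a rule.
FLOW_CACHE = {}  # flow key tuple -> action

def vswitchd_upcall(key):
    # Hypothetical slow path: consult the OpenFlow tables, return an action.
    return ("output", "vxlan0")

def datapath_lookup(key):
    action = FLOW_CACHE.get(key)
    if action is None:
        action = vswitchd_upcall(key)  # slow path, via netlink in real OVS
        FLOW_CACHE[key] = action       # later packets hit the fast path
    return action
```

The design choice this illustrates: only the first packet of a flow pays the userspace round trip; everything after it stays in the kernel cache.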
6.2 DPDK
DPDK moves packet I/O into user space, avoiding system‑call and interrupt overhead. OVS‑DPDK polls NICs from dedicated cores and exchanges packets with VMs over vhost‑user shared memory, achieving much higher throughput than the kernel datapath.
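Conceptually, a poll‑mode driver replaces interrupts with a busy loop over a descriptor ring on a pinned core. This sketch uses stand‑in names, not real DPDK APIs:

```python
from collections import deque

rx_ring = deque([b"pkt1", b"pkt2"])  # stands in for a NIC rx descriptor ring
tx_ring = deque()

def poll_once(burst: int = 32) -> int:
    """Drain up to `burst` packets from rx, process, enqueue for transmit.

    A real PMD calls this in a tight loop per core (cf. rte_eth_rx_burst);
    the forwarding decision between rx and tx is elided here.
    """
    handled = 0
    while rx_ring and handled < burst:
        pkt = rx_ring.popleft()
        tx_ring.append(pkt)
        handled += 1
    return handled
```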
6.3 NIC Offload
Modern NICs can offload VXLAN processing, match/action tables, and QoS. By switching the NIC’s embedded switch (eswitch) mode and attaching SR‑IOV virtual functions (VFs) directly to VMs, traffic can bypass the software vSwitch entirely, though this exposes hardware details to the VM and complicates live migration.
6.4 VDPA
vDPA implements the virtio data path on a VF, so the VM keeps a standard virtio driver while the NIC handles data‑plane work. The control path is mediated by either a kernel vDPA framework or a DPDK‑based vDPA driver, preserving compatibility while enabling near‑hardware forwarding performance.
The article focuses on how VM traffic is forwarded between nodes, covering both the SDN control plane and the forwarding plane; subsequent sections will discuss security and quality aspects.
360 Smart Cloud
Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.