ContainerLB: A High‑Performance, High‑Reliability Distributed Load Balancer Based on Intel DPDK for JD.com

This article presents ContainerLB, a software‑defined L4 load‑balancing service that JD.com built on Intel DPDK and x86 servers. It covers the architecture, FULLNAT mode, the high‑availability design, performance optimizations, and test results showing near‑line‑rate packet processing and scalable deployment in a container‑native data center.

JD Tech
As JD.com’s business has grown rapidly, a highly reliable, high‑performance load‑balancing solution has become critical. ContainerLB is a distributed L4 load balancer that runs on generic x86 servers and uses the Intel DPDK packet‑processing library. It integrates with OSPF/BGP routing to provide a low‑cost, scalable, and intelligent traffic‑distribution platform for JD’s data centers.

The system operates in FULLNAT mode: it rewrites both the source and destination addresses of each packet, so traffic in both directions passes through the load balancer. It supports dynamic VIP publishing via SkyDNS and uses OSPF/BGP multi‑path routing to provide N+1 redundancy and seamless failover.
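The FULLNAT rewrite described above can be sketched as a pair of lookups. This is a minimal illustration, not ContainerLB's implementation: the `Packet` and `FullNat` names, the single fixed backend, the port counter starting at 10000, and all addresses are assumptions made for the example, and the protocol field is left out for brevity.

```python
from typing import Dict, NamedTuple, Tuple


class Packet(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class FullNat:
    """Toy FULLNAT translator (hypothetical; no port reuse or expiry)."""

    def __init__(self, local_ip: str, backend: Tuple[str, int]):
        self.local_ip = local_ip      # address owned by the load balancer
        self.backend = backend        # a scheduler would choose this per flow
        self.next_port = 10000        # next unused local source port
        self.to_backend: Dict[Packet, Packet] = {}
        self.to_client: Dict[Packet, Packet] = {}

    def from_client(self, pkt: Packet) -> Packet:
        """Rewrite client -> VIP into local address -> real server."""
        if pkt not in self.to_backend:
            rip, rport = self.backend
            lport = self.next_port
            self.next_port += 1
            self.to_backend[pkt] = Packet(self.local_ip, lport, rip, rport)
            # The reply arrives as real server -> local address and must
            # leave as VIP -> client, so store the reverse mapping now.
            reply_key = Packet(rip, rport, self.local_ip, lport)
            self.to_client[reply_key] = Packet(
                pkt.dst_ip, pkt.dst_port, pkt.src_ip, pkt.src_port
            )
        return self.to_backend[pkt]

    def from_backend(self, pkt: Packet) -> Packet:
        """Rewrite real server -> local address into VIP -> client."""
        return self.to_client[pkt]
```

Because the backend only ever sees the load balancer's local address as the source, its reply is routed back to the load balancer without any special configuration on the backend, which is the property the paragraph relies on.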

Key design goals include high reliability, high performance, ease of maintenance, and cost‑effectiveness. ContainerLB runs on standard x86_64 servers equipped with DPDK‑enabled NICs, which lets it process packets at close to line rate. The architecture separates a control core (handling configuration, ARP, and similar tasks) from multiple worker cores, each of which polls a dedicated NIC RX queue. RSS and Flow Director (FDIR) steer flows so that a client’s request and the corresponding response are processed on the same core.
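One way to get a request and its response onto the same core in FULLNAT mode is to give each worker its own range of local source ports and steer replies by destination port. The sketch below assumes that scheme; the article does not say which rule ContainerLB installs, and the constants, function names, and the CRC32 stand-in for the NIC's RSS hash are all hypothetical.

```python
import zlib

NUM_WORKERS = 4            # assumed number of worker cores
BASE_PORT = 10000          # assumed first local source port
PORTS_PER_WORKER = 10000   # assumed size of each worker's port range


def rss_worker(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> int:
    """Stand-in for NIC RSS: hash the client-side tuple to a worker index.

    Real NICs use a Toeplitz hash; CRC32 is used here only for simplicity.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % NUM_WORKERS


def local_port_for(worker: int, offset: int) -> int:
    """Pick a local source port inside the range owned by `worker`."""
    return BASE_PORT + worker * PORTS_PER_WORKER + (offset % PORTS_PER_WORKER)


def fdir_worker(dst_port: int) -> int:
    """Stand-in for a Flow Director rule: reply destination port -> worker."""
    return (dst_port - BASE_PORT) // PORTS_PER_WORKER
```

The worker that RSS selects for the request chooses a port from its own range, so the reply's destination port alone identifies that worker again, and the session never has to be shared between cores.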

DPDK features such as multi‑core programming, CPU affinity, hugepages, lock‑free queues, and poll‑mode drivers are exploited to minimize latency and maximize throughput. Each worker core keeps its own session table keyed by the five‑tuple, so session lookups never contend for a lock.
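The per‑core session table can be pictured as a plain dictionary that only its owning worker ever touches. This is a simplified sketch; the `Worker` class, the `pick_backend` callback, and the addresses are illustrative assumptions rather than ContainerLB's data structures.

```python
from typing import Callable, Dict, Tuple

# (protocol, src_ip, src_port, dst_ip, dst_port)
FiveTuple = Tuple[str, str, int, str, int]


class Worker:
    """One worker core with a private session table (hypothetical)."""

    def __init__(self, core_id: int):
        self.core_id = core_id
        # Only this worker reads or writes the table, so no lock is taken.
        self.sessions: Dict[FiveTuple, str] = {}

    def lookup_or_create(
        self, flow: FiveTuple, pick_backend: Callable[[FiveTuple], str]
    ) -> str:
        """Return the backend bound to `flow`, binding one on first sight."""
        backend = self.sessions.get(flow)
        if backend is None:
            backend = pick_backend(flow)
            self.sessions[flow] = backend
        return backend
```

The design only works if flow steering guarantees that every packet of a flow reaches the same worker; otherwise two cores would each create their own entry for the same session.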

Performance optimizations include prefetching with rte_prefetch0(), branch‑prediction hints using likely() and unlikely(), and minimizing lock usage: a single DPDK read‑write lock protects configuration updates.
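The read‑write lock pattern for configuration can be illustrated in plain Python. This is only an analogy for the idea: DPDK's lock is a spinlock used from C, whereas the hypothetical `ReadWriteLock` and `ConfigStore` classes below block on a condition variable. Many readers may hold the lock at once, and a writer waits until it has exclusive access.

```python
import threading


class ReadWriteLock:
    """Minimal readers-writer lock (illustrative, no writer priority)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class ConfigStore:
    """Configuration read often by workers, updated rarely by control."""

    def __init__(self, initial):
        self._lock = ReadWriteLock()
        self._data = dict(initial)

    def get(self, key):
        self._lock.acquire_read()
        try:
            return self._data[key]
        finally:
            self._lock.release_read()

    def update(self, changes):
        self._lock.acquire_write()
        try:
            self._data.update(changes)
        finally:
            self._lock.release_write()
```

Since configuration changes are rare compared with packet processing, readers almost never wait, which is why one such lock is cheap compared with locking per session.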

Benchmarking on an Intel Xeon E5‑2640 v3 with 10‑GbE NICs shows that ContainerLB achieves near‑line‑rate UDP forwarding and high HTTP throughput when paired with NGINX backends, while consuming 8 CPU cores and 4 GB of memory.
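To put "near line rate" in context, the theoretical packet rate of an Ethernet link follows from the frame size plus 20 bytes of per‑frame overhead on the wire (7‑byte preamble, 1‑byte start‑of‑frame delimiter, 12‑byte inter‑frame gap). The article does not state the frame size used in the benchmark, so the 64‑byte minimum frame below is an assumption chosen as the usual worst case, and `line_rate_pps` is a helper name invented for this sketch.

```python
WIRE_OVERHEAD_BYTES = 20  # preamble (7) + SFD (1) + inter-frame gap (12)


def line_rate_pps(link_bps: int, frame_bytes: int) -> int:
    """Maximum whole frames per second on a link of `link_bps` bits/s."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps // bits_per_frame


if __name__ == "__main__":
    # 10 GbE with minimum-size frames: about 14.88 million packets/s.
    print(line_rate_pps(10_000_000_000, 64))  # 14880952
```

Larger frames lower the packet rate the load balancer has to sustain, so small‑packet UDP forwarding is the more demanding test.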

In summary, ContainerLB provides a fast, flexible, and highly available L4 load‑balancing solution that integrates tightly with JDOS (JD’s software‑defined data‑center), supporting various scheduling algorithms (consistent hash, round‑robin, least‑connections) and meeting the demanding traffic patterns of large‑scale e‑commerce services.
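The three scheduling algorithms named above can each be sketched in a few lines. These are generic textbook versions under stated assumptions, not ContainerLB's code: the class names, the MD5‑based ring with 100 virtual nodes per backend, and the backend addresses used in examples are all hypothetical.

```python
import bisect
import hashlib
import itertools


def _hash(value: str) -> int:
    """64-bit hash taken from MD5 (any stable hash would do)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class ConsistentHash:
    """Hash ring: a key maps to the first backend point clockwise from it."""

    def __init__(self, backends, vnodes=100):
        self._ring = sorted(
            (_hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def pick(self, key: str) -> str:
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


class RoundRobin:
    """Hand out backends in a fixed rotating order."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(list(backends))

    def pick(self) -> str:
        return next(self._cycle)


class LeastConnections:
    """Pick the backend with the fewest active connections."""

    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self) -> str:
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend: str) -> None:
        self.active[backend] -= 1
```

Consistent hashing keeps a client bound to the same backend when other backends are added or removed, round‑robin spreads new connections evenly regardless of load, and least‑connections adapts when some connections last much longer than others.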

Tags: load balancing, Network, container, High Performance, DPDK, JDOS, FULLNAT