
High‑Performance Load Balancing Design and Implementation Using LVS and Tengine

This article reviews Alibaba Cloud's high‑performance load‑balancing solution, explaining the evolution from basic load‑balancing concepts to the architecture of LVS and Tengine, detailing their modes, optimizations, high‑availability designs across groups, AZs and regions, and outlining current use cases and future directions.

Architecture Digest

Load Balancing

Load balancing is a fundamental cloud‑computing component that distributes incoming traffic across multiple backend servers according to a scheduling algorithm. It comes in global (DNS‑based) and intra‑cluster forms, and in both hardware implementations such as F5 and software implementations such as LVS, Nginx, and HAProxy.
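As one concrete instance of a scheduling algorithm, here is a minimal sketch of smooth weighted round‑robin. The article does not say which algorithms Alibaba Cloud's balancers use, so this is only an illustration; the `Backend` class and `pick` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    name: str
    weight: int       # configured share of traffic
    current: int = 0  # running score used by the scheduler


def pick(backends: List[Backend]) -> str:
    """Smooth weighted round-robin: pick one backend per call."""
    total = sum(b.weight for b in backends)
    # Every backend earns its weight on each round.
    for b in backends:
        b.current += b.weight
    # The backend with the highest score wins (first one on ties)...
    best = max(backends, key=lambda b: b.current)
    # ...and pays back the total, so heavier backends are interleaved
    # with lighter ones instead of being picked in one long run.
    best.current -= total
    return best.name


pool = [Backend("a", 5), Backend("b", 1), Backend("c", 1)]
sequence = [pick(pool) for _ in range(7)]
```

Over one full cycle of 7 picks, `a` is chosen 5 times and `b` and `c` once each, with the lighter backends spread through the sequence rather than placed at the end.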

LVS

LVS originally supported three forwarding modes: DR, TUN, and NAT, each with its own IP‑address handling and deployment constraints. It is built on the Linux Netfilter framework, which originally lacked strong multi‑core support.
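The difference between the three modes is easiest to see in what the director changes on an inbound packet. The sketch below is a simplified model, not LVS code: `Packet`, `forward`, and all addresses are hypothetical, and layer‑2 next‑hop handling is ignored except in DR mode, where the MAC rewrite is the whole point.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_mac: str
    # (outer_src_ip, outer_dst_ip) when the packet is IP-in-IP encapsulated
    tunnel: Optional[Tuple[str, str]] = None


def forward(pkt: Packet, mode: str, director_ip: str,
            rs_ip: str, rs_mac: str) -> Packet:
    """Return the packet as the chosen real server (RS) would receive it."""
    if mode == "NAT":
        # Destination IP is rewritten to the RS; replies must route back
        # through the director so it can undo the translation.
        return replace(pkt, dst_ip=rs_ip)
    if mode == "DR":
        # IP header untouched; only the destination MAC changes, so the
        # director and the RS must share a layer-2 segment.
        return replace(pkt, dst_mac=rs_mac)
    if mode == "TUN":
        # Original packet is wrapped in an outer IP header addressed to
        # the RS, which must be able to decapsulate it.
        return replace(pkt, tunnel=(director_ip, rs_ip))
    raise ValueError(f"unknown mode: {mode}")


vip_packet = Packet("203.0.113.7", "192.0.2.10", "aa:aa:aa:aa:aa:aa")
nat = forward(vip_packet, "NAT", "10.0.0.1", "10.0.0.5", "bb:bb:bb:bb:bb:bb")
dr = forward(vip_packet, "DR", "10.0.0.1", "10.0.0.5", "bb:bb:bb:bb:bb:bb")
tun = forward(vip_packet, "TUN", "10.0.0.1", "10.0.0.5", "bb:bb:bb:bb:bb:bb")
```

In DR and TUN the real server still sees the VIP as the destination address, which is why it can reply to the client directly; in NAT it sees its own address and depends on the director for the return path.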

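Improvements include FullNAT (NAT mode with SNAT added, so both source and destination addresses are rewritten), parallel processing with RSS to bind flows to specific CPUs, fast‑path optimizations, instruction‑level enhancements, and NUMA‑aware memory locality. Together these achieve up to 40 Mpps (million packets per second) and 600 Kcps (thousand new connections per second) per node, with linear scalability across many cores.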
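Two of these ideas can be sketched in a few lines. The code below is an illustration under stated assumptions, not the LVS implementation: real RSS runs in the NIC and typically uses a Toeplitz hash, for which `zlib.crc32` is only a stand‑in, and the names `Flow`, `cpu_for_flow`, and `fullnat` are hypothetical.

```python
import zlib
from typing import Dict, List, NamedTuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str


def cpu_for_flow(flow: Flow, n_cpus: int) -> int:
    """Map a flow's 5-tuple to a CPU so all of its packets hit one core."""
    key = "|".join(map(str, flow)).encode()
    return zlib.crc32(key) % n_cpus


def fullnat(flow: Flow, local_ip: str, local_port: int,
            rs_ip: str, rs_port: int) -> Flow:
    """FullNAT: rewrite the source (SNAT) as well as the destination (DNAT).

    Because the real server sees the director's local address as the
    source, its reply returns to the director without the director
    having to be the server's default gateway.
    """
    return Flow(local_ip, local_port, rs_ip, rs_port, flow.proto)


N_CPUS = 8
# One connection table per CPU: a flow only ever touches its own table,
# so no lock is shared between cores.
tables: List[Dict[Flow, Flow]] = [dict() for _ in range(N_CPUS)]

client_flow = Flow("203.0.113.7", 40000, "192.0.2.10", 443, "tcp")
cpu = cpu_for_flow(client_flow, N_CPUS)
tables[cpu][client_flow] = fullnat(client_flow, "10.0.0.1", 50000,
                                   "10.0.0.5", 8443)
```

The design choice being illustrated is partitioning: once every packet of a flow is guaranteed to land on the same core, connection state can be kept per core, which is what makes throughput scale with core count.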

Tengine

Tengine handles layer‑7 traffic and faces performance challenges as CPU count grows; optimizations involve kernel‑level TCP stack tuning, the proprietary Alisocket (DPDK‑based) stack, hardware SSL offload, and web‑layer enhancements.

Elastic scaling is achieved by deploying Tengine instances in VMs and using health checks for failover. Advanced features include cookie‑based session persistence, URL routing, HTTP/2, and WebSocket, and a single VIP can sustain 100 K HTTPS QPS.
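How URL routing, cookie‑based persistence, and health‑check failover interact can be shown with a small model. This is a hedged sketch, not Tengine configuration or code: the `Pool` class, the `route` function, the `SERVERID` cookie name, and the backend names are all assumptions made for illustration.

```python
import itertools
from typing import Dict, List, Tuple

COOKIE = "SERVERID"  # hypothetical cookie name


class Pool:
    def __init__(self, backends: List[str]):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._rr = itertools.cycle(self.backends)

    def mark(self, backend: str, ok: bool) -> None:
        """Record the result of a health check."""
        if ok:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def next_healthy(self) -> str:
        """Round-robin over the pool, skipping unhealthy backends."""
        for _ in range(len(self.backends)):
            candidate = next(self._rr)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend")


# URL routing: first matching prefix wins, so list specific prefixes first.
api_pool = Pool(["api-1", "api-2"])
web_pool = Pool(["web-1", "web-2"])
ROUTES = [("/api/", api_pool), ("/", web_pool)]


def route(path: str, cookies: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (backend, cookies to send back to the client)."""
    pool = next(p for prefix, p in ROUTES if path.startswith(prefix))
    sticky = cookies.get(COOKIE)
    if sticky in pool.healthy:
        return sticky, cookies          # honor the existing session
    chosen = pool.next_healthy()        # new session, or failover
    return chosen, {**cookies, COOKIE: chosen}


first, jar = route("/api/orders", {})
second, _ = route("/api/orders/42", jar)
api_pool.mark(first, ok=False)          # health check fails
third, new_jar = route("/api/orders/42", jar)
```

Here `second` equals `first` because the cookie pins the session, while `third` moves to the other API backend once the pinned one is marked unhealthy, and the cookie is reissued to match.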

High Availability

Group architecture provides full‑mesh redundancy with dual‑homed servers, multi‑region clusters, and automatic failover for servers, NICs, switches, and routing, delivering up to 640 Gbps aggregate throughput and seamless, user‑transparent upgrades.

AZ design duplicates routers across availability zones, enabling sub‑second failover without session sync, while Region design uses DNS‑based multi‑region VIPs and health‑checked LVS/Tengine instances to maintain service continuity across data centers.
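The Region design can be pictured as a DNS answer that only ever contains VIPs from regions that currently pass health checks. The article does not describe the actual answer policy, so the following is a sketch under assumptions: the region names, the documentation‑range addresses, and the "prefer the client's own region" rule are all hypothetical.

```python
from typing import Dict, List

# Hypothetical VIP per region (documentation address ranges).
REGION_VIPS: Dict[str, str] = {
    "region-a": "192.0.2.10",
    "region-b": "198.51.100.10",
}


def resolve(client_region: str, healthy: Dict[str, bool]) -> List[str]:
    """Return VIPs of healthy regions, the client's own region first."""
    candidates = [r for r in REGION_VIPS if healthy.get(r, False)]
    # Stable sort: the client's region (key False) moves to the front,
    # the remaining regions keep their configured order.
    candidates.sort(key=lambda r: r != client_region)
    return [REGION_VIPS[r] for r in candidates]


all_up = resolve("region-b", {"region-a": True, "region-b": True})
b_down = resolve("region-b", {"region-a": True, "region-b": False})
```

With both regions healthy a client in `region-b` is pointed at its local VIP first; when `region-b` fails its health check, the answer contains only the `region-a` VIP, so traffic shifts to the other data center.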

Summary

The high‑performance load‑balancing solution powers public‑cloud front‑ends for e‑commerce, finance, and government, supports internal Alibaba Cloud services (RDS, OSS, DDoS protection), and serves as the traffic entry for platforms like Taobao and Alipay. Future work focuses on greater elasticity, higher single‑node capacity, proactive VIP probing, and end‑to‑end network monitoring.

Tags: network architecture, high availability, load balancing, LVS, Tengine