High‑Performance Load Balancing Design and Implementation Using LVS and Tengine

This article reviews Alibaba Cloud's high‑performance load‑balancing solution, explaining the evolution from basic load‑balancing concepts to the architecture of LVS and Tengine, detailing their modes, optimizations, high‑availability designs across groups, AZs and regions, and outlining current use cases and future directions.

Architecture Digest

Oct 21, 2017

Load Balancing

Load balancing is a fundamental cloud‑computing component that distributes incoming traffic across multiple backend servers according to a scheduling algorithm. It operates at two levels, global (DNS‑based) and intra‑cluster, and is available as hardware (for example F5) or as software (LVS, Nginx, HAProxy).
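As an illustration of what a scheduling algorithm can look like, here is a minimal sketch of smooth weighted round‑robin, a scheme commonly used by software load balancers. The article does not say which algorithms the Alibaba Cloud product uses; the class name and backend names are hypothetical.

```python
class SmoothWeightedRoundRobin:
    """Illustrative scheduler: spreads picks in proportion to static weights."""

    def __init__(self, weights):
        # weights: mapping of backend name -> positive integer weight
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("every backend needs a positive weight")
        self.weights = dict(weights)
        self.current = {name: 0 for name in self.weights}
        self.total = sum(self.weights.values())

    def pick(self):
        # 1. Every backend gains its own weight.
        for name, weight in self.weights.items():
            self.current[name] += weight
        # 2. The backend with the highest running score wins
        #    (ties go to the backend listed first).
        chosen = max(self.current, key=self.current.get)
        # 3. The winner pays back the total weight, which prevents bursts.
        self.current[chosen] -= self.total
        return chosen


if __name__ == "__main__":
    balancer = SmoothWeightedRoundRobin({"a": 5, "b": 1, "c": 1})
    print([balancer.pick() for _ in range(7)])
```

Over any window of seven picks, backend `a` is chosen five times and `b` and `c` once each, with the heavier backend interleaved rather than hit five times in a row.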

LVS

LVS originally supported three forwarding modes (DR, TUN, and NAT), each with its own IP‑address handling and deployment constraints. It is built on the Linux Netfilter framework, which originally lacked strong multi‑core support.
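The difference between the three modes is easiest to see in what the director changes on an inbound packet. The following is a simplified model, not LVS code: the `Packet` type, the function names, and the addresses are all hypothetical, and real packets carry many more fields.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    dst_mac: str
    inner: Optional["Packet"] = None  # set only for tunnelled packets


def forward_nat(pkt, real_ip, real_mac):
    # NAT: rewrite the destination IP from the VIP to the real server.
    # Replies must return through the director so it can reverse the rewrite.
    return replace(pkt, dst_ip=real_ip, dst_mac=real_mac)


def forward_dr(pkt, real_mac):
    # DR (direct routing): only the destination MAC changes, so director and
    # real server must share a layer-2 segment; the real server also holds
    # the VIP and replies to the client directly.
    return replace(pkt, dst_mac=real_mac)


def forward_tun(pkt, director_ip, real_ip, next_hop_mac):
    # TUN: wrap the untouched packet in an outer IP header addressed to the
    # real server (IP-in-IP); the real server decapsulates and replies directly.
    return Packet(src_ip=director_ip, dst_ip=real_ip,
                  dst_mac=next_hop_mac, inner=pkt)


if __name__ == "__main__":
    client = Packet("198.51.100.7", "203.0.113.10", "mac-director")
    print(forward_nat(client, "192.0.2.11", "mac-rs1"))
    print(forward_dr(client, "mac-rs1"))
    print(forward_tun(client, "192.0.2.1", "192.0.2.11", "mac-gateway"))
```

In all three modes the client source address is preserved on the way in, which is why each mode places requirements on how the real servers are addressed or routed.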

Improvements include FullNAT (which adds source NAT on top of destination NAT), parallel processing that uses RSS to bind each flow to a specific CPU, fast‑path optimizations, instruction‑level enhancements, and NUMA‑aware memory locality. Together these achieve up to 40 Mpps (million packets per second) and 600 Kcps (thousand new connections per second) per node, with linear scalability across many cores.
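The idea behind binding flows to CPUs can be sketched as follows: hash the flow's 5‑tuple, use the hash to choose a CPU, and keep a separate session table per CPU so that cores never contend for the same entry. This is only a model under stated assumptions: real NICs compute RSS in hardware (typically with a Toeplitz hash and an indirection table), whereas this sketch substitutes CRC32, and all names are hypothetical.

```python
import zlib


def flow_to_cpu(src_ip, src_port, dst_ip, dst_port, proto, n_cpus):
    # Stand-in for the NIC's RSS hash: any deterministic hash of the
    # 5-tuple sends every packet of one flow to the same CPU.
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}/{proto}".encode()
    return zlib.crc32(key) % n_cpus


class PerCpuSessions:
    """One session table per CPU, so lookups need no cross-core locking."""

    def __init__(self, n_cpus):
        self.n_cpus = n_cpus
        self.tables = [{} for _ in range(n_cpus)]

    def lookup_or_create(self, flow, pick_backend):
        # flow is a (src_ip, src_port, dst_ip, dst_port, proto) tuple.
        cpu = flow_to_cpu(*flow, self.n_cpus)
        table = self.tables[cpu]
        if flow not in table:
            # First packet of the flow: schedule a backend once.
            table[flow] = pick_backend()
        return cpu, table[flow]


if __name__ == "__main__":
    sessions = PerCpuSessions(n_cpus=8)
    flow = ("198.51.100.7", 40000, "203.0.113.10", 443, "tcp")
    print(sessions.lookup_or_create(flow, lambda: "rs-1"))
    print(sessions.lookup_or_create(flow, lambda: "rs-2"))  # same CPU, still rs-1
```

The sketch covers only the client‑to‑VIP direction. With FullNAT the return traffic carries a different 5‑tuple, so a real implementation also has to make sure replies land on the CPU that owns the session; that part is not modelled here.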

Tengine

Tengine handles layer‑7 traffic and runs into performance bottlenecks as the CPU count grows. Optimizations include kernel‑level TCP stack tuning, the proprietary DPDK‑based Alisocket stack, hardware SSL offload, and web‑layer enhancements.

Elastic scaling is achieved by deploying Tengine instances in VMs and using health checks for failover. Advanced features include cookie‑based session persistence, URL routing, HTTP/2, and WebSocket, and a single VIP can sustain 100 K HTTPS QPS.
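To show how URL routing, cookie‑based persistence, and health checks interact, here is a small sketch of the decision a layer‑7 proxy makes per request. It is not Tengine's configuration or API: the `L7Router` class, the `SERVERID` cookie name, and the backend names are hypothetical.

```python
import itertools


class L7Router:
    """Illustrative layer-7 routing decision with sticky cookies."""

    def __init__(self, routes, cookie_name="SERVERID"):
        # routes: list of (url_prefix, [backend, ...]); longest prefix wins.
        self.routes = sorted(routes, key=lambda r: len(r[0]), reverse=True)
        self.cookie_name = cookie_name
        self.healthy = {b for _, pool in routes for b in pool}
        self._counters = {prefix: itertools.count() for prefix, _ in routes}

    def mark(self, backend, healthy):
        # Called by a health checker when a probe succeeds or fails.
        (self.healthy.add if healthy else self.healthy.discard)(backend)

    def route(self, path, cookies):
        """Return (backend, cookies_to_set) for one request."""
        for prefix, pool in self.routes:
            if not path.startswith(prefix):
                continue
            live = [b for b in pool if b in self.healthy]
            if not live:
                raise LookupError(f"no healthy backend for {prefix}")
            pinned = cookies.get(self.cookie_name)
            if pinned in live:
                return pinned, {}  # honour the existing session
            # No usable cookie: round-robin, then pin the client.
            backend = live[next(self._counters[prefix]) % len(live)]
            return backend, {self.cookie_name: backend}
        raise LookupError(f"no route for {path}")


if __name__ == "__main__":
    router = L7Router([("/", ["web-1"]), ("/api/", ["api-1", "api-2"])])
    print(router.route("/api/orders", {}))
    print(router.route("/api/orders", {"SERVERID": "api-1"}))
    router.mark("api-1", healthy=False)
    print(router.route("/api/orders", {"SERVERID": "api-1"}))
```

When a pinned backend fails its health check, the client is moved to a live backend and issued a new cookie, which is the failover behaviour the paragraph above describes in outline.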

High Availability

The group architecture provides full‑mesh redundancy with dual‑homed servers and multi‑region clusters, and fails over automatically when a server, NIC, switch, or route goes down. It delivers up to 640 Gbps of aggregate throughput and supports seamless, user‑transparent upgrades.

The AZ design duplicates routers across availability zones, enabling sub‑second failover without session synchronization. The region design uses DNS‑based multi‑region VIPs and health‑checked LVS/Tengine instances to maintain service continuity across data centers.

Summary

The high‑performance load‑balancing solution powers public‑cloud front‑ends for e‑commerce, finance, and government, supports internal Alibaba Cloud services (RDS, OSS, DDoS protection), and serves as the traffic entry for platforms like Taobao and Alipay. Future work focuses on greater elasticity, higher single‑node capacity, proactive VIP probing, and end‑to‑end network monitoring.

