Designing Scalable Large-Scale Internet Applications: Stateless Sessions, Caching, Service Splitting, Database Sharding, Asynchronous Communication, and Configuration Management
The article explains how to build a highly scalable internet application by adopting stateless session handling, effective caching, service decomposition with remote call frameworks, database sharding, asynchronous messaging, unstructured data storage, comprehensive monitoring, and unified configuration management.
Stateless design is essential for horizontal scalability. The article describes how Taobao's session framework keeps session state in client cookies using a multi‑value cookie approach, so no server‑side session storage is needed, any node can serve any request, and new nodes can be added easily.
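A minimal sketch of the cookie-based session idea, not Taobao's actual framework: several session values are packed into one cookie string so that no node has to remember anything between requests. The HMAC signature, the `SECRET_KEY` value, and the function names `encode_session` and `decode_session` are assumptions added for illustration; the article does not say how the cookie is protected.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical key for the sketch; a real deployment would load it from secure configuration.
SECRET_KEY = b"demo-secret"


def _sign(payload: str) -> str:
    """Return a hex HMAC-SHA256 signature for the encoded payload."""
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def encode_session(values: dict) -> str:
    """Pack several session values into one signed cookie string."""
    raw = json.dumps(values, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{payload}.{_sign(payload)}"


def decode_session(cookie: str) -> Optional[dict]:
    """Return the session values, or None if the cookie is malformed or was altered."""
    payload, separator, signature = cookie.rpartition(".")
    if not separator:
        return None
    # Compare as bytes so arbitrary client input cannot raise a TypeError.
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(payload).encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode("ascii")))
```

Because every request carries its own state, any node holding the same key can call `decode_session` and serve the request, which is what makes adding nodes straightforward. The trade-off is that cookie size limits how much state can be carried this way.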
Effective caching, illustrated with Taobao's Tair, uses both read and write caches to reduce database load; for example, shop pages and product details are served from the cache instead of being queried from the database on every request.
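The read path can be sketched as a cache-aside lookup. This is an illustration under assumptions, not Tair's API: `TTLCache` is an in-memory stand-in for a distributed cache, a plain dict stands in for the database, and `ProductStore` and its method names are hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for a distributed cache with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (value, self._clock() + self._ttl)

    def delete(self, key):
        self._entries.pop(key, None)


class ProductStore:
    """Cache-aside access to product details."""

    def __init__(self, database, cache):
        self._db = database  # a dict standing in for the real database
        self._cache = cache
        self.db_reads = 0    # counts how often the database is actually hit

    def get_product(self, product_id):
        cached = self._cache.get(product_id)
        if cached is not None:
            return cached
        self.db_reads += 1
        product = self._db.get(product_id)
        if product is not None:
            self._cache.set(product_id, product)
        return product

    def update_product(self, product_id, product):
        # Write to the database, then invalidate so the next read refills the cache.
        self._db[product_id] = product
        self._cache.delete(product_id)
```

Repeated calls to `get_product` for the same id reach the database only once per TTL window, which is the load reduction the paragraph above describes.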
Service decomposition (HSF) separates a monolithic system into functional sub‑systems, improving maintainability, horizontal scaling, and fault isolation, while also introducing challenges such as inter‑service communication and dependency management.
Database splitting (TDDL) progresses from a single DB to master/slave replication, vertical partitioning (separate databases per domain), and finally horizontal sharding, addressing read/write pressure and large table size issues.
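The final step, horizontal sharding, comes down to a routing rule that maps a row key to a physical database and table. The sketch below uses a simple modulo rule; the database and table naming scheme, the shard counts, and the function name `locate_shard` are assumptions for illustration and are not taken from TDDL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardLocation:
    database: str
    table: str


def locate_shard(user_id: int, db_count: int = 4, tables_per_db: int = 8) -> ShardLocation:
    """Map a user id to a physical database and table using a modulo rule."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    total_tables = db_count * tables_per_db
    slot = user_id % total_tables        # logical table number across all databases
    db_index = slot // tables_per_db     # consecutive slots share a database
    return ShardLocation(database=f"user_db_{db_index}", table=f"user_{slot:04d}")
```

With the default counts, `locate_shard(33)` falls into slot 1 and therefore resolves to `user_db_0` and `user_0001`. A modulo rule keeps routing cheap, but changing the shard count later requires moving data, which is one reason such rules are usually hidden behind a data-access layer.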
Asynchronous communication (Notify) decouples services via message middleware, enhancing system scalability, availability, and response time compared to synchronous calls.
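To illustrate the decoupling, here is a small in-process sketch in which a publisher hands a message to a queue and returns immediately while a worker thread delivers it to subscribers. It is a stand-in for message middleware, not Notify itself: the `MessageBus` class, its method names, and the topic string used in the example are hypothetical, and a real broker would also persist messages and retry failed deliveries.

```python
import queue
import threading


class MessageBus:
    """In-process stand-in for message middleware."""

    def __init__(self):
        self._queue = queue.Queue()
        self._handlers = {}
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def subscribe(self, topic, handler):
        """Register a handler; call this before publishing to the topic."""
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        """Enqueue the message and return without waiting for consumers."""
        self._queue.put((topic, message))

    def wait_until_idle(self):
        """Block until every published message has been handled."""
        self._queue.join()

    def _run(self):
        while True:
            topic, message = self._queue.get()
            try:
                for handler in list(self._handlers.get(topic, [])):
                    try:
                        handler(message)
                    except Exception:
                        # A failing consumer must not stop delivery to the others.
                        pass
            finally:
                self._queue.task_done()
```

For example, an order service could call `bus.publish("order.created", {"order_id": 1})` and respond to the user right away, while inventory and notification handlers subscribed to that topic run afterwards on the worker thread.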
Unstructured data storage combines a distributed file system (TFS) for small binary objects with NoSQL stores (Cassandra, HBase, Bigtable) for key‑value data, accepting BASE (basically available, soft state, eventually consistent) semantics in order to favor availability.
Monitoring and alerting systems track both coarse‑grained metrics (CPU, memory, traffic) and fine‑grained usage (page PV, bandwidth), enabling automatic alerts for abnormal conditions.
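The alerting step can be sketched as a comparison of sampled metrics against configured limits. The metric names, the threshold values in the usage note, and the function name `find_alerts` are assumptions for illustration; the article does not describe the monitoring system's rules.

```python
from typing import Dict, List


def find_alerts(samples: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return one alert line for every sampled metric that exceeds its limit."""
    alerts = []
    for metric, value in samples.items():
        limit = thresholds.get(metric)
        # Metrics without a configured limit are recorded but never alert.
        if limit is not None and value > limit:
            alerts.append(f"{metric}={value} exceeds limit {limit}")
    return alerts
```

Calling `find_alerts({"cpu_percent": 95.0, "page_pv": 1200.0}, {"cpu_percent": 90.0})` yields a single alert for `cpu_percent`, while `page_pv` is ignored because no limit is configured for it.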
Unified configuration management ensures consistent settings across many nodes, simplifying node addition/removal and reducing configuration errors.
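One way to picture unified configuration is a central store that pushes each setting to every node watching it, so a node added later picks up the current value on registration. This is a sketch under assumptions: `ConfigCenter`, `AppNode`, and the `db.url` key are hypothetical names, and the store lives in memory rather than in a real configuration service.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[Any], None]


class ConfigCenter:
    """In-memory stand-in for a central configuration service."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._listeners: Dict[str, List[Listener]] = {}

    def watch(self, key: str, listener: Listener) -> None:
        """Register a listener and deliver the current value, if one exists."""
        self._listeners.setdefault(key, []).append(listener)
        if key in self._values:
            listener(self._values[key])

    def publish(self, key: str, value: Any) -> None:
        """Store a new value and push it to every watching node."""
        self._values[key] = value
        for listener in list(self._listeners.get(key, [])):
            listener(value)


class AppNode:
    """An application node that takes its database URL from the config center."""

    def __init__(self, name: str, config: ConfigCenter) -> None:
        self.name = name
        self.db_url = None
        config.watch("db.url", self._on_db_url)

    def _on_db_url(self, value: Any) -> None:
        self.db_url = value
```

Since nodes never hold hand-edited copies of the setting, a single `publish` call updates all of them, and a newly started `AppNode` is consistent with the rest as soon as it registers.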
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.