
Designing Scalable Large-Scale Internet Applications: Stateless Sessions, Caching, Service Splitting, Database Sharding, Asynchronous Communication, and Configuration Management

The article explains how to build a highly scalable internet application by adopting stateless session handling, effective caching, service decomposition with remote call frameworks, database sharding, asynchronous messaging, unstructured data storage, comprehensive monitoring, and unified configuration management.

Architecture Digest

Stateless design is essential for horizontal scalability. Taobao's session framework keeps session state in client cookies, using a multi‑value cookie approach, so no server has to hold session data and new nodes can be added easily.
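As a minimal sketch of the idea (not Taobao's actual framework), the example below packs several session values into one signed cookie value, so any node that shares the secret can verify and read it without server-side storage. The names `SECRET_KEY`, `encode_session`, and `decode_session` are hypothetical, and the payload is signed but not encrypted, so it should not carry secrets.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared secret; every application node would load the same value.
SECRET_KEY = b"demo-secret-change-me"


def _sign(payload: str) -> str:
    """Return a hex HMAC-SHA256 signature for the payload."""
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def encode_session(state: dict) -> str:
    """Serialize several session values into one signed cookie value."""
    raw = json.dumps(state, sort_keys=True).encode("utf-8")
    payload = base64.urlsafe_b64encode(raw).decode("ascii")
    return f"{payload}.{_sign(payload)}"


def decode_session(cookie_value: str):
    """Return the session dict, or None if the cookie is malformed or tampered with."""
    payload, sep, signature = cookie_value.rpartition(".")
    if not sep:
        return None
    # Compare as bytes in constant time to avoid leaking signature prefixes.
    if not hmac.compare_digest(signature.encode("utf-8"), _sign(payload).encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload.encode("ascii")))
```

Because the state travels with each request, a load balancer can send the request to any node. The trade-off is that cookies are size-limited and sent on every request, so only small values belong there.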

Effective caching, illustrated with Taobao's Tair, reduces database load through both read and write caches; shop pages and product details are typical examples of cached content.
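The read side of this can be sketched as a cache-aside lookup with a time-to-live. This is an illustration under assumptions, not Tair's API: an in-process dictionary stands in for the cache cluster, and `ReadCache` and the `loader` callback are hypothetical names.

```python
import time


class ReadCache:
    """Cache-aside read cache: serve from memory, fall back to the loader on a miss."""

    def __init__(self, loader, ttl_seconds=60.0, clock=time.monotonic):
        self._loader = loader        # hypothetical function that queries the database
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # cache hit: the database is not touched
        value = self._loader(key)    # cache miss or expired entry: load and store
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying row is updated."""
        self._entries.pop(key, None)
```

Repeated requests for the same product detail then reach the database at most once per TTL window, or once per explicit invalidation.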

Service decomposition (HSF) separates a monolithic system into functional sub‑systems, improving maintainability, horizontal scaling, and fault isolation, while also introducing challenges such as inter‑service communication and dependency management.

Database splitting (TDDL) progresses from a single DB to master/slave replication, vertical partitioning (separate databases per domain), and finally horizontal sharding, addressing read/write pressure and large table size issues.
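The last two stages can be sketched as a small routing layer. The example is a hedged illustration, not TDDL's implementation: it assumes hash-modulo sharding on a single key and round-robin reads over replicas, and `ShardRouter` and all database names are hypothetical.

```python
import itertools
import zlib


class ShardRouter:
    """Maps a sharding key to a database name; writes go to the master, reads to replicas."""

    def __init__(self, shards):
        # Each shard is {"master": str, "replicas": [str, ...]}.
        self._shards = shards
        # One round-robin iterator per shard; falls back to the master if there are no replicas.
        self._readers = [
            itertools.cycle(shard["replicas"] or [shard["master"]]) for shard in shards
        ]

    def shard_index(self, key) -> int:
        # crc32 is stable across processes, unlike Python's built-in hash() for strings.
        return zlib.crc32(str(key).encode("utf-8")) % len(self._shards)

    def route(self, key, write: bool) -> str:
        index = self.shard_index(key)
        if write:
            return self._shards[index]["master"]
        return next(self._readers[index])
```

With plain modulo routing, changing the number of shards remaps most keys, so a production system would need a data migration plan or a different mapping scheme before growing the shard count.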

Asynchronous communication (Notify) decouples services via message middleware: the sender publishes a message and continues without waiting for consumers, which improves scalability, availability, and response time compared to synchronous calls.
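The decoupling can be sketched with an in-process queue and a worker thread. This is only a stand-in for real message middleware, not Notify's API; `MessageBus`, its methods, and the topic name in the usage note are hypothetical.

```python
import queue
import threading


class MessageBus:
    """Tiny in-process publish/subscribe bus backed by one worker thread."""

    def __init__(self):
        self._queue = queue.Queue()
        self._handlers = {}  # topic -> list of callables
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def subscribe(self, topic, handler):
        """Register a handler; call this before publishing to the topic."""
        self._handlers.setdefault(topic, []).append(handler)

    def publish(self, topic, message):
        """Enqueue the message and return immediately, without waiting for handlers."""
        self._queue.put((topic, message))

    def drain(self):
        """Block until every queued message has been processed (useful in tests)."""
        self._queue.join()

    def _run(self):
        while True:
            topic, message = self._queue.get()
            try:
                for handler in self._handlers.get(topic, []):
                    handler(message)
            except Exception:
                # A real broker would retry or dead-letter; this sketch drops the message.
                pass
            finally:
                self._queue.task_done()
```

An order service could call `publish("order.created", {...})` and respond to the user at once, while inventory or notification handlers process the message later and can fail without failing the order request.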

Unstructured data storage combines a distributed file system (TFS) for small binary objects with NoSQL solutions (Cassandra, HBase, Bigtable) for key‑value data. These stores follow the BASE model (basically available, soft state, eventually consistent), relaxing strong consistency in favor of availability.

Monitoring and alerting systems track both coarse‑grained metrics (CPU, memory, traffic) and fine‑grained usage (page PV, bandwidth), enabling automatic alerts for abnormal conditions.
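A threshold check is the simplest form of such alerting. The sketch below assumes metrics arrive as a plain dictionary per sample; `Threshold`, `check_metrics`, and the metric names are hypothetical, and real systems usually add time windows and deduplication on top.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str       # metric name, e.g. "cpu_percent" (hypothetical)
    max_value: float  # alert when the sampled value exceeds this


def check_metrics(sample: dict, thresholds: list) -> list:
    """Return one alert message for each threshold that the sample exceeds."""
    alerts = []
    for threshold in thresholds:
        value = sample.get(threshold.metric)
        # Metrics missing from the sample are skipped rather than treated as zero.
        if value is not None and value > threshold.max_value:
            alerts.append(f"{threshold.metric}={value} exceeds {threshold.max_value}")
    return alerts
```

The same function works for coarse-grained metrics such as CPU usage and for fine-grained ones such as page views per page, as long as each is reported under its own metric name.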

Unified configuration management ensures consistent settings across many nodes, simplifying node addition/removal and reducing configuration errors.
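One way to picture this is a central store that pushes changes to every registered node. The sketch is in-process and illustrative only; `ConfigStore`, `Node`, and the setting name are hypothetical, and a real deployment would put the store behind a network service.

```python
class ConfigStore:
    """Central source of truth for settings; notifies watchers on every change."""

    def __init__(self, initial=None):
        self._values = dict(initial or {})
        self._version = 0
        self._listeners = []

    @property
    def version(self) -> int:
        return self._version

    def snapshot(self) -> dict:
        """Return a copy of all current settings, used when a node joins."""
        return dict(self._values)

    def watch(self, listener):
        """Register a callback invoked as listener(key, value, version)."""
        self._listeners.append(listener)

    def set(self, key, value):
        self._values[key] = value
        self._version += 1
        for listener in self._listeners:
            listener(key, value, self._version)


class Node:
    """Application node that mirrors the central settings locally."""

    def __init__(self, name, store):
        self.name = name
        self.settings = store.snapshot()  # a new node starts with the full current config
        store.watch(self._on_change)

    def _on_change(self, key, value, version):
        self.settings[key] = value
```

Adding a node then requires no per-node configuration file: it takes a snapshot on startup and receives later changes through the same channel as every other node.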

Tags: scalability, configuration management, caching, database sharding, session management, asynchronous communication, service splitting