
Designing High‑Concurrency Backend Architecture for E‑commerce Platforms

This article explains how to design a scalable, highly available backend capable of absorbing tens of millions of requests per day. It defines key performance metrics, estimates traffic with the 2/8 rule, and applies architectural patterns such as load-balanced clusters, vertical service splitting, distributed caching, and database master-slave replication, closing with a Taobao case study.

IT Architects Alliance

Key Metrics

Before building a system that must handle tens of millions of requests per day, it is essential to understand performance indicators such as QPS (queries per second), TPS (transactions per second), concurrent user count, and response time; together, these guide architectural decisions.

(1) QPS, TPS, Concurrency and Response Time

QPS measures how many queries a system can answer each second, while TPS counts complete transactions (e.g., an order placement). Concurrency is the number of users simultaneously using the system, and response time is the latency perceived by the user. Higher QPS/TPS and concurrency increase resource demand, while lower response time improves user experience; balancing these factors is a core challenge.
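The relationship between these metrics can be made concrete with Little's Law (average concurrency = arrival rate × average response time). The following is a minimal sketch; the QPS, latency, and per-server capacity figures are illustrative assumptions, not numbers from the article.

```python
import math

def concurrent_requests(qps: float, avg_response_s: float) -> float:
    """Average number of requests in flight (Little's Law: L = lambda * W)."""
    return qps * avg_response_s

def servers_needed(qps: float, avg_response_s: float, slots_per_server: int) -> int:
    """Servers required if each server can hold `slots_per_server` in-flight requests."""
    return math.ceil(concurrent_requests(qps, avg_response_s) / slots_per_server)

# Assumed load: 8,000 QPS at 200 ms average latency -> 1,600 requests in flight.
in_flight = concurrent_requests(8000, 0.2)
# With an assumed ~200 concurrent slots per server, that is 8 servers.
servers = servers_needed(8000, 0.2, 200)
```

Note how cutting average response time in half would halve the in-flight count, which is why the caching and read-scaling techniques later in the article matter as much as raw server count.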

(2) Estimating Traffic with the 2/8 Rule

Using the Pareto 2/8 rule, assume 20% of users are active daily. For a site with 10 million users, that yields 2 million active users. If each performs about 30 actions per visit, total daily page views reach 60 million. Concentrating 80% of those views in the busiest 20% of the day (≈5 hours) gives an average of ≈2,667 requests per second, and peak traffic can run 2–3 times higher, reaching roughly 8,000 RPS.
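The estimate above can be reproduced as a short back-of-envelope calculation. Inputs mirror the article's example (10 million users, 30 actions per visit, busy window rounded to 5 hours).

```python
TOTAL_USERS = 10_000_000
ACTIVE_RATIO = 0.2          # 2/8 rule: 20% of users are active each day
ACTIONS_PER_VISIT = 30
PEAK_VIEW_RATIO = 0.8       # 80% of views land...
BUSY_WINDOW_S = 5 * 3600    # ...in the busiest ~20% of the day (~5 hours)
PEAK_MULTIPLIER = 3         # peak can be 2-3x the busy-window average

daily_views = TOTAL_USERS * ACTIVE_RATIO * ACTIONS_PER_VISIT   # 60 million/day
avg_busy_rps = daily_views * PEAK_VIEW_RATIO / BUSY_WINDOW_S   # ~2,667 RPS
peak_rps = avg_busy_rps * PEAK_MULTIPLIER                      # ~8,000 RPS
```

Capacity planning should target the peak figure, not the daily average, since the whole point of the 2/8 analysis is that traffic is heavily skewed in time.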

Architecture Design Steps

(1) From Monolith to High‑Availability Architecture

Initially, a single‑instance monolith is simple to develop and deploy, but it becomes a single point of failure as traffic grows. Introducing a cluster of application servers behind a load balancer distributes requests and provides redundancy. Adding a master‑slave database setup separates write and read paths, improving resilience and scalability.
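The core idea of the load balancer can be sketched as round-robin distribution across an application cluster. This is a minimal illustration, not a production balancer; the backend names are placeholders.

```python
from itertools import cycle

class RoundRobinBalancer:
    """Cycles through backends so no single app server is a point of failure."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._ring = cycle(self.backends)

    def pick(self) -> str:
        # Each incoming request is handed to the next server in the ring.
        return next(self._ring)

lb = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [lb.pick() for _ in range(6)]
```

A real balancer would add health checks and remove failed backends from the ring; the redundancy the article describes comes from the remaining servers absorbing that traffic.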

(2) Vertical Service Splitting

To reduce code coupling and improve maintainability, split the monolith into independent services (e.g., user, product, order, payment). Each service can be developed, deployed, and scaled independently, much like separate shops in a mall, while communication between services occurs via well‑defined APIs.
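A toy sketch of that boundary: each domain lives behind its own narrow interface, and other services depend only on that interface, never on internal storage. Class and method names here are illustrative assumptions.

```python
class UserService:
    """Owns user data; exposes only a narrow public API."""

    def __init__(self):
        self._users = {1: "alice"}   # private to this service

    def get_username(self, user_id: int) -> str:
        return self._users[user_id]

class OrderService:
    """Depends on UserService's API, so either side can be redeployed alone."""

    def __init__(self, users: UserService):
        self._users = users
        self._orders = []

    def place_order(self, user_id: int, product: str) -> dict:
        order = {"user": self._users.get_username(user_id), "product": product}
        self._orders.append(order)
        return order

orders = OrderService(UserService())
placed = orders.place_order(1, "phone")
```

In a real split, the in-process call would become a network call (HTTP or RPC), but the dependency rule stays the same: only the published API crosses the boundary.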

(3) Distributed Cache for Read Pressure

High read traffic can overwhelm the database. Introducing a distributed cache such as a Redis cluster keeps the hot 20% of data, which serves roughly 80% of requests, in memory, cutting database reads by over 80% and reducing response times dramatically.
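The read path this describes is the cache-aside pattern: check the cache first, fall back to the database on a miss, then populate the cache. In this sketch a dict stands in for Redis and `slow_db_read` is a hypothetical stand-in for a real database query.

```python
cache = {}
db = {"product:42": {"name": "keyboard", "price": 59}}
db_reads = 0

def slow_db_read(key):
    """Stand-in for an expensive database query; counts each hit."""
    global db_reads
    db_reads += 1
    return db[key]

def get(key):
    if key in cache:            # cache hit: no database load at all
        return cache[key]
    value = slow_db_read(key)   # cache miss: exactly one database read...
    cache[key] = value          # ...then populate the cache for next time
    return value

first = get("product:42")   # miss -> goes to the database
second = get("product:42")  # hit  -> served from memory
```

Repeated reads of hot keys now cost one database round trip total, which is where the claimed 80%+ reduction in database reads comes from. A production version would also add expiry and invalidation on writes.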

(4) Database Master‑Slave Architecture and Read‑Write Separation

Writes are directed to the master database, while reads are served by multiple replica slaves. Replication is performed via binary log shipping, and a load balancer distributes read queries among the slaves, achieving 3–5× read throughput compared to a single instance.
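The routing rule can be sketched as a small data-source wrapper: writes always go to the master, reads are round-robined across replicas. The connection names are placeholders, and the SQL-prefix check is a deliberately naive stand-in for real statement routing.

```python
from itertools import cycle

class RoutingDataSource:
    """Sends writes to the master, spreads reads across replicas."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = cycle(replicas)

    def route(self, sql: str) -> str:
        # Naive rule: only SELECT statements may be served by a replica;
        # everything else (INSERT/UPDATE/DELETE) must hit the master.
        if sql.lstrip().upper().startswith("SELECT"):
            return next(self._replicas)
        return self.master

ds = RoutingDataSource("master-db", ["replica-1", "replica-2"])
w = ds.route("INSERT INTO orders VALUES (1)")
r1 = ds.route("SELECT * FROM orders")
r2 = ds.route("SELECT * FROM orders")
```

One caveat worth knowing: binary-log replication is asynchronous, so a read routed to a replica immediately after a write may briefly see stale data; latency-sensitive read-after-write paths are often pinned to the master for that reason.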

(5) Additional Optimization Strategies

Other techniques include OS page caching, sequential disk I/O, SSD adoption, in‑memory caches, and pooling (connection pools, thread pools) to reduce resource creation overhead and further improve latency and throughput.
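Of these, pooling is easy to illustrate: a fixed set of reusable connections is kept in a queue so the expensive setup cost is paid once, not per request. `FakeConnection` is a stand-in for a real database connection.

```python
import queue

class FakeConnection:
    """Stand-in for a real connection whose creation is expensive."""
    created = 0

    def __init__(self):
        FakeConnection.created += 1   # expensive setup happens here, once

class ConnectionPool:
    def __init__(self, size: int):
        self._pool = queue.Queue()
        for _ in range(size):         # pre-create a fixed number of connections
            self._pool.put(FakeConnection())

    def acquire(self) -> FakeConnection:
        return self._pool.get()       # blocks if the pool is exhausted

    def release(self, conn: FakeConnection) -> None:
        self._pool.put(conn)          # return the connection for reuse

pool = ConnectionPool(size=2)
for _ in range(10):                   # 10 requests reuse the same 2 connections
    conn = pool.acquire()
    pool.release(conn)
```

Thread pools follow the same principle: bound the number of expensive resources and reuse them, trading a small queueing delay for much lower creation overhead.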

Case Study: Taobao

Taobao evolved from a monolithic architecture to a highly available, horizontally scaled system. It introduced load-balanced application clusters, vertically split business domains, massive Redis caching, master-slave databases with read-write separation, SSD storage, and extensive memory-level optimizations. These measures enabled the platform to handle tens of millions of concurrent requests during major sales events such as Double 11, delivering a smooth shopping experience.

Tags: e-commerce, backend architecture, load balancing, high concurrency, distributed cache, database replication
Written by

IT Architects Alliance

A community for discussing systems and internet architecture: large-scale distributed, high-availability, and high-performance designs, along with big data, machine learning, AI, and architecture evolution with internet technologies. Features real-world large-scale architecture case studies. Open to architects who enjoy sharing ideas.
