Designing High‑Concurrency Architecture for E‑commerce Applications
This article explains how to design and evolve server architectures, load‑balancing, database clustering, caching, message queues, and other techniques to handle high‑concurrency scenarios such as flash sales and timed red‑packet distribution in large‑scale e‑commerce systems.
High concurrency often occurs in business scenarios with a large number of active users gathering at the same time, such as flash‑sale events and timed red‑packet collection.
To ensure smooth operation and a good user experience, we must estimate the expected concurrency and design a suitable high‑concurrency handling solution.
Based on years of e‑commerce development experience, the author shares a summary of pitfalls and solutions as a personal archive and for the community.
Server Architecture
As a business matures, its server architecture evolves from a single server to a cluster and finally to distributed services.
A high‑concurrency service requires a solid architecture: load balancing, master‑slave database clusters, NoSQL cache clusters, and static file handling.
Typical components include:
Servers: load balancing (e.g., Nginx, Alibaba Cloud SLB), resource monitoring, distributed deployment
Databases: master-slave separation and clustering, DBA-driven table and index optimization, distributed deployment
NoSQL: master-slave clustering with Redis, MongoDB, or Memcached
CDN: static assets such as HTML, CSS, JS, and images
Concurrency Testing
High‑concurrency services need thorough load testing to evaluate the maximum supported traffic.
Testing can be performed using third‑party services or self‑hosted test servers with tools such as Apache JMeter, Visual Studio Load Test, or Microsoft Web Application Stress Tool.
Practical Solutions
General Scenario
Daily traffic is large but dispersed; occasional spikes occur when users gather.
Typical use cases: user sign‑in, user center, order queries.
Architecture diagram: (figure not included)
Explanation:
These operations are frequent but mostly read‑heavy; therefore we prioritize cache reads and fall back to the database only when the cache misses, caching the result afterwards.
Cache keys are generated by hashing the user ID, distributing users across multiple cache shards to keep each shard size manageable.
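The shard-selection step above can be sketched as follows. This is a minimal illustration, not the author's exact scheme; the key format and shard count are assumptions, and a stable hash (here MD5) is used instead of Python's process-local `hash()` so the mapping survives restarts:

```python
import hashlib

def shard_for_user(user_id, num_shards=4):
    """Map a user ID to one of num_shards cache shards.

    MD5 is used (rather than built-in hash()) so the mapping is
    deterministic across processes and restarts.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Requests for the same user always land on the same shard.
assert shard_for_user("user:10086") == shard_for_user("user:10086")
```

Because the mapping is deterministic, every read and write for a given user goes to the same shard, keeping each shard's data set bounded.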
User sign‑in and points acquisition:
1. Compute the user‑specific key and look up today's sign‑in info in the Redis hash.
2. If found, return it.
3. On a cache miss, query the database; if a record exists, sync it to Redis and return it.
4. If the database has no record either, create the sign‑in entry and award the points in a single transaction, then cache the result and return it.
Beware of duplicate sign‑ins under concurrency (e.g., enforce a unique constraint on user and date).
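The cache-aside sign-in flow can be sketched like this. In-memory dicts stand in for the Redis hash and the database table (an assumption for illustration; the real system would use Redis and a relational DB, and the create step would run in one DB transaction):

```python
from datetime import date

# Stand-ins for Redis and the database (assumption: real code would use
# a Redis hash and a relational table with a unique (user, date) key).
redis_cache = {}
db_signins = {}

def sign_in(user_id):
    """Cache-aside sign-in: read cache, fall back to DB, create on miss."""
    key = "signin:{}:{}".format(user_id, date.today().isoformat())
    if key in redis_cache:                  # cache hit: already signed in
        return redis_cache[key]
    if key in db_signins:                   # cache miss, DB record exists
        redis_cache[key] = db_signins[key]  # sync it back into the cache
        return redis_cache[key]
    # No record anywhere: create the sign-in and award points. In
    # production this insert and the points update share one transaction.
    record = {"user_id": user_id, "points": 10}
    db_signins[key] = record
    redis_cache[key] = record
    return record
```

A second call for the same user on the same day hits the cache and returns the existing record instead of awarding points again.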
User orders:
1. Cache only the first page (40 items) of a user's order data.
2. For a page‑1 request, compute the user‑specific key and check Redis; if the list is cached, return it.
3. Otherwise (cache miss, or a later page), query the database, cache first‑page results, and return.
User center Check Redis hash for user info; if missing, query DB, cache, and return.
Other business For shared cache data, avoid massive DB hits by updating cache via admin tools or locking DB writes. Reference: "Advanced Redis" blog for cache‑update strategies.
As traffic grows, the architecture evolves to service‑oriented designs with independent services, each with its own load balancer, database cluster, and NoSQL cache cluster (e.g., user service, order service).
Message Queue
Flash‑sale or timed‑red‑packet activities cause a sudden influx of requests.
Scenario: timed red‑packet collection.
Architecture diagram: (figure not included)
Explanation:
During a timed event, a massive number of users hit the DB simultaneously, risking overload. For write‑heavy operations, the generic cache‑first approach is insufficient.
Solution: push user participation data into a Redis list, then have a multithreaded consumer process the queue and issue red‑packets, reducing direct DB load.
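The queue-and-consumer pattern can be sketched as below. Here Python's `queue.Queue` and threads stand in for the Redis list and the multithreaded consumer process (an assumption for a self-contained sketch; production code would use Redis `LPUSH`/`BRPOP` and a real worker pool), and the packet amount is illustrative:

```python
import queue
import threading

pending = queue.Queue()   # stands in for a Redis list (LPUSH/BRPOP)
issued = {}               # user_id -> packet amount already issued
issued_lock = threading.Lock()

def enqueue_participation(user_id):
    """Web tier: record the request and return to the user immediately."""
    pending.put(user_id)

def consumer():
    """Worker: drain the queue and issue red packets at a pace the DB
    can sustain; setdefault guards against issuing twice per user."""
    while True:
        user_id = pending.get()
        if user_id is None:          # sentinel: shut down this worker
            break
        with issued_lock:
            issued.setdefault(user_id, 5)   # issue a 5-unit packet once
        pending.task_done()

workers = [threading.Thread(target=consumer) for _ in range(4)]
for w in workers:
    w.start()
for uid in ["u1", "u2", "u3", "u1"]:     # note the duplicate request
    enqueue_participation(uid)
pending.join()
for _ in workers:
    pending.put(None)
for w in workers:
    w.join()
```

The web tier only enqueues and acknowledges; the workers absorb the spike, so the database sees a steady write rate instead of the full burst.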
First‑Level Cache
When connection requests to the cache server exceed its capacity, some users may experience connection timeouts.
Solution: use a first‑level cache on the application server for the hottest data, with short TTLs, to offload traffic from the central NoSQL cache.
Example: cache front‑page product data that changes infrequently.
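A minimal in-process first-level cache along these lines might look as follows (a sketch; the TTL and eviction-on-read policy are illustrative choices, not the author's exact design):

```python
import time

class LocalCache:
    """Tiny first-level cache with per-entry TTL, held inside the app
    process. Hot keys are answered locally, sparing the central
    Redis/NoSQL tier the connection load."""

    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:  # expired: evict and miss
            del self._store[key]
            return None
        return value

    def set(self, key, value, ttl_seconds=5.0):
        self._store[key] = (value, time.monotonic() + ttl_seconds)
```

Short TTLs keep the local copy from drifting too far from the central cache while still absorbing most repeated reads of hot data such as front-page products.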
Architecture diagram: (figure not included)
Static Data
If data rarely changes, serve it as static JSON/XML/HTML files via CDN; only fall back to cache or DB when the CDN misses.
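Publishing such static files can be as simple as the sketch below, which renders rarely-changing data to a JSON file for the CDN origin to serve (the file name and directory are hypothetical; regeneration would be triggered only when the data actually changes):

```python
import json
from pathlib import Path

def publish_static(products, out_dir="static"):
    """Render rarely-changing product data to a static JSON file that a
    CDN can serve directly, bypassing cache and DB entirely."""
    path = Path(out_dir) / "front_page.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(products, ensure_ascii=False))
    return path
```

Requests then hit the CDN edge; only when the CDN misses (or the file is being regenerated) does traffic reach the cache or database layers.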
Other Strategies
Client‑side caching with version tags: send version number with requests, return 304 if unchanged.
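The version-tag check works in the spirit of HTTP `ETag`/`304 Not Modified`, and can be sketched as below (the version string and payload are hypothetical):

```python
CURRENT_VERSION = "v42"              # hypothetical version tag
PAYLOAD = {"banner": "spring-sale"}  # hypothetical cached resource

def handle_request(client_version):
    """If the client already holds the current version, answer 304 with
    no body; otherwise send the payload plus the new version tag."""
    if client_version == CURRENT_VERSION:
        return 304, None, CURRENT_VERSION
    return 200, PAYLOAD, CURRENT_VERSION
```

The 304 path transfers no payload at all, so clients with a fresh copy cost almost nothing in bandwidth or rendering.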
Layering, Partitioning, Distribution
Large websites need long‑term planning: layer the system, split core business into modules, and deploy them distributedly.
Layering: separate application, service, and data layers.
Partitioning: break complex domains into smaller, cohesive modules (e.g., user account, order, coupon).
Distribution: deploy each module on independent servers with load balancers, DB clusters, and cache clusters.
Cluster
For high‑traffic services, deploy multiple identical application servers behind a load balancer; add machines to the cluster as traffic grows. Clustering also provides failover.
Application server cluster: Nginx reverse proxy, SLB, etc.
Database cluster: master‑slave replication.
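The application-server cluster above can be expressed as a minimal Nginx reverse-proxy configuration (a sketch; the upstream name, addresses, and balancing policy are placeholders):

```nginx
upstream app_cluster {
    least_conn;                    # route each request to the least-busy backend
    server 10.0.0.11:8080;
    server 10.0.0.12:8080;
    server 10.0.0.13:8080 backup;  # standby node, used only on failover
}

server {
    listen 80;
    location / {
        proxy_pass http://app_cluster;
    }
}
```

Adding capacity is then a matter of adding `server` lines to the upstream block, and the `backup` directive gives the failover behavior described above.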
Asynchronous Processing
High‑concurrency write operations put pressure on the primary DB. To reduce this pressure, handle writes asynchronously via message queues.
Workflow: client sends request → server quickly acknowledges → request data is enqueued → a separate worker dequeues and persists to DB, updating caches as needed.
Redundancy and Automation
When a server fails, standby servers should take over automatically. Automation can monitor resource usage, trigger alerts, and perform failover or scaling without manual intervention.
Redundancy: DB backups, standby servers.
Automation: monitoring, alerting, auto‑scaling, auto‑degradation.
Summary
High‑concurrency architecture is an evolving process; solid foundations enable future expansion and scalability.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.