How to Build a Full‑Chain Load‑Testing Platform for E‑Commerce in 2 Days
This article details how Xiaohongshu tackled rapid growth challenges by designing, implementing, and operating a full‑link performance testing platform in just two days, covering system architecture, testing models, collaborative deployment, capacity planning, and practical advice for teams seeking reliable e‑commerce load testing.
1. Xiaohongshu: Past and Present
Founded in 2013, Xiaohongshu started as a community for users to share purchased products and experiences, quickly becoming the largest product‑sharing community in China. By 2014 it launched an e‑commerce platform, and today it is one of the largest cross‑border e‑commerce communities with multiple bonded warehouses nationwide.
2. Growing Pains
Rapid growth brought three main challenges:
Resource expansion lagged behind business growth; the stability team had only two‑three members compared to large enterprises.
The monolithic Python architecture became a bottleneck during major sales events.
Lack of effective performance and online stability assurance strategies.
3. Full‑Chain Load‑Testing Architecture
Large companies like Alibaba and JD have mature platforms, but Xiaohongshu needed a solution with limited manpower. The architecture consists of four components:
Link test script configuration management
Test scheduling
Unified test data management
Monitored status of the target business systems
The core of the system is the test scripts, which must be created quickly and conveniently for many business flows.
4. Building the Platform from 0 to 1
4.1 Starting from Scratch
For a major promotion on June 6, the testing team had only two members and one operations engineer, while developers were focused on business development until June 4. The testing system began as a blank slate.
4.2 Load‑Testing Model
To conduct full‑chain testing, a model is needed to define what to test, how to test, and the target performance level. The steps are:
Identify all business links by retrieving the online API list through monitoring systems.
Obtain the proportion relationships between links from monitoring data and historical traffic patterns.
Calculate the required request volume, scripts, and load‑generation machines based on the above information.
Each test script includes four parameters: target QPS, current water‑level, pressure duration, and the specific link to be tested.
4.3 Close Collaboration
The testing tool was packaged for easy deployment, allowing operations and testing teams to launch hundreds of load‑generation containers within five minutes, achieving over 5 Gbit/s of pressure. Test traffic is tagged to separate it from real user traffic, ensuring clean reporting and protecting production data.
5. Beyond Load Testing
Additional practices include capacity forecasting, degradation plans, emergency response, promotion rehearsals, and on‑call schedules. Historical traffic monitoring informs capacity estimates and provides baselines for rate‑limiting and circuit‑breaker settings. When traffic exceeds thresholds, functional degradation or emergency measures are applied.
6. Mid‑Year 66% Promotion Full‑Chain Practice
From project kickoff on May 6 to the first link pressure on May 8, the team built the platform from 0 to 1 in two days, demonstrating that with the right tools and methods, rapid delivery is achievable even with limited staff.
Key recommendations for teams starting full‑chain testing:
Do not aim for an overly comprehensive platform at the outset.
Focus on the system itself, starting with monitoring and rate‑limiting.
Master full‑chain testing techniques to quickly build a functional solution from scratch.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.