System Capacity Design and Evaluation: Concepts, Metrics, and Practical Steps

This article explains how to design and evaluate system capacity by defining key metrics such as QPS, TPS and concurrency, describing when capacity assessment is needed, and outlining a step‑by‑step methodology—including traffic analysis, peak estimation, stress testing and redundancy planning—to ensure reliable performance under varying load conditions.

Architecture Digest

Background – A real‑world example from a company’s annual sports meet illustrates the importance of capacity design: a 2000 m race with limited slots required careful scheduling, and unexpected increases in participants caused delays and complaints, highlighting the need for proactive capacity planning.

Concept – Capacity design is the process of estimating the resources a system will need (data volume, concurrency, bandwidth, user counts, etc.) from expected usage, before the load arrives; it is a core skill for architects.

Key Metrics

• TPS (Transactions Per Second) – number of transactions processed each second.

• QPS (Queries Per Second) – a common throughput metric indicating requests handled per second.

• Concurrency – the number of requests the system is processing at the same moment.

• Peak QPS calculation (80/20 rule, which assumes 80% of the daily traffic arrives in 20% of the day): Peak QPS = (Total PV × 80%) / (Seconds per day × 20%).

• Relationship (with average response time measured in seconds): QPS = Concurrency / Average Response Time; Concurrency = QPS × Average Response Time.
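
The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function names, the default of 86,400 seconds per day, and the sample traffic figures are assumptions chosen for the example and do not come from the article.

```python
def peak_qps(total_pv: float, seconds_per_day: float = 86_400,
             traffic_share: float = 0.8, time_share: float = 0.2) -> float:
    """Peak QPS under the 80/20 rule: traffic_share of the daily PV
    is assumed to arrive within time_share of the day."""
    return (total_pv * traffic_share) / (seconds_per_day * time_share)


def concurrency_from_qps(qps: float, avg_response_time_s: float) -> float:
    """Concurrency = QPS x average response time (in seconds)."""
    return qps * avg_response_time_s


def qps_from_concurrency(concurrent_requests: float,
                         avg_response_time_s: float) -> float:
    """QPS = concurrency / average response time (in seconds)."""
    return concurrent_requests / avg_response_time_s


# Hypothetical input: 8,640,000 page views per day, 0.25 s average response time.
peak = peak_qps(total_pv=8_640_000)
in_flight = concurrency_from_qps(peak, avg_response_time_s=0.25)

print(f"peak QPS:    {peak:.1f}")       # 400.0
print(f"concurrency: {in_flight:.1f}")  # 100.0
```

The two relationship functions are inverses of each other, so either one can be derived from the other once the average response time is known.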

When to Assess Capacity

1. Temporary traffic spikes (e.g., promotional events like 618, Double 11).

2. Initial system launch – estimating baseline load.

3. Changes in system baseline (new features, growing data, higher DAU) requiring re‑evaluation.

Evaluation Steps (using concurrency as an example)

1. Analyze daily total visits (PV/UV) from product, operations, or historical data.

2. Estimate average QPS: divide total daily visits by the number of active seconds, that is, the part of the day in which the system actually receives traffic.

3. Estimate peak QPS using traffic curves or the 80/20 rule.

4. Conduct performance stress testing (e.g., nGrinder, JMeter) to find the maximum QPS a single instance can sustain; a response time > 2 s indicates a bottleneck.

5. Determine required number of instances based on peak QPS and per‑instance capacity, adding redundancy as needed.
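
Steps 2 and 5 reduce to simple arithmetic, sketched below. The article does not say how much redundancy to add, so the `redundancy` multiplier here is a hypothetical parameter, as are the function names and the sample numbers; the per-instance QPS would come from the stress test in step 4.

```python
import math


def average_qps(daily_visits: float, active_seconds: float) -> float:
    """Step 2: average QPS over the period in which traffic actually arrives."""
    return daily_visits / active_seconds


def instances_needed(peak_qps: float, per_instance_qps: float,
                     redundancy: float = 1.0) -> int:
    """Step 5: instances required to serve peak_qps, rounded up.

    redundancy is an assumed safety multiplier (1.5 means 50% headroom);
    it is not a value taken from the article.
    """
    if per_instance_qps <= 0:
        raise ValueError("per_instance_qps must be positive")
    return math.ceil(peak_qps * redundancy / per_instance_qps)


# Hypothetical numbers: 864,000 visits spread over a full day,
# a peak of 1,000 QPS, and 300 QPS sustained per instance in stress tests.
print(average_qps(864_000, 86_400))                 # 10.0
print(instances_needed(1_000, 300))                 # 4
print(instances_needed(1_000, 300, redundancy=1.5)) # 5
```

Rounding up matters: a fractional result such as 3.33 instances means three instances would be overloaded at peak.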

Case Study – Book Reservation System

Using the 80/20 rule, a total PV of 1,500,000 over 9 hours of service gives a peak QPS of (1,500,000 × 80%) / (9 × 3,600 s × 20%) ≈ 185 req/s. With an average response time of 0.5 s, concurrency ≈ 185 × 0.5 = 92.5, which is rounded up to 100. After applying a pessimism factor, the stress test targets 200 concurrent users.
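
The case-study arithmetic can be reproduced with a short script. Two details are assumptions inferred from the numbers rather than stated in the article: the pessimism factor is taken to be 2 (since 100 becomes 200), and "rounded to 100" is modelled as rounding up to the next multiple of 100. The function name is hypothetical.

```python
import math


def estimate_test_concurrency(total_pv: float, service_hours: float,
                              avg_response_time_s: float,
                              pessimism_factor: float = 2):
    """Return (peak QPS, concurrency, rounded concurrency, test target)."""
    active_seconds = service_hours * 3600
    # 80/20 rule: 80% of the traffic within 20% of the active time.
    peak_qps = (total_pv * 0.8) / (active_seconds * 0.2)
    concurrency = peak_qps * avg_response_time_s
    # Assumption: round up to the next multiple of 100.
    rounded = math.ceil(concurrency / 100) * 100
    # Assumption: pessimism factor of 2, inferred from 100 -> 200.
    target = rounded * pessimism_factor
    return peak_qps, concurrency, rounded, target


peak, conc, rounded, target = estimate_test_concurrency(1_500_000, 9, 0.5)
print(f"peak QPS:            {peak:.1f}")  # 185.2
print(f"concurrency:         {conc:.1f}")  # 92.6
print(f"rounded concurrency: {rounded}")   # 100
print(f"stress-test target:  {target}")    # 200
```

The script reports a concurrency of about 92.6 rather than 92.5 because it keeps the unrounded peak QPS (185.19); rounding the peak to 185 first gives the 92.5 quoted in the text. Either way the rounded value is 100.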

Summary

The article reiterates the three timing scenarios for capacity design, the five‑step evaluation process (daily traffic analysis, average QPS, peak QPS, stress testing, redundancy adjustment), and emphasizes that early capacity estimation—like adjusting the sports‑event schedule—prevents operational issues.

Tags: Concurrency, System Design, Performance Testing, Capacity Planning, QPS
Written by

Architecture Digest

Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
