Backend Development · 10 min read

Optimizing High-Concurrency Services: Strategies for Handling Over 200k QPS

This article outlines practical techniques for optimizing high‑concurrency online services handling over 200 k QPS, covering the avoidance of relational databases, multi‑level caching, multithreading, circuit‑breaker and degradation strategies, I/O reduction, cautious retry policies, boundary checks, and efficient logging.

IT Architects Alliance

Preface: Optimizing high‑concurrency services (QPS >200k) presents challenges such as lack of offline caching, strict response time limits (<300 ms), and massive data volume (e.g., 5 GB per minute).

1. Say No to Relational Databases

Large‑scale C‑end services should not rely on MySQL/Oracle as primary storage; instead, use NoSQL caches such as Redis or Memcached as the main “database”, with relational databases serving only as asynchronous backups.

Example: During JD.com’s Double‑11 event, product data is first written to Redis and later asynchronously persisted to MySQL; C‑end queries read from Redis, while B‑end queries may use the database.
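This cache‑first, persist‑later flow can be sketched as below. The sketch is illustrative only: a dict stands in for Redis and a list for MySQL so it runs without external services, and all names (`write_product`, `persist_worker`, etc.) are hypothetical.

```python
import queue
import threading

cache = {}                      # stands in for Redis (primary store for C-end reads)
db_rows = []                    # stands in for MySQL (asynchronous backup)
persist_queue = queue.Queue()   # decouples the request path from the DB write

def write_product(sku, data):
    """Write to the cache first, then enqueue for asynchronous persistence."""
    cache[sku] = data               # synchronous fast path: cache only
    persist_queue.put((sku, data))  # DB write happens off the request path

def persist_worker():
    """Background worker that drains the queue into the 'database'."""
    while True:
        item = persist_queue.get()
        if item is None:
            break
        db_rows.append(item)        # in production: INSERT/UPDATE in MySQL
        persist_queue.task_done()

def read_product(sku):
    """C-end reads hit only the cache; the DB is never on the hot path."""
    return cache.get(sku)

threading.Thread(target=persist_worker, daemon=True).start()
write_product("sku-1001", {"price": 99})
persist_queue.join()   # demo only: wait for the async write to land
```

The key property is that the request path never blocks on the relational database; the backup write can lag, batch, or retry independently.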

2. Multi‑Level Caching

While a single Redis instance offers roughly 60–80k QPS, horizontal scaling is limited by Redis’s single‑threaded nature and by hotspot keys. Introducing a multi‑level cache (e.g., a local in‑process cache in front of Redis) can serve millions of QPS and mitigate cache penetration and breakdown.
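A minimal sketch of such a two‑level cache, assuming a small in‑process LRU in front of a shared cache (a dict stands in for Redis here; the class name and capacity are illustrative):

```python
from collections import OrderedDict

class TwoLevelCache:
    """L1: small in-process LRU (no network hop). L2: shared cache (Redis)."""
    def __init__(self, shared, local_capacity=1024):
        self.shared = shared            # stands in for a Redis client
        self.local = OrderedDict()      # in-process LRU
        self.capacity = local_capacity

    def get(self, key):
        if key in self.local:           # L1 hit: served without leaving the process
            self.local.move_to_end(key)
            return self.local[key]
        value = self.shared.get(key)    # L2: one network round trip in production
        if value is not None:
            self._put_local(key, value)  # promote hot keys into L1
        return value

    def _put_local(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)
        if len(self.local) > self.capacity:
            self.local.popitem(last=False)   # evict least recently used

# Demo: a hot key keeps being served from L1 even if L2 evicts it.
shared = {"sku-1": {"price": 99}}
cache = TwoLevelCache(shared, local_capacity=2)
first = cache.get("sku-1")     # L1 miss, L2 hit, promoted to L1
del shared["sku-1"]            # simulate eviction from the shared cache
second = cache.get("sku-1")    # still served from the local L1 copy
```

Because hot keys are absorbed in‑process, the hotspot load on any single Redis shard drops sharply; the trade‑off is a short window of staleness in L1.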

3. Multithreading

Replacing a synchronous loop of Redis reads (≈3 ms per call) with a thread pool reduces processing time dramatically (e.g., from 30 s to 3 s for a 300–400k‑item list). Proper thread‑pool sizing and monitoring are essential to avoid wasting resources.
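The effect can be reproduced with a small sketch, where a short sleep stands in for the ~3 ms Redis round trip (the function names and worker count are illustrative, not a tuned recommendation):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_from_redis(key):
    """Stand-in for a ~3 ms Redis GET (simulated with a short sleep)."""
    time.sleep(0.003)
    return f"value:{key}"

keys = [f"k{i}" for i in range(200)]

# Sequential loop: latencies add up, roughly 200 x 3 ms.
start = time.perf_counter()
seq = [fetch_from_redis(k) for k in keys]
seq_time = time.perf_counter() - start

# Thread pool: calls overlap while each thread waits on I/O.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=50) as pool:
    par = list(pool.map(fetch_from_redis, keys))
par_time = time.perf_counter() - start
```

Because the work is I/O‑bound, threads spend most of their time waiting, so even Python threads deliver the speedup; sizing the pool too large just burns memory and context switches, which is why the article stresses monitoring.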

4. Degradation and Circuit Breaking

These self‑protection mechanisms prevent overload: degradation disables non‑essential features, while circuit breakers stop calls to overloaded downstream services and route requests to fallback paths.
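A toy circuit breaker illustrating the idea: after a few consecutive failures the circuit opens and requests go straight to the fallback until a cooldown elapses. The threshold, cooldown, and class name are all illustrative assumptions, not a production implementation.

```python
import time

class CircuitBreaker:
    """Opens after `threshold` consecutive failures; while open, all calls
    are shed to the fallback until `reset_after` seconds have passed."""
    def __init__(self, threshold=3, reset_after=30.0):
        self.threshold = threshold
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, fallback):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                return fallback()        # open: protect the downstream, shed load
            self.opened_at = None        # half-open: allow one trial request
            self.failures = 0
        try:
            result = fn()
            self.failures = 0            # success resets the failure streak
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()   # trip the breaker
            return fallback()

# Demo: a downstream that always fails is only called until the breaker trips.
breaker = CircuitBreaker(threshold=3, reset_after=60.0)
invocations = {"n": 0}

def failing_downstream():
    invocations["n"] += 1
    raise RuntimeError("downstream overloaded")

results = [breaker.call(failing_downstream, lambda: "cached-fallback")
           for _ in range(5)]
```

Note how calls 4 and 5 never reach the failing downstream: that back‑pressure is exactly what gives the overloaded service room to recover.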

5. I/O Optimization

Reducing the number of external calls (e.g., by batching requests) prevents I/O volume from ballooning under massive traffic, avoiding bottlenecks and latency spikes.
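The batching win is easy to quantify with a sketch that counts round trips; the fake `get_one`/`mget` pair below mirrors the difference between N single‑key lookups and one batched call (as with Redis `MGET`), with a dict standing in for the backend:

```python
# Hypothetical stand-in for a remote key-value store.
store = {f"k{i}": i for i in range(100)}
network_calls = 0   # counts round trips to the fake backend

def get_one(key):
    global network_calls
    network_calls += 1              # one round trip per key
    return store.get(key)

def mget(keys):
    global network_calls
    network_calls += 1              # one round trip for the whole batch
    return [store.get(k) for k in keys]

keys = [f"k{i}" for i in range(100)]

looped = [get_one(k) for k in keys]   # 100 round trips
calls_looped = network_calls

network_calls = 0
batched = mget(keys)                  # 1 round trip
calls_batched = network_calls
```

At 200k QPS the difference compounds: every per‑request loop of N calls multiplies backend traffic by N, which is why batching is usually the first I/O fix to reach for.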

6. Cautious Retry

Retries should be limited in count, spaced appropriately, and configurable; excessive retries can cause cascading failures, as seen in a Kafka consumer‑lag incident.
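A bounded retry with exponential backoff might look like the sketch below. The attempt cap and delays are illustrative defaults; in practice they should come from configuration so they can be tuned or disabled during an incident.

```python
import time

def call_with_retry(fn, max_attempts=3, base_delay=0.05):
    """Bounded retry with exponential backoff (50 ms, 100 ms, ...).
    After `max_attempts` failures the exception propagates so the
    caller can degrade instead of hammering the downstream."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise                                    # give up, let caller degrade
            time.sleep(base_delay * (2 ** (attempt - 1)))  # back off before retrying

# Demo: a call that fails twice, then succeeds on the third attempt.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

result = call_with_retry(flaky)
```

The hard cap is the important part: an unbounded retry loop turns one slow dependency into a self‑inflicted traffic multiplier, which is exactly the failure mode behind consumer‑lag incidents.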

7. Boundary Checks and Fallbacks

Simple oversights, such as a missing empty‑array check, can lead to massive data leaks; thorough input validation prevents catastrophic incidents.
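The classic shape of this bug is a query builder where an empty ID list silently degrades into "return everything". A minimal guard, with an in‑memory list standing in for the table and all names hypothetical:

```python
def fetch_orders_by_ids(ids, all_orders):
    """Guard against empty input: an empty IN (...) list must return an
    explicit empty result, never degrade into a full-table scan."""
    if not ids:
        return []                      # the check this section is about
    wanted = set(ids)
    return [o for o in all_orders if o["id"] in wanted]

orders = [{"id": 1}, {"id": 2}, {"id": 3}]
```

The same guard belongs at every layer that builds a query or filter from caller‑supplied collections; relying on the ORM or SQL driver to "do the right thing" with an empty list is how full tables leak.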

8. Graceful Logging

Unrestricted logging at high QPS can consume terabytes of disk space and increase I/O latency. Implement rate‑limited logging (e.g., with a token bucket) and whitelist‑based logging to limit the impact.
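A token‑bucket log limiter can be sketched as below: each log line spends one token, tokens refill at a fixed rate, and lines beyond the budget are counted rather than written. The rate, capacity, and class name are illustrative; in production this would wrap a real logger instead of appending to a list.

```python
import time

class TokenBucketLogger:
    """Allows at most `rate` log lines per second, with bursts up to
    `capacity`; excess lines are dropped but counted so the loss is visible."""
    def __init__(self, rate=100.0, capacity=100.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()
        self.dropped = 0

    def log(self, message, sink):
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            sink.append(message)   # in production: hand off to the real logger
        else:
            self.dropped += 1      # count drops so the rate limit is observable

# Demo: a burst of 10 lines against a bucket of 5 writes only the first 5.
sink = []
limiter = TokenBucketLogger(rate=1.0, capacity=5.0)
for i in range(10):
    limiter.log(f"request {i} failed", sink)
```

Exposing the `dropped` counter as a metric matters as much as the limiting itself: it tells you that logs were shed rather than leaving you wondering why lines are missing.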

Conclusion: The article provides a checklist of essential practices for building resilient, high‑throughput backend services, encouraging continuous learning and careful engineering.

Tags: caching · high concurrency · multithreading · backend performance · circuit breaker · I/O optimization
Written by

IT Architects Alliance

Discussion and exchange on systems, internet, large‑scale distributed, high‑availability, and high‑performance architecture, as well as big data, machine learning, AI, and architecture evolution with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.
