
Understanding 503 Errors and Database Bottlenecks: From Read‑Write Splitting to Caching, Vertical and Horizontal Sharding

The article explains that 503 errors under high concurrency are usually caused by database bottlenecks and outlines a systematic approach—including read‑write separation, caching with memcached, consistent hashing, vertical partitioning, and horizontal sharding—to improve website scalability and reliability.

Qunar Tech Salon

In the previous part I explained that a 503 error means the server is temporarily unavailable, often because the database cannot keep up under high concurrency. The database is the weakest link in a web system, analogous to the shortest stave of a barrel, which limits how much water the whole barrel can hold.

A database bottleneck occurs when the DB cannot respond quickly enough to a flood of requests, leading to lock contention or deadlocks, which ultimately cause the whole site to return 503 errors.

Storing session data in an external cache such as memcached sounds like a solution, but using only one or two cache nodes does not provide true redundancy; if one node fails, many users lose their sessions, showing that memcached‑style distributed caches differ from fully fault‑tolerant distributed systems.

Memcached distributes keys by hashing them to a specific server; when a server goes down, a large portion of cached data is lost. Consistent hashing was introduced to minimise the impact of node failures.
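To see why plain hash placement is fragile, here is a minimal sketch (node names are made up) of the classic "hash mod N" scheme. When one of three nodes fails and keys are rehashed over the two survivors, far more keys are remapped than the third that actually lived on the failed node:

```python
import zlib

def node_for(key, nodes):
    """Classic placement: key -> nodes[hash(key) % N] (stable CRC32 hash)."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

nodes = ["cache-a", "cache-b", "cache-c"]
keys = [f"user:{i}" for i in range(1000)]
before = {k: node_for(k, nodes) for k in keys}

# Simulate cache-c failing: the same keys are rehashed over the two survivors.
after = {k: node_for(k, nodes[:2]) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
# Only ~1/3 of keys lived on cache-c, yet roughly 2/3 of all keys move,
# so most of the *surviving* cache is effectively invalidated as well.
```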

Consistent hashing ensures that when a server crashes, only a small fraction of the cache is affected.
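A minimal consistent-hash ring illustrates this property. Node names and the replica count are illustrative; real deployments (e.g. memcached client libraries) use the same idea with tuned virtual-node counts:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual nodes per server, for even spread
        self._keys = []            # sorted vnode hashes
        self._ring = []            # (hash, node) pairs, aligned with _keys
        for node in nodes:
            self.add_node(node)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            idx = bisect.bisect(self._keys, h)
            self._keys.insert(idx, h)
            self._ring.insert(idx, (h, node))

    def remove_node(self, node):
        pairs = [(h, n) for h, n in self._ring if n != node]
        self._ring = pairs
        self._keys = [h for h, _ in pairs]

    def get_node(self, key):
        """Walk clockwise from the key's hash to the next virtual node."""
        if not self._ring:
            return None
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

When a node is removed, only the keys that were mapped to it move; every other key keeps its assignment, which is exactly the small-fraction guarantee described above.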

Read‑write separation is a common design for sites with heavily skewed read/write ratios; because it is relatively simple and cheap to implement, it is usually the first step taken to alleviate storage bottlenecks.

The benefit of read‑write separation stems from disk mechanics: sequential reads are fast, while writes require locking and can become a performance choke point under concurrency; moving writes to a dedicated primary and serving reads from one or more replicas improves overall throughput.
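In the data-access layer this usually comes down to a small router. A minimal sketch, with hypothetical connection strings, that sends write statements to the single primary and spreads reads across replicas:

```python
import random

class ReadWriteRouter:
    """Route SQL to the primary or a read replica based on the statement verb."""

    READ_VERBS = {"SELECT", "SHOW", "EXPLAIN"}

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.READ_VERBS and self.replicas:
            return random.choice(self.replicas)  # load-balance reads over replicas
        return self.primary                      # writes always hit the primary

router = ReadWriteRouter(
    primary="mysql://db-primary/shop",
    replicas=["mysql://db-replica-1/shop", "mysql://db-replica-2/shop"],
)
```

Note that replicas lag the primary slightly, so reads that must see a just-committed write still need to go to the primary.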

Beyond read‑write separation, caching moves frequently accessed data into memory (orders of magnitude faster than disk), and search technologies complement databases by handling flexible, high‑volume random access patterns that traditional relational queries struggle with.
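Caching is typically applied with the cache-aside pattern. A minimal sketch, using a toy in-memory cache and database as stand-ins for memcached and a real DB (all names illustrative):

```python
class DictCache:
    """Toy stand-in for memcached: a plain dict; TTL is accepted but ignored."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value, ttl):
        self._data[key] = value

class FakeDB:
    """Toy database that counts how often it is actually queried."""
    def __init__(self, rows):
        self.rows = rows
        self.hits = 0
    def query(self, product_id):
        self.hits += 1
        return self.rows.get(product_id)

def get_product(product_id, cache, db, ttl=300):
    """Cache-aside read: serve from memory, fall back to the database on a miss."""
    key = f"product:{product_id}"
    value = cache.get(key)
    if value is None:                 # cache miss
        value = db.query(product_id)  # slow, disk-backed lookup
        if value is not None:
            cache.set(key, value, ttl)
    return value
```

After the first miss populates the cache, repeated reads never touch the database, which is where the orders-of-magnitude win over disk comes from.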

When a single database becomes a bottleneck, vertical partitioning (splitting large tables such as product or transaction tables into separate databases) reduces load on the primary instance, though it introduces complexity for transactional queries.
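Vertical partitioning can be as simple as a table-to-database map in the data-access layer. A sketch with hypothetical connection strings:

```python
# Heavy tables are carved out into dedicated databases; everything else
# stays on the main instance. The DSN strings here are illustrative only.
TABLE_TO_DSN = {
    "products":     "mysql://db-products/shop",
    "transactions": "mysql://db-transactions/shop",
}
DEFAULT_DSN = "mysql://db-main/shop"

def dsn_for_table(table):
    """Return the connection string for the database that owns this table."""
    return TABLE_TO_DSN.get(table, DEFAULT_DSN)
```

The complexity mentioned above shows up here: once `products` and `transactions` live in different databases, cross-table joins and single transactions spanning both are no longer possible and must be handled in application code.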

If vertical partitioning is insufficient, horizontal sharding distributes a single large table across multiple databases, further scaling the system.
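Horizontal sharding splits the rows of one table across databases by a shard key. A minimal sketch using `user_id` modulo the shard count (DSNs hypothetical):

```python
# Orders for all users are spread across four databases by shard key.
SHARDS = [
    "mysql://orders-shard-0/shop",
    "mysql://orders-shard-1/shop",
    "mysql://orders-shard-2/shop",
    "mysql://orders-shard-3/shop",
]

def shard_for_user(user_id, shards=SHARDS):
    """All orders of one user land on the same shard, so per-user queries stay local."""
    return shards[user_id % len(shards)]
```

Modulo sharding has the same weakness as "hash mod N" caching: changing the shard count moves most rows, which is why range-based or consistent-hashing schemes are often preferred when resharding is expected.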

The article concludes with a roadmap for tackling large‑scale website data bottlenecks: single‑instance DB → read‑write separation → caching → search → vertical partitioning → horizontal sharding.

These steps provide a logical progression for architects to diagnose and resolve performance issues in high‑traffic web applications.

Tags: caching, read/write splitting, vertical partitioning, Memcached, horizontal sharding, 503 errors, database bottlenecks
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
