
Understanding 503 Errors and Database Bottlenecks: From Read‑Write Splitting to Caching, Vertical and Horizontal Sharding

The article explains that 503 errors under high concurrency are usually caused by database bottlenecks and outlines a systematic approach—including read‑write separation, caching with memcached, consistent hashing, vertical partitioning, and horizontal sharding—to improve website scalability and reliability.

Qunar Tech Salon

In the previous part I explained that a 503 error means the server is temporarily unavailable, often because the database cannot keep up under high concurrency. The database is the weakest link in a web system, analogous to the shortest stave of a barrel, which limits how much water the whole barrel can hold.

A database bottleneck occurs when the DB cannot respond quickly enough to a flood of requests, leading to lock contention or deadlocks, which ultimately cause the whole site to return 503 errors.

Storing session data in an external cache such as memcached sounds like a solution, but using only one or two cache nodes does not provide true redundancy; if one node fails, many users lose their sessions, showing that memcached‑style distributed caches differ from fully fault‑tolerant distributed systems.

Memcached distributes keys by hashing them to a specific server; when a server goes down, a large portion of cached data is lost. Consistent hashing was introduced to minimise the impact of node failures.
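To see why plain hash placement is fragile, here is a minimal sketch (node names are made up) of the classic "hash mod N" scheme. When one of three nodes fails and keys are rehashed over the two survivors, far more keys are remapped than the third that actually lived on the failed node:

```python
import zlib

def node_for(key, nodes):
    """Classic placement: key -> nodes[hash(key) % N] (stable CRC32 hash)."""
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

nodes = ["cache-a", "cache-b", "cache-c"]
keys = [f"user:{i}" for i in range(1000)]
before = {k: node_for(k, nodes) for k in keys}

# Simulate cache-c failing: the same keys are rehashed over the two survivors.
after = {k: node_for(k, nodes[:2]) for k in keys}
moved = sum(1 for k in keys if before[k] != after[k])
# Only ~1/3 of keys lived on cache-c, yet roughly 2/3 of all keys move,
# so most of the *surviving* cache is effectively invalidated as well.
```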

Consistent hashing ensures that when a server crashes, only a small fraction of the cache is affected.
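A minimal consistent-hash ring illustrates this property. Node names and the replica count are illustrative; real deployments (e.g. memcached client libraries) use the same idea with tuned virtual-node counts:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas   # virtual nodes per server, for even spread
        self._keys = []            # sorted vnode hashes
        self._ring = []            # (hash, node) pairs, aligned with _keys
        for node in nodes:
            self.add_node(node)

    def _hash(self, value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            idx = bisect.bisect(self._keys, h)
            self._keys.insert(idx, h)
            self._ring.insert(idx, (h, node))

    def remove_node(self, node):
        pairs = [(h, n) for h, n in self._ring if n != node]
        self._ring = pairs
        self._keys = [h for h, _ in pairs]

    def get_node(self, key):
        """Walk clockwise from the key's hash to the next virtual node."""
        if not self._ring:
            return None
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[idx][1]
```

When a node is removed, only the keys that were mapped to it move; every other key keeps its assignment, which is exactly the small-fraction guarantee described above.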

Read‑write separation is a common design for sites with heavily skewed read/write ratios; because it is relatively simple and cheap to implement, it is usually the first step taken to alleviate storage bottlenecks.

The benefit of read‑write separation stems from disk mechanics: sequential reads are fast, while writes require locking and can become a performance choke point under concurrency; moving writes to a dedicated primary and serving reads from one or more replicas improves overall throughput.
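In the data-access layer this usually comes down to a small router. A minimal sketch, with hypothetical connection strings, that sends write statements to the single primary and spreads reads across replicas:

```python
import random

class ReadWriteRouter:
    """Route SQL to the primary or a read replica based on the statement verb."""

    READ_VERBS = {"SELECT", "SHOW", "EXPLAIN"}

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)

    def route(self, sql):
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.READ_VERBS and self.replicas:
            return random.choice(self.replicas)  # load-balance reads over replicas
        return self.primary                      # writes always hit the primary

router = ReadWriteRouter(
    primary="mysql://db-primary/shop",
    replicas=["mysql://db-replica-1/shop", "mysql://db-replica-2/shop"],
)
```

Note that replicas lag the primary slightly, so reads that must see a just-committed write still need to go to the primary.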

Beyond read‑write separation, caching moves frequently accessed data into memory (orders of magnitude faster than disk), and search technologies complement databases by handling flexible, high‑volume random access patterns that traditional relational queries struggle with.
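Caching is typically applied with the cache-aside pattern. A minimal sketch, using a toy in-memory cache and database as stand-ins for memcached and a real DB (all names illustrative):

```python
class DictCache:
    """Toy stand-in for memcached: a plain dict; TTL is accepted but ignored."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value, ttl):
        self._data[key] = value

class FakeDB:
    """Toy database that counts how often it is actually queried."""
    def __init__(self, rows):
        self.rows = rows
        self.hits = 0
    def query(self, product_id):
        self.hits += 1
        return self.rows.get(product_id)

def get_product(product_id, cache, db, ttl=300):
    """Cache-aside read: serve from memory, fall back to the database on a miss."""
    key = f"product:{product_id}"
    value = cache.get(key)
    if value is None:                 # cache miss
        value = db.query(product_id)  # slow, disk-backed lookup
        if value is not None:
            cache.set(key, value, ttl)
    return value
```

After the first miss populates the cache, repeated reads never touch the database, which is where the orders-of-magnitude win over disk comes from.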

When a single database becomes a bottleneck, vertical partitioning (splitting large tables such as product or transaction tables into separate databases) reduces load on the primary instance, though it introduces complexity for transactional queries.
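Vertical partitioning can be as simple as a table-to-database map in the data-access layer. A sketch with hypothetical connection strings:

```python
# Heavy tables are carved out into dedicated databases; everything else
# stays on the main instance. The DSN strings here are illustrative only.
TABLE_TO_DSN = {
    "products":     "mysql://db-products/shop",
    "transactions": "mysql://db-transactions/shop",
}
DEFAULT_DSN = "mysql://db-main/shop"

def dsn_for_table(table):
    """Return the connection string for the database that owns this table."""
    return TABLE_TO_DSN.get(table, DEFAULT_DSN)
```

The complexity mentioned above shows up here: once `products` and `transactions` live in different databases, cross-table joins and single transactions spanning both are no longer possible and must be handled in application code.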

If vertical partitioning is insufficient, horizontal sharding distributes a single large table across multiple databases, further scaling the system.
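Horizontal sharding splits the rows of one table across databases by a shard key. A minimal sketch using `user_id` modulo the shard count (DSNs hypothetical):

```python
# Orders for all users are spread across four databases by shard key.
SHARDS = [
    "mysql://orders-shard-0/shop",
    "mysql://orders-shard-1/shop",
    "mysql://orders-shard-2/shop",
    "mysql://orders-shard-3/shop",
]

def shard_for_user(user_id, shards=SHARDS):
    """All orders of one user land on the same shard, so per-user queries stay local."""
    return shards[user_id % len(shards)]
```

Modulo sharding has the same weakness as "hash mod N" caching: changing the shard count moves most rows, which is why range-based or consistent-hashing schemes are often preferred when resharding is expected.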

The article concludes with a roadmap for tackling large‑scale website data bottlenecks: single‑instance DB → read‑write separation → caching → search → vertical partitioning → horizontal sharding.

These steps provide a logical progression for architects to diagnose and resolve performance issues in high‑traffic web applications.

Tags: caching, read/write splitting, vertical partitioning, Memcached, horizontal sharding, 503 errors, database bottlenecks
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
