Understanding Database Vertical and Horizontal Splitting, Their Challenges, and Practical Solutions
The article explains the concepts of vertical and horizontal database splitting, compares their technical difficulties, discusses the impact on joins and transactions, and proposes practical approaches such as careful SQL refactoring, ORM decoupling, and selective use of distributed‑transaction techniques to maintain system stability and scalability.
When a website reaches storage bottlenecks, developers often consider vertical and horizontal database splitting; vertical splitting moves different business units to separate databases, while horizontal splitting distributes the same business unit’s data across multiple databases.
Horizontal splitting is finer‑grained and technically more complex than vertical splitting, leading to higher implementation cost and risk, so simple solutions should be preferred whenever possible.
The author, a former Java engineer, shares experience that many enterprises limit database features (no foreign keys, limited stored procedures) and offload most computation to application code, emphasizing that storage is the primary concern for large‑scale internet systems.
Key challenges of horizontal splitting include modifying join queries that span split tables and handling transactions that become distributed, especially when logical deletes are implemented as updates.
Three practical methods are presented: (1) analyze and rewrite join queries, possibly duplicating tables when necessary; (2) rewrite affected SQL after splitting, using multi‑step queries and in‑memory merging; (3) adopt an ORM‑based data access layer to decouple service code from database changes.
Traditional ORM tools (e.g., Hibernate, MyBatis) target single databases and cannot fully address cross‑database operations, so a custom cross‑database ORM or a service‑layer API that abstracts database access is recommended.
Vertical splitting also introduces distributed transaction concerns; the article briefly mentions XA, CAP/BASE, and Paxos as theoretical foundations, warning that heavyweight distributed‑transaction solutions can become hard‑to‑control “time‑bombs.”
In summary, the author advises: prioritize storage over computation, migrate computation to application code, design a stable data‑access layer to isolate service logic from database changes, treat splitting as a step toward a distributed database architecture, and address distributed transactions with careful, problem‑specific analysis rather than generic heavyweight solutions.
Architect
Professional architect sharing high‑quality architecture insights. Topics include high‑availability, high‑performance, high‑stability architectures, big data, machine learning, Java, system and distributed architecture, AI, and practical large‑scale architecture case studies. Open to ideas‑driven architects who enjoy sharing and learning.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.