Why and How to Split Applications and Databases: Practical Guidance and Best Practices
This article explains why monolithic applications get split, how to prepare along several dimensions, and how to define service boundaries. It then covers vertical and horizontal database splitting, global ID generation, migration steps, cut‑over strategies, consistency guarantees, and the stability measures needed for a successful transformation.
Why Split?
Monolithic systems become unsustainable for several reasons: severe coupling between applications, poor business extensibility, outdated and hard‑to‑maintain code, limited scalability, and a growing number of pitfalls.
Preparation Before Splitting
Understanding Business Complexity
Assess how tightly the system is bound to the business. In practice the system relates to the business like a heart pacemaker to a patient, not like a vehicle that can be swapped out whenever it no longer fits. Involve product, development, and business teams so that everyone agrees on a platform‑style architecture.
Defining Service Boundaries
Adopt high cohesion, low coupling, and single‑responsibility principles when drawing service boundaries. The "Huluwa" (Calabash Brothers) analogy illustrates the idea: each brother has one independent ability, and the clearly bounded abilities can later be combined into a unified platform.
Setting Post‑Split Goals
Set concrete objectives for each phase of the split, for example separating databases and applications in the first phase and redesigning data models later. Bounded goals keep the split from deepening endlessly and protect team morale.
DB Splitting Practice
Vertical splitting moves tables to dedicated databases (e.g., separating message tables from organization tables), while horizontal splitting shards large tables across multiple databases.
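Horizontal splitting needs a deterministic rule that maps each row to a shard. The sketch below shows one common approach, a stable hash of the shard key taken modulo the number of slots. The shard counts, the naming pattern, and the function names are assumptions for illustration and do not come from the article.

```python
import zlib

NUM_DATABASES = 4          # assumed layout: 4 databases
TABLES_PER_DATABASE = 8    # assumed layout: 8 tables in each database


def route(shard_key):
    """Map a shard key to a (database index, table index) pair."""
    # crc32 is stable across processes and restarts, unlike the built-in
    # hash() for strings, which is randomized per process.
    total_slots = NUM_DATABASES * TABLES_PER_DATABASE
    slot = zlib.crc32(shard_key.encode("utf-8")) % total_slots
    return slot // TABLES_PER_DATABASE, slot % TABLES_PER_DATABASE


def table_name(base, shard_key):
    """Build a hypothetical physical table name such as message_db1.message_05."""
    db_index, table_index = route(shard_key)
    return f"{base}_db{db_index}.{base}_{table_index:02d}"
```

Because the modulus is fixed, changing the number of shards later moves most keys, so the slot count is usually chosen with headroom.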
Global ID Generation
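Replace auto‑increment primary keys with globally unique IDs to prevent primary‑key conflicts during migration and rollback. Options include a Snowflake‑style generator, a dedicated MySQL table that hands out IDs, and a dual‑table odd/even strategy in which one ID table issues odd values and the other issues even values.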
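Example of a conflict scenario: with auto‑increment keys, the old and new tables can each assign the same ID to different rows, for instance when writes go to the new table for a while and a rollback then sends writes back to the old table. The two data sets can no longer be merged without collisions.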
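Independent auto-increment counters in the old and new tables can hand out the same value to different rows. A Snowflake-style generator avoids this by composing each ID from a timestamp, a worker ID, and a per-millisecond sequence. The following is a minimal sketch under assumed parameters (a 41/10/12 bit layout and an arbitrary custom epoch); the class name and the injectable clock are hypothetical, and this is not the original Snowflake implementation.

```python
import threading
import time


class SnowflakeLikeIdGenerator:
    """Layout: millisecond timestamp | 10-bit worker id | 12-bit sequence."""

    EPOCH_MS = 1_600_000_000_000   # arbitrary custom epoch (assumption)
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self._worker_id = worker_id
        self._clock = clock            # injectable so tests can fix the time
        self._lock = threading.Lock()
        self._last_ms = -1
        self._sequence = 0

    def next_id(self):
        with self._lock:
            now = self._clock()
            if now < self._last_ms:
                raise RuntimeError("clock moved backwards; refusing to issue IDs")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = self._clock()
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp_part = (now - self.EPOCH_MS) << (self.WORKER_BITS + self.SEQUENCE_BITS)
            worker_part = self._worker_id << self.SEQUENCE_BITS
            return timestamp_part | worker_part | self._sequence
```

Each application instance needs its own worker ID; how those are assigned (configuration, a registry, and so on) is outside this sketch.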
New Table Creation & Data Migration
Create the new tables with the utf8mb4 character set and make sure every required index is created. Run the full data sync during a low‑traffic period, then keep the new tables current with a binlog‑based incremental sync tool (e.g., Alibaba Canal or Otter).
SQL Refactoring for Cross‑DB Joins
Before cutting over, rewrite the join queries (there may be hundreds) that span tables now living in different databases, because cross‑database joins are not supported. Strategies include business decoupling, global tables, redundant fields, and in‑memory stitching of data fetched via RPC or local caches.
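In-memory stitching replaces a SQL join with two separate lookups that are combined in application code. The sketch below assumes a message table and an organization table that now live in different databases. The in-memory data and the functions `query_messages` and `batch_get_orgs` are hypothetical stand-ins for a local query and a batch RPC.

```python
# Stand-ins for two separate databases after the split (hypothetical data).
MESSAGE_DB = [
    {"msg_id": 1, "org_id": 10, "body": "hello"},
    {"msg_id": 2, "org_id": 20, "body": "status update"},
    {"msg_id": 3, "org_id": 10, "body": "bye"},
]
ORG_DB = {
    10: {"org_id": 10, "name": "Sales"},
    20: {"org_id": 20, "name": "Ops"},
}


def query_messages():
    """Hypothetical query that touches only the message database."""
    return list(MESSAGE_DB)


def batch_get_orgs(org_ids):
    """Hypothetical batch RPC to the organization service."""
    return {oid: ORG_DB[oid] for oid in org_ids if oid in ORG_DB}


def messages_with_org_name():
    """Equivalent of 'messages JOIN organizations', stitched in memory."""
    messages = query_messages()
    # One batched lookup for all distinct org IDs avoids one call per row.
    orgs = batch_get_orgs({m["org_id"] for m in messages})
    return [
        {**m, "org_name": orgs.get(m["org_id"], {}).get("name")}
        for m in messages
    ]
```

A message whose organization is missing gets `org_name` set to `None`, which mirrors a left join rather than an inner join.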
Cut‑Over Schemes
DB Stop‑Write Scheme: Pause writes, migrate, and resume. It is fast and low‑cost, but risky during peak periods and hard to roll back cleanly.
Dual‑Write Scheme: Write to both the old and new tables simultaneously. The system stays online and rollback is easier, at the cost of a longer process and higher write latency.
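The dual-write idea can be sketched as a repository that always writes to the old store and, while a switch is on, also writes to the new one. The class name, the two switches, and the dict-like stores are assumptions made for this example; a real implementation would wrap actual database access.

```python
class DualWriteRepository:
    """Writes go to the old store first; the new store is best-effort."""

    def __init__(self, old_store, new_store, write_new=True, read_new=False):
        self.old_store = old_store
        self.new_store = new_store
        self.write_new = write_new   # switch: also write to the new store
        self.read_new = read_new     # switch: serve reads from the new store
        self.failed_keys = []        # keys a repair job must reconcile later

    def save(self, key, row):
        self.old_store[key] = row    # old store remains the source of truth
        if self.write_new:
            try:
                self.new_store[key] = row
            except Exception:
                # Do not fail the request; record the gap for later repair.
                self.failed_keys.append(key)

    def get(self, key):
        store = self.new_store if self.read_new else self.old_store
        return store.get(key)
```

The two writes run one after the other, which is where the extra latency comes from. Rolling back amounts to turning `read_new` off, since the old store never stopped receiving writes.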
Switch Management
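Initialize feature switches to null rather than to a default value. If a switch defaults to, for example, "write to the old table", an instance that restarts and acts on the default before the configured value has loaded can write to the wrong place and produce dirty data.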
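One way to make the null state explicit is a switch with three states (not loaded, old, new) that refuses to answer until a value arrives. This is a minimal sketch; the class and exception names are hypothetical, and how the value is pushed (a configuration center, a file, and so on) is left out.

```python
from typing import Optional


class SwitchNotLoadedError(RuntimeError):
    """Raised when the switch is read before its value has been loaded."""


class WriteTargetSwitch:
    def __init__(self):
        # None means "unknown": no default that could silently route writes.
        self._write_new: Optional[bool] = None

    def load(self, value: bool) -> None:
        """Called when the configured value arrives."""
        self._write_new = value

    def write_new(self) -> bool:
        if self._write_new is None:
            # Failing fast is safer than guessing and writing dirty data.
            raise SwitchNotLoadedError("write-target switch not loaded yet")
        return self._write_new
```

Callers can treat `SwitchNotLoadedError` as a signal to reject or retry the request until startup has finished.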
Ensuring Consistency After Splitting
Distributed transactions are generally avoided because of their performance cost. Instead, use message‑based compensation or scheduled‑task compensation to achieve eventual consistency.
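Scheduled-task compensation can be sketched as a job that periodically scans for operations still pending after a grace period and retries them. The task fields, the thresholds, and the function name below are assumptions for illustration; in practice the tasks would come from a table and the function would be triggered by a scheduler.

```python
import time


def compensate(pending_tasks, perform, now=time.time, min_age_s=60, max_attempts=5):
    """Retry tasks that have stayed PENDING for at least min_age_s seconds."""
    for task in pending_tasks:
        if task["status"] != "PENDING":
            continue
        if now() - task["created_at"] < min_age_s:
            continue  # too fresh: the normal flow may still complete it
        task["attempts"] += 1
        try:
            perform(task)  # must be idempotent, since it can run more than once
            task["status"] = "DONE"
        except Exception:
            if task["attempts"] >= max_attempts:
                task["status"] = "FAILED"  # give up and hand over to a human
```

The grace period keeps the job from racing with requests that are still in flight, and the attempt limit keeps a permanently failing task from being retried forever.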
Ensuring Stability After Splitting
Adopt defensive programming, set timeouts, convert strong dependencies to weak ones, design minimal‑exposure interfaces, implement flow control, and establish SOPs for incident handling.
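Two of these measures, setting timeouts and turning a strong dependency into a weak one, can be combined in a small wrapper: the call gets a time budget, and on timeout or error the caller continues with a fallback value. The function name and pool size are assumptions for this sketch.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=4)  # assumed size, tune per workload


def call_weak_dependency(fn, timeout_s, fallback):
    """Run fn with a time budget; return fallback if it is slow or fails."""
    future = _pool.submit(fn)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The worker thread may keep running; we only stop waiting for it.
        future.cancel()
        return fallback
    except Exception:
        # A failing weak dependency must not fail the main flow.
        return fallback
```

For example, a page could render with an empty recommendation list as the fallback instead of failing when the recommendation service is slow.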
Summary
Prepare for pressure, decompose complex problems into testable, rollback‑able steps, and always have a SOP ready because the issues you fear will inevitably occur.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.