An Overview of Mycat Distributed Database Middleware: Architecture, Deployment, Sharding, and High Availability
This article introduces Mycat, an open‑source distributed database middleware that implements the MySQL protocol, explains its core concepts such as sharding, read/write separation, high availability, deployment architectures, optimization strategies, and provides practical configuration examples for schema, rule, and server files.
Mycat is an open‑source distributed database system that implements the MySQL protocol, acting as a proxy server; it does not store data itself but forwards SQL to backend MySQL instances, with its core function being horizontal and vertical sharding of tables across multiple databases.
From three perspectives: DBAs view Mycat as a MySQL server whose backend storage engines (InnoDB, MyISAM, etc.) guarantee data reliability; developers can connect to Mycat using standard MySQL clients (default port 8066) and should prefer simple SQL for sharded tables to achieve optimal performance; architects see Mycat as a powerful middleware enabling read/write separation, multi‑tenant development, and flexible architecture with upcoming intelligent optimization modules.
Typical deployment patterns include a general overall deployment and a dual‑active (双活) deployment across two data centers, illustrated by the following diagrams:
Database sharding optimization is discussed in two dimensions: vertical sharding (splitting by business modules into separate databases) and horizontal sharding (splitting large tables into many smaller tables based on rules such as user ID, date, or other fields). Access can be handled either by configuring multiple data sources in each application (client‑side mode) or by using a proxy layer (Mycat) that makes the sharding transparent to the application.
The main components of Mycat include logical schemas, tables, users, and DataNodes (physical MySQL instances). When a SQL request arrives, Mycat parses and routes it according to defined sharding rules, merges results from multiple DataNodes if needed, and returns the data to the client using a non‑blocking NIO model.
Key concepts of the middleware are:
Database middleware that abstracts distributed data as a single logical database.
Logical tables: sharding tables, non‑sharding tables, ER tables (parent‑child relationship), and global tables (small, rarely changed data replicated across all nodes).
Shard nodes (DataNode) that may consist of master‑slave pairs for high availability.
Shard rules that define how large tables are partitioned.
Configuration is performed via three XML files:
<schema.xml> – defines logical databases, tables, and DataNodes.
<rule.xml> – defines sharding algorithms and rules.
<server.xml> – defines user credentials, ports, and system variables.
Implementation of sharding is illustrated with a case of t_user and its child table t_user_detail :
<table name="t_user" dataNode="dn$1-32" rule="mod-long">
<childTable name="t_user_detail" primaryKey="id" joinKey="user_id" parentKey="user_id" />
</table>This configuration distributes t_user across DataNodes dn1‑dn32 using a modulo‑long rule, while t_user_detail is co‑located with its parent rows, enabling efficient local joins.
Global tables (e.g., t_area ) are defined as:
<table name="t_area" primaryKey="id" type="global" dataNode="dn1,dn2" />MySQL master‑slave replication is mentioned as a complementary technique for read/write separation, with replication modes (synchronous, asynchronous, semi‑asynchronous) illustrated by an accompanying diagram:
High‑availability architecture is also shown:
Mycat provides operational tools (Mycat‑Eye) for configuration management, monitoring (traffic, connections, threads, buffers, TPS, memory, JVM), and SQL analysis (read/write ratios, latency, top‑consuming statements). These features help administrators maintain performance and reliability of the distributed database system.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.