
An Overview of MyCat: Open‑Source Distributed Database Middleware and Its Core Features

MyCat is an open‑source distributed database middleware that transparently shards tables across multiple backend databases. It addresses connection overload and provides ER‑based sharding, global tables, AI‑driven catlets, and advanced read‑write separation, enabling low‑cost migration of single‑node databases to the cloud.

Art of Distributed System Architecture Design

Why is MyCat needed? In the cloud era, traditional single‑node databases run into performance limits, yet NoSQL stores cannot fully replace them, so scalable, sharded relational databases are required. MyCat aims to migrate existing single‑node databases to the cloud at low cost and to relieve the data‑storage bottlenecks that appear as a business grows.

Since its debut in 2014, MyCat has been adopted by over 60 projects (as of April 2015), mainly in telecom and internet sectors, handling tables with monthly volumes up to 3 billion rows.

What is MyCat? It is an open‑source distributed database system that implements the MySQL protocol. It acts as a database proxy for clients, forwarding MySQL or JDBC requests to multiple backend databases (MySQL, SQL Server, Oracle, DB2, PostgreSQL, MongoDB, etc.). Its core function is horizontal table partitioning (sharding) across backend nodes.

MyCat’s architecture abstracts the various storage backends as ordinary relational tables, so applications can keep using standard SQL, which reduces development effort.

Problems solved by MyCat:

Connection overload – MyCat centrally manages all data sources, making backend clusters transparent to front‑end applications.

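ER‑based sharding – Child rows are stored on the same node as their parent row, so a JOIN between parent and child tables can be executed locally on each node instead of across nodes. The article presents this as a feature unique to MyCat.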

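Global tables – A global table is kept on every node. Inserts and updates are applied to all nodes concurrently and reads can be served from any node, which improves read performance and lets sharded tables JOIN against the global table locally.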

AI‑driven catlets – Supports complex cross‑shard SQL and stored procedures via special annotations.
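The ER‑based sharding item above can be illustrated with a small routing sketch. This is not MyCat's implementation: the node names, the modulo rule, and both function names are assumptions made for illustration, and only the table names base_user and sam_glucose come from the annotation examples in this article. The point is simply that a child row is routed by its join key, so it ends up on the same node as its parent.

```python
# Hypothetical sketch of ER-style co-location; not MyCat's actual code.
NODES = ["dn1", "dn2", "dn3"]  # assumed data node names


def node_for_parent(parent_id: int) -> str:
    """Pick a data node for a parent row (e.g. base_user) by modulo on its key."""
    return NODES[parent_id % len(NODES)]


def node_for_child(child_row: dict, join_key: str = "user_id") -> str:
    """Route a child row (e.g. sam_glucose) by its join key, not its own id.

    Because the child reuses the parent's routing rule, parent and child
    always share a node and can be joined locally on that node.
    """
    return node_for_parent(child_row[join_key])


if __name__ == "__main__":
    user = {"id_": 7}
    glucose = {"id_": 1001, "user_id": 7}
    print(node_for_parent(user["id_"]), node_for_child(glucose))
```

Routing the child by its own primary key instead would scatter related rows across nodes, which is exactly the cross‑node JOIN that ER‑based sharding is meant to avoid.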

Example annotations (code snippets):

/*!MyCat:catlet=demo.catlets.ShareJoin*/ select bu.*, sg.* from base_user bu, sam_glucose sg where bu.id_=sg.user_id;
/*!MyCat: sql=select * from base_user where id_=1;*/ CALL proc_test();
/*!MyCat:catlet=demo.catlets.BatchInsertSequence*/ insert into sam_test(name_) values('t1'),('t2');
/*!MyCat:catlet=demo.catlets.BatchGetSequence*/ SELECT MyCat_get_seq('MyCat_TEST',100);

These annotations enable cross‑shard joins, stored‑procedure execution, batch inserts with automatic primary‑key generation, and bulk sequence retrieval without explicit key handling in SQL.
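To make the shape of these annotations concrete, here is a minimal sketch of how a proxy could separate the leading hint from the statement it decorates. The regular expression and the function name split_hint are assumptions for illustration; MyCat's own parser is not shown in this article and may behave differently, for example with respect to case or whitespace.

```python
import re

# Assumed pattern for a leading /*!MyCat:key=value*/ hint, modelled on the
# examples above. It stops at the first "*/" after the hint prefix.
HINT_RE = re.compile(
    r"^\s*/\*!MyCat:\s*(\w+)\s*=\s*(.*?)\s*\*/\s*(.*)$",
    re.DOTALL | re.IGNORECASE,
)


def split_hint(statement: str):
    """Return (hint_kind, hint_value, remaining_sql).

    hint_kind and hint_value are None when the statement carries no hint.
    """
    match = HINT_RE.match(statement)
    if match is None:
        return None, None, statement.strip()
    kind, value, rest = match.groups()
    return kind.lower(), value, rest.strip()


if __name__ == "__main__":
    sql = "/*!MyCat: sql=select * from base_user where id_=1;*/ CALL proc_test();"
    print(split_hint(sql))
```

In this reading, a catlet hint names the class that handles the statement, while a sql hint supplies a routing statement whose target node is then used to run the real statement (here, the stored procedure call).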

MyCat also provides advanced read‑write separation that takes MySQL replication lag into account (for example, allowing reads from a slave only while its lag is below 100) and automatically excludes unhealthy read hosts.
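The lag‑aware selection described above can be sketched as a simple filter over the read hosts. The ReadHost type, the function names, the round‑robin choice, and the fallback to the write host when no replica qualifies are all assumptions for illustration, not MyCat's documented behavior; only the threshold of 100 comes from the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReadHost:
    name: str
    healthy: bool          # result of the latest health check
    lag: Optional[int]     # reported replication lag; None if unknown


def eligible_read_hosts(hosts: List[ReadHost], max_lag: int = 100) -> List[ReadHost]:
    """Keep hosts that are healthy and whose known lag is below max_lag."""
    return [h for h in hosts if h.healthy and h.lag is not None and h.lag < max_lag]


def pick_read_target(hosts: List[ReadHost], writer: str, counter: int,
                     max_lag: int = 100) -> str:
    """Round-robin over eligible replicas; fall back to the writer if none qualify."""
    candidates = eligible_read_hosts(hosts, max_lag)
    if not candidates:
        return writer  # assumed fallback so reads never see overly stale data
    return candidates[counter % len(candidates)].name


if __name__ == "__main__":
    hosts = [ReadHost("r1", True, 5), ReadHost("r2", True, 250), ReadHost("r3", False, 0)]
    print(pick_read_target(hosts, writer="w1", counter=0))
```

Treating an unknown lag as ineligible is the conservative choice here: a replica whose replication status cannot be read is handled like one that is too far behind.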

Technical principle: MyCat intercepts incoming SQL, performs sharding, routing, read‑write separation, and caching analysis, forwards the statement to the appropriate backend, processes the results, and returns a unified response to the client.

In practice, a table like Orders can be split into three data nodes across two MySQL servers using a string‑enumeration sharding rule on the prov column. MyCat parses the SQL, determines the target nodes, dispatches the query, aggregates results, and handles pagination, ORDER BY, and complex JOIN scenarios using its innovative ER sharding, global tables, and AI‑driven catlets.
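The Orders example can be sketched end to end as routing by an enumerated prov value followed by a merge of per‑node results. The province values, node names, and function names below are assumptions for illustration, since the article does not show the actual rule file, and the merge is a generic scatter‑gather technique rather than MyCat's own code.

```python
import heapq
from itertools import islice

# Assumed enumeration rule: each known prov value maps to one data node.
PROV_TO_NODE = {"guangdong": "dn1", "beijing": "dn2", "shanghai": "dn3"}
ALL_NODES = sorted(set(PROV_TO_NODE.values()))


def route(prov=None):
    """Return the data nodes a query on Orders must visit.

    With a prov value in the WHERE clause, one node is enough; without it,
    the query is broadcast to every node. An unmapped value raises KeyError
    in this sketch.
    """
    if prov is None:
        return ALL_NODES
    return [PROV_TO_NODE[prov]]


def merge_sorted(per_node_rows, key, offset, limit):
    """Merge per-node results, each already sorted by `key`, then paginate."""
    merged = heapq.merge(*per_node_rows, key=lambda row: row[key])
    return list(islice(merged, offset, offset + limit))


if __name__ == "__main__":
    results = [
        [{"id": 1, "amount": 10}, {"id": 4, "amount": 40}],  # from dn1
        [{"id": 2, "amount": 20}],                           # from dn2
        [{"id": 3, "amount": 30}, {"id": 5, "amount": 50}],  # from dn3
    ]
    print(route("beijing"), merge_sorted(results, "amount", offset=1, limit=3))
```

For the pagination to be correct, each node has to return its first offset + limit rows in sorted order, because the proxy cannot know in advance which node holds the rows that fall on the requested page.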

Future roadmap: strengthen middleware capabilities with richer plugins, intelligent optimization, comprehensive monitoring, and operational tools for online scaling and migration; further integrate with big‑data stream engines (Spark Streaming, Storm) to provide fast OLAP operations, real‑time analytics, and advanced algorithms.
