Databases 10 min read

SQL Refactoring Guide Based on Distributed Middleware

This article explains how distributed middleware processes simple SQL, ER‑based joins, and cross‑database joins, analyzes their performance impacts, and provides practical recommendations for optimizing SQL statements by using shard conditions, limiting clauses, low‑cardinality columns, and table ordering.

Aikesheng Open Source Community

Apr 17, 2019

SQL Refactoring Guide Based on Distributed Middleware

The content originates from a presentation at the GOPS Global Operations Conference, where the speaker shared a "SQL Refactoring Guide Based on Distributed Middleware" to help readers understand distributed architecture and SQL transformation practices.

1. Simple SQL in Middleware – Simple SQL is defined as a single‑table query. Middleware treats such queries like regular SQL but may split data according to configuration (e.g., sharding by user ID). When a query contains shard information, the middleware routes it to the appropriate DB; otherwise it broadcasts the query to all shards, waiting for the slowest response, which can degrade performance as the number of shards grows.

2. ER Table Join – When two tables have a logical parent‑child relationship (e.g., users and orders), the middleware defines an ER relationship. Queries that involve only tables linked by ER relationships are called "ER table JOIN". The middleware stores child‑table rows on the same DB node as the parent’s shard, allowing the query to be sent directly to that node. Limitations include one‑directional relationships and the need to declare ER links explicitly.

3. Cross‑Database Table JOIN – For queries that cannot be expressed as simple SQL or ER joins, the middleware performs a cross‑database join. It retrieves data from each involved DB, sorts and archives rows by join key, then compares keys to produce the final result. This approach incurs memory, CPU, and network overhead, and may generate large intermediate results (e.g., Cartesian products) if table order is not optimized.

To mitigate these costs, the article suggests adding sufficient filtering conditions, using low‑cardinality columns for joins, and manually adjusting table order to improve the execution plan.

4. Overall Recommendations – Use shard conditions whenever possible, minimize the amount of data processed in complex queries, and apply the above optimization techniques to achieve better performance in distributed middleware environments.

Additional community resources, PPT download links, and upcoming events are listed at the end of the article.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Performance Optimization sql Database distributed middleware Cross-DB Join ER Join

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.