Mastering TiDB SQL Layer: Architecture, Optimization, and Real-World Cases
This article explains TiDB's SQL layer architecture, details how SQL requests are processed and optimized, demonstrates the use of EXPLAIN and EXPLAIN ANALYZE, and presents two real‑world case studies that illustrate common performance pitfalls and practical optimization techniques.
SQL Layer Architecture
In TiDB, a user's SQL request is sent directly or via a load balancer to the TiDB Server, which parses the MySQL protocol packet, performs syntax and semantic analysis, creates and optimizes a query plan, executes the plan, and retrieves data from the TiKV cluster before returning results to the user.
SQL Optimization Process Overview
TiDB parses the original query text, applies logical equivalence transformations to produce a more tractable logical plan, then selects a final physical plan based on data distribution and operator cost. When executing PREPARE statements, TiDB can cache plans to reduce planning overhead.
Using EXPLAIN to View Execution Plans
TiDB's EXPLAIN output consists of five columns—id, estRows, task, access object, and operator info—each describing an operator in the plan.
EXPLAIN ANALYZE Output Format
Unlike EXPLAIN, EXPLAIN ANALYZE actually runs the SQL, records runtime information, and returns additional columns such as actRows, execution info, memory, and disk usage.
SQL Optimization Case Best Practices
Case 1: Index Mis‑selection Causing Slow SQL
Scenario: After migrating a database to TiDB, a query that runs under 1 s on MySQL exceeds 30 s on TiDB.
The execution plan shows that the slow part lies in the innermost three joins (m join d) join taskd. By reordering these joins and forcing TiDB to use the same index as MySQL with a USE INDEX hint, the query time drops dramatically.
<code>INNER JOIN taskd
ON taskd.no = d.no
AND taskd.o_no = d.o_no
AND taskd.d_no = d.d_no
AND taskd.w_no = d.w_no
AND taskd.g_no = d.g_no
AND IF(NULL = d.MD5_VALUE, 1, d.MD5_VALUE) = IF(NULL = d.MD5_VALUE, 1, taskd.MD5_VALUE)
AND taskd.yn = 0</code>Optimization Ideas
1. Observe EXPLAIN ANALYZE results and identify the slow inner joins.
2. Compare with MySQL's plan; adjust the join order of the three tables accordingly.
3. After reordering, add a USE INDEX hint to make TiDB follow the same index as MySQL.
Case 2: Wrong Join Order Causing Slow SQL
Scenario: A query runs in milliseconds on MySQL but takes 18 s on TiDB.
Before and after optimization execution plans are shown, with the final plan using a hint to improve performance.
Optimization Ideas
1. TiDB spends over 10 s because it inaccurately estimates the wps table, causing a costly hash join and long back‑lookup on the pri table.
2. The inaccurate estimate stems from TiDB converting a range scan on w into a point lookup and using index statistics for estimation.
3. The point‑lookup estimate relies on CMSketch; hash collisions in the sketch may lead to large errors.
Extended Discussion
Adding hints can improve plans but hard‑codes SQL, making changes difficult for ORM‑based applications. SQL Binding (SPM) allows DBAs to adjust execution plans without modifying the SQL text, offering a more elegant optimization approach.
Jingdong Cloud, together with PingCAP, offers a TiDB cloud service that supports both OLTP and OLAP workloads, provides automatic horizontal scaling, strong consistency, simple deployment, online schema changes, and full MySQL protocol compatibility, reducing migration cost.
JD Cloud Developers
JD Cloud Developers (Developer of JD Technology) is a JD Technology Group platform offering technical sharing and communication for AI, cloud computing, IoT and related developers. It publishes JD product technical information, industry content, and tech event news. Embrace technology and partner with developers to envision the future.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.