Databases 15 min read

Understanding How MySQL Converts Uncorrelated Subqueries to Correlated Subqueries and Their Execution Plans

This article explains the process by which MySQL transforms uncorrelated IN subqueries into correlated subqueries, details the differences in EXPLAIN output columns, explores optimizer trace information, compares materialization versus correlated execution costs, and provides best‑practice recommendations for writing efficient queries.

Aikesheng Open Source Community

Dec 27, 2022

Understanding How MySQL Converts Uncorrelated Subqueries to Correlated Subqueries and Their Execution Plans

The article begins by introducing the conversion of uncorrelated subqueries (specifically IN subqueries) into correlated subqueries in MySQL 8.0.29, using the Sakila sample database for demonstration.

1. Explain type and ref column display logic

Two example queries are shown. The first is a hand‑crafted correlated subquery, and the second is an uncorrelated IN subquery that MySQL converts. The EXPLAIN output for each query is displayed, highlighting the type and ref columns.

-- Disable semi‑join optimization
SET optimizer_switch="semijoin=off";

-- Hand‑crafted correlated subquery
EXPLAIN SELECT * FROM city
WHERE city_id < 100 AND EXISTS (
  SELECT city_id FROM address
  WHERE city.city_id = address.city_id
);

-- Uncorrelated IN subquery (will be converted)
EXPLAIN SELECT * FROM city
WHERE city_id < 100 AND city_id IN (
  SELECT city_id FROM address
);

The type column shows ref for the correlated query and index_subquery for the converted query, while the ref column shows the actual field or func when the condition is generated by the conversion.

2. Optimizer trace

Enabling optimizer_trace reveals internal decision‑making during query preparation, optimization, and execution. The trace can be queried from information_schema.OPTIMIZER_TRACE. Important fields such as QUERY, TRACE, MISSING_BYTES_BEYOND_MAX_MEM_SIZE, and INSUFFICIENT_PRIVILEGES are described.

SET optimizer_trace = "enabled=on";
SELECT * FROM information_schema.OPTIMIZER_TRACE;

System variables optimizer_trace_offset and optimizer_trace_limit control how many recent statements are kept.

3. IN subquery conversion strategies

MySQL can choose among three strategies: convert to a semi‑join, materialize the subquery, or convert to a correlated subquery. When a semi‑join is not possible, the optimizer compares the cost of materialization versus correlated execution using the trace.

{
  "execution_plan_for_potential_materialization": {
    "subq_mat_decision": {
      "parent_fanouts": [{
        "select#": 1,
        "subq_attached_to_table": true,
        "table": "`city`",
        "fanout": 99,
        "cacheable": true
      }],
      "cost_to_create_and_fill_materialized_table": 123.849,
      "cost_of_one_EXISTS": 0.349669,
      "number_of_subquery_evaluations": 99,
      "cost_of_materialization": 133.749,
      "cost_of_EXISTS": 34.6172,
      "chosen": false
    }
  }
}

The trace shows that materialization is more expensive (cost ≈ 133.75) than executing the subquery as a correlated EXISTS (cost ≈ 34.62), so the optimizer selects the correlated approach.

Cost calculations are explained:

cost_of_materialization = cost_to_create_and_fill_materialized_table + number_of_subquery_evaluations × 0.1

(where 0.1 is memory_temptable_row_cost), and

cost_of_EXISTS = cost_of_one_EXISTS × number_of_subquery_evaluations

SELECT cost_name, cost_value, default_value
FROM mysql.server_cost
WHERE cost_name = 'memory_temptable_row_cost';

4. Execution flow of a correlated subquery

The main query reads rows from city, checks the non‑subquery predicates first, and for each qualifying row executes the correlated subquery (e.g., SELECT city_id FROM address WHERE address.city_id = 8). This process repeats for all qualifying rows.

SELECT * FROM city
WHERE city_id < 100 AND EXISTS (
  SELECT city_id FROM address
  WHERE city.city_id = address.city_id
);

The optimizer trace confirms that the subquery is executed once per qualifying outer row (99 times in the example).

5. Best practices

Place predicates that do not involve subqueries before those that do, so MySQL can filter rows early and avoid unnecessary subquery executions.

6. Summary

After conversion, type shows unique_subquery (alias for eq_ref) or index_subquery (alias for ref / ref_or_null).

The ref column may display func because the generated condition references the outer query field indirectly.

If a semi‑join is impossible, MySQL chooses between materialization and correlated execution based on estimated cost, visible via optimizer trace.

The article also outlines the execution steps of a correlated subquery and recommends ordering WHERE clauses for performance.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

mysql Database Performance Optimizer_trace subquery

Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.