Databases 16 min read

How to Speed Up Slow SQL Queries: Choosing Driver Tables and Join Strategies

This article analyzes a slow SQL query by examining driver tables, execution plans, and table row counts, then demonstrates three optimization approaches—setting a small table as driver, setting a large table as driver, and switching to hash joins—showing their impact on execution time and providing practical guidelines for selecting the best strategy.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
How to Speed Up Slow SQL Queries: Choosing Driver Tables and Join Strategies

1. Problem Background

1.1 Problem Description

During SQL optimization, we often use driver tables or change the join method to improve performance. The following case study illustrates their differences and the conditions required for each scenario.

SQL duration: 11.25s

-- Simplified IN conditions
SELECT DISTINCT
    STORE_ID,
    PRODUCT_ID
FROM (
    SELECT DISTINCT
        ASP.PRODUCT_ID,
        t.STORE_ID
    FROM CT_ACT A
    JOIN CT_ACT_STAGE CAS ON A.ACT_ID = CAS.ACT_ID
    JOIN CT_ACT_STAGE_PRODUCT ASP ON CAS.STAGE_ID = ASP.STAGE_ID
    AND ASP.PRODUCT_STATUS = '1'
    AND ASP.PRODUCT_ID IN ('10185219','10382854')
    JOIN CT_STORE_PRODUCT_REL t ON ASP.PRODUCT_ID = t.PRODUCT_ID
    AND t.RELATIONSHIP_STATUS = '01'
    AND t.STORE_ID IN ('299800000149313','299800000148811','a2f162ae0fbe47c9b7b762ed27deb9b1','7787a5cb102744088f46b381ee667fd9')
);

1.2 View Execution Plan

===============================================================================================
|ID|OPERATOR            |NAME                                            |EST. ROWS|COST|
-----------------------------------------------------------------------------------------------
|0 |MERGE DISTINCT      |                                                |1        |565 |
|1 |SORT                |                                                |1        |565 |
|2 |NESTED-LOOP JOIN    |                                                |1        |565 |
|3 |NESTED-LOOP JOIN    |                                                |1        |553 |
|4 |NESTED-LOOP JOIN    |                                                |1        |276 |
|5 |TABLE SCAN          |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID)       |1        |264 |
|6 |TABLE GET           |CAS                                             |1        |24  |
|7 |TABLE SCAN          |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|1        |552 |
|8 |TABLE SCAN          |A(UK_CT_ACT_ACT_ID)                             |1        |24  |
===============================================================================================
Outline Data:
BEGIN_OUTLINE_DATA
NO_USE_HASH_AGGREGATION(@"SEL$2")
LEADING(@"SEL$2" ((("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" )"nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))
INDEX(@"SEL$2" "nctmbasedb.ASP"@"SEL$2" "IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID")
FULL(@"SEL$2" "nctmbasedb.CAS"@"SEL$2" )
INDEX(@"SEL$2" "nctmbasedb.A"@"SEL$2" "UK_CT_ACT_ACT_ID")
INDEX(@"SEL$2" "nctmbasedb.t"@"SEL$2" "IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID")
END_OUTLINE_DATA

1.3 Check Table Row Counts

Table Name

Alias

Row Count

CT_ACT

A

9912

CT_ACT_STAGE

CAS

116467

CT_ACT_STAGE_PRODUCT

ASP

9737755

CT_STORE_PRODUCT_REL

t

6563720

1.4 Analysis Process

All tables use indexes; CAS uses its primary key index, so its join columns are not examined. ASP is the driver table; using a large table as driver is sub‑optimal.

Two large tables ASP and t have highly selective WHERE conditions.

View Outline Data:

USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))

Note: ASP is the driver table, so its join condition is not displayed.

LEADING(@"SEL$2" (("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" )"nctmbasedb.A"@"SEL$2" )

The optimizer should first join ASP and CAS in the nctmbasedb schema.

Then join t with the result of ASP and CAS.

Finally join A with the previous result.

Summary

Tables A, t, and CAS use LNESTED-LOOP JOIN. A and CAS have small row counts, so NLJ is expected.

Table t is large, but its WHERE filter on t.STORE_ID is effective, so a hash join is considered. ASP is a large driver table, which is not optimal.

2. SQL Optimization

2.1 Option 1: Set Small Table (A) as Driver

2.1.1 Specify Driver Table

/*+leading(A) use_nl(A,CAS,ASP,t) */

The query ran longer than 30 seconds and was manually aborted.

The execution plan is equivalent to:

LEADING(@"SEL$2" ((("nctmbasedb.A"@"SEL$2" "nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" ))"nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))

Table t is large and uses USE_NL, causing the timeout.

Table A is small but lacks a direct WHERE filter, so it cannot use an index efficiently and is not a good driver.

2.1.2 View Execution Plan

===========================================================================================================
|ID|OPERATOR            |NAME                                            |EST. ROWS|COST |
-----------------------------------------------------------------------------------------------------------
|0 |MERGE DISTINCT      |                                                |1        |127753|
|1 |SORT                |                                                |1        |127753|
|2 |NESTED-LOOP JOIN    |                                                |1        |127753|
|3 |NESTED-LOOP JOIN    |                                                |1        |127474|
|4 |NESTED-LOOP JOIN    |CARTESIAN                                        |4956     |5911 |
|5 |TABLE SCAN          |A(UK_CT_ACT_ACT_ID)                             |9912     |3835 |
|6 |MATERIAL            |                                                |1        |264  |
|7 |TABLE SCAN          |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID)        |1        |264  |
|8 |TABLE GET           |CAS                                             |1        |24   |
|9 |TABLE SCAN          |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|1        |552 |
===========================================================================================================
Used Hint:
LEADING(@"SEL$2" ((("nctmbasedb.A"@"SEL$2" ))))
USE_NL(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" ))

2.2 Option 2: Set Large Table (t) as Driver

2.2.1 Set t Table as Driver

Execution time: 0.19 s /*+leading(t (ASP,CAS)A) use_nl(t,A,CAS,ASP) */ Although t is large, it has an effective filter condition.

LEADING(@"SEL$2" (("nctmbasedb.t"@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" ))"nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))

2.2.2 View Execution Plan

===============================================================================================
|ID|OPERATOR            |NAME                                            |EST. ROWS|COST |
-----------------------------------------------------------------------------------------------
|0 |MERGE DISTINCT      |                                                |1        |4624 |
|1 |SORT                |                                                |1        |4624 |
|2 |NESTED-LOOP JOIN    |                                                |1        |4624 |
|3 |HASH JOIN           |                                                |1        |4612 |
|4 |TABLE SCAN          |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|3414     |1848 |
|5 |NESTED-LOOP JOIN    |                                                |1        |276 |
|6 |TABLE SCAN          |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID)        |1        |264 |
|7 |TABLE GET           |CAS                                             |1        |24  |
|8 |TABLE SCAN          |A(UK_CT_ACT_ACT_ID)                             |1        |24 |
===============================================================================================
Used Hint:
LEADING(@"SEL$2" (("nctmbasedb.t"@"SEL$2" ("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" ))"nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.A"@"SEL$2" ))
USE_NL(@"SEL$2" ("nctmbasedb.CAS"@"SEL$2" ))

2.3 Option 3: Change Join Method

2.3.1 Switch to Hash Join

Adding /*+use_hash(t,ASP) */ reduces execution time to 0.06 s.

/*+use_hash(t,ASP) */

Used Hint shows:

USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))

2.3.2 View Execution Plan

===============================================================================================
|ID|OPERATOR            |NAME                                            |EST. ROWS|COST |
-----------------------------------------------------------------------------------------------
|0 |MERGE DISTINCT      |                                                |1        |4351 |
|1 |SORT                |                                                |1        |4351 |
|2 |HASH JOIN           |                                                |1        |4351 |
|3 |NESTED-LOOP JOIN    |                                                |1        |289 |
|4 |NESTED-LOOP JOIN    |                                                |1        |276 |
|5 |TABLE SCAN          |ASP(IDX_CT_ACT_STAGE_PRODUCT_PRODUCT_ID)        |1        |264 |
|6 |TABLE GET           |CAS                                             |1        |24 |
|7 |TABLE SCAN          |t(IDX_CT_STORE_PRODUCT_REL_STATUS_STORE_ID_PRODUCT_ID)|3414     |1848 |
===============================================================================================
Used Hint:
USE_HASH(@"SEL$2" ("nctmbasedb.t"@"SEL$2" ))

2.4 Summary of Optimizations

Optimization Method

Hint

Effect

Set small table (A) as driver

/*+leading(A) use_nl(A,CAS,ASP,t) */

SQL ran >30 s, aborted

Set large table (t) as driver

/*+leading(t (ASP,CAS)A) use_nl(t,A,CAS,ASP) */

Execution time 0.19 s

Switch to hash join

/*+use_hash(t,ASP) */

Execution time 0.06 s

3. Summary

3.1 Hash Join Use Cases

Large tables;

Effective filter conditions that reduce rows dramatically after filtering.

Both conditions must be satisfied.

3.2 Driver Table Distinction

LEADING(@"SEL$2" (("nctmbasedb.ASP"@"SEL$2" "nctmbasedb.CAS"@"SEL$2" )"nctmbasedb.t"@"SEL$2" )"nctmbasedb.A"@"SEL$2" ))

In the LEADING hint, the leftmost table inside the outermost parentheses is the driver table.

3.3 Driver Table Usage Questions

3.3.1 Is a Small Table Always Suitable as Driver?

Answer: Small tables are usually good drivers, but not always.

No or poor indexes

Uneven data distribution

Complex join conditions

3.3.2 Is a Large Table Never Suitable as Driver?

Answer: Large tables can be drivers in certain situations.

Strong filter conditions

Hash join with uniform data distribution

Well‑designed indexes on the large table

Cost‑based optimizer selection

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlSQL optimizationHash JoinOceanBaseDriver Tablejoin strategy
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.