MySQL Query Optimization and Index Tuning: From Subqueries to Joins and Composite Indexes
This article demonstrates MySQL query performance issues with large tables, shows how a subquery without indexes can take hours, and walks through step‑by‑step optimizations including adding single‑column indexes, converting to EXISTS and JOIN queries, creating composite indexes, covering indexes, and sorting improvements, achieving speedups of thousands of times.
The author, a senior architect, presents a scenario using MySQL 5.6 with three tables (Course, Student, SC) that contain millions of rows.
Table definitions:
CREATE TABLE Course(
    c_id int PRIMARY KEY,
    name varchar(10)
);
CREATE TABLE Student(
    id int PRIMARY KEY,
    name varchar(10)
);
CREATE TABLE SC(
    sc_id int PRIMARY KEY,
    s_id int,
    c_id int,
    score int
);
The initial query aims to find students who scored 100 in a specific subject:
SELECT s.* FROM Student s
WHERE s.id IN (
    SELECT s_id FROM SC sc
    WHERE sc.c_id = 0 AND sc.score = 100
);
This query took 30248.271 seconds. EXPLAIN showed type=ALL for every table, meaning full table scans with no index usage.
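The plan inspection described above can be reproduced by prefixing the statement with EXPLAIN. This is a minimal sketch: it assumes the Student key column is id, as in the table definitions, and the exact output depends on the data and the MySQL version.

```sql
-- Show the execution plan instead of running the query.
EXPLAIN
SELECT s.*
FROM Student s
WHERE s.id IN (
    SELECT sc.s_id
    FROM SC sc
    WHERE sc.c_id = 0 AND sc.score = 100
);
-- In the output, type = ALL means a full table scan,
-- and key = NULL means no index was chosen for that table.
```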
Adding single‑column indexes on the filter columns dramatically improved performance:
CREATE INDEX sc_c_id_index ON SC(c_id);
CREATE INDEX sc_score_index ON SC(score);
After indexing, the same query executed in about 1.054 seconds, a speed-up of roughly 28,700×.
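One way to confirm that the new indexes exist, and to see the statistics the optimizer works from, is SHOW INDEX. A small sketch follows; the Cardinality values are estimates and will vary with the data.

```sql
-- List the indexes on SC together with their estimated cardinality.
SHOW INDEX FROM SC;

-- Re-check the plan for the inner query. The key column may name one
-- of the new indexes, or both if the optimizer uses index merge.
EXPLAIN
SELECT sc.s_id
FROM SC sc
WHERE sc.c_id = 0 AND sc.score = 100;
```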
The author then explored alternative query forms. Converting the subquery to an EXISTS clause and testing join queries showed that MySQL can rewrite the IN subquery into an EXISTS form internally, and that join performance depends on which indexes are available. With the previously created indexes removed, a plain join ran in about 0.057 s, faster than the 1.054 s measured for the indexed subquery.
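The alternative forms are not reproduced here, so the statements below are a reconstruction from the table definitions, not the author's exact SQL. They assume the Student key column is id and reuse the index names created earlier.

```sql
-- EXISTS form: keeps a Student row when at least one matching SC row exists.
SELECT s.*
FROM Student s
WHERE EXISTS (
    SELECT 1
    FROM SC sc
    WHERE sc.s_id = s.id
      AND sc.c_id = 0
      AND sc.score = 100
);

-- JOIN form: filters SC first, then looks up the matching students.
SELECT s.*
FROM Student s
INNER JOIN SC sc ON sc.s_id = s.id
WHERE sc.c_id = 0 AND sc.score = 100;

-- Drop the earlier single-column indexes to test the join without them.
DROP INDEX sc_c_id_index ON SC;
DROP INDEX sc_score_index ON SC;
```

Unlike the IN and EXISTS forms, the join can return the same student more than once if several SC rows match, so the two are interchangeable only when each student has at most one matching row.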
To further optimize, an index on the foreign‑key column was added:
CREATE INDEX sc_s_id_index ON SC(s_id);
Testing the join again showed the time rise to about 1.076 s, because the optimizer chose a different execution plan.
Recognizing that the combination of c_id and score has high selectivity, a composite index was created:
CREATE INDEX sc_c_id_score_index ON SC(c_id, score);
With the composite index, the query time dropped to 0.007 seconds, confirming that multi-column indexes can be far more efficient when the columns together provide good discrimination.
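Whether a column combination is selective enough to justify a composite index can be estimated before creating it. The query below is an illustrative sketch; the column aliases are made up for this example, and the statement scans the whole table, so on millions of rows it is best run off-peak or against a sample.

```sql
-- Ratio of distinct values to total rows; values closer to 1 mean
-- each value (or pair of values) matches fewer rows.
SELECT
    COUNT(DISTINCT c_id) / COUNT(*)        AS c_id_selectivity,
    COUNT(DISTINCT score) / COUNT(*)       AS score_selectivity,
    COUNT(DISTINCT c_id, score) / COUNT(*) AS combined_selectivity
FROM SC;
```

If combined_selectivity is clearly higher than either single-column ratio, the pair narrows the result more than either column alone, which is consistent with the composite index outperforming the single-column ones.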
The article also discusses MySQL's index-merge optimization for multiple single-column indexes, the left-most prefix rule for composite indexes, covering indexes that allow the query to be satisfied entirely from the index, and the benefit of indexing columns used for ORDER BY. For example, adding an index on user_name reduced a sorted query time from 0.139 s to a much lower value.
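The left-most prefix rule, covering indexes, and sort indexes can be illustrated with a few EXPLAIN statements. This is a sketch under assumptions: it presumes sc_c_id_score_index on SC(c_id, score) exists, and the two indexes created below (sc_c_id_score_s_id_index and student_name_index) are hypothetical names introduced only for this example. Student.name stands in for the user_name column, whose table is not shown here.

```sql
-- Left-most prefix: both predicates start with the leading column c_id,
-- so the composite index can be used for the lookup.
EXPLAIN SELECT s_id FROM SC WHERE c_id = 0;
EXPLAIN SELECT s_id FROM SC WHERE c_id = 0 AND score = 100;

-- The leading column is missing, so the composite index cannot be used
-- for a lookup (another index on score may still be chosen if one exists).
EXPLAIN SELECT s_id FROM SC WHERE score = 100;

-- Hypothetical covering index: includes s_id so the query can be
-- answered from the index without reading the table rows.
CREATE INDEX sc_c_id_score_s_id_index ON SC(c_id, score, s_id);
EXPLAIN SELECT s_id FROM SC WHERE c_id = 0 AND score = 100;
-- "Using index" in the Extra column indicates a covering index was used.

-- Hypothetical index to support sorting.
CREATE INDEX student_name_index ON Student(name);
EXPLAIN SELECT id, name FROM Student ORDER BY name LIMIT 10;
-- Without a usable index, Extra typically shows "Using filesort".
```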
Key takeaways:
Use appropriate data types and keep column lengths short.
Create single‑column indexes on filter columns.
When multiple columns are used together in predicates, create a composite index respecting the left‑most prefix.
Consider covering indexes to avoid fetching full rows.
Index join columns, WHERE columns, ORDER BY columns, and GROUP BY columns.
Avoid functions or calculations on indexed columns in WHERE clauses, as they prevent index usage.
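The last takeaway can be checked with a pair of equivalent predicates. A brief sketch, assuming an index whose leading column is score (for example sc_score_index on SC(score)) is present:

```sql
-- Arithmetic on the indexed column: the index on score cannot be used
-- for a direct lookup, so every row (or index entry) is examined.
EXPLAIN SELECT sc_id FROM SC WHERE score + 10 = 100;

-- Same condition with the column left bare: the index can be used.
EXPLAIN SELECT sc_id FROM SC WHERE score = 90;
```

Moving the arithmetic to the constant side of the comparison keeps the predicate equivalent while leaving the column in a form the index can match.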
Top Architect
Top Architect focuses on sharing practical architecture knowledge, covering enterprise, system, website, large‑scale distributed, and high‑availability architectures, plus architecture adjustments using internet technologies. We welcome idea‑driven, sharing‑oriented architects to exchange and learn together.