Databases 21 min read

How I Cut a 20‑Second MySQL Query to 200 ms: A Step‑by‑Step Optimization Journey

When a critical e‑commerce report took over 20 seconds and triggered a flood of user complaints, a systematic investigation using EXPLAIN, slow‑query logs, and profiling revealed missing indexes, costly joins, and temporary tables, leading to a four‑stage rewrite that reduced execution time to sub‑second performance.

Raymond Ops
Raymond Ops
Raymond Ops
How I Cut a 20‑Second MySQL Query to 200 ms: A Step‑by‑Step Optimization Journey

Background and Problem

At 3 am an alert showed that a core business API response exceeded 20 seconds. The culprit was a seemingly simple MySQL query that aggregated order statistics for VIP merchants in Beijing over the past 30 days. The original query took 20.34 seconds, a 100× slowdown.

Schema Overview

-- Orders (5 M rows)
CREATE TABLE orders (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    order_no VARCHAR(32),
    merchant_id INT,
    user_id INT,
    amount DECIMAL(10,2),
    status TINYINT,
    created_at DATETIME,
    updated_at DATETIME
) ENGINE=InnoDB;

-- Order items (20 M rows)
CREATE TABLE order_items (
    id BIGINT PRIMARY KEY AUTO_INCREMENT,
    order_id BIGINT,
    product_id INT,
    quantity INT,
    price DECIMAL(10,2),
    created_at DATETIME
) ENGINE=InnoDB;

-- Merchants (100 k rows)
CREATE TABLE merchants (
    id INT PRIMARY KEY AUTO_INCREMENT,
    name VARCHAR(100),
    category VARCHAR(50),
    city VARCHAR(50),
    level TINYINT
) ENGINE=InnoDB;

Original Query

SELECT
    m.id,
    m.name,
    COUNT(DISTINCT o.id)   AS order_count,
    COUNT(DISTINCT o.user_id) AS user_count,
    SUM(o.amount)          AS total_amount,
    AVG(o.amount)          AS avg_amount,
    SUM(oi.quantity)       AS total_items
FROM merchants m
LEFT JOIN orders o   ON m.id = o.merchant_id
LEFT JOIN order_items oi ON o.id = oi.order_id
WHERE m.city = '北京'
  AND m.level = 3
  AND o.status = 1
  AND o.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
GROUP BY m.id, m.name
ORDER BY total_amount DESC
LIMIT 100;

Problem Diagnosis

1. EXPLAIN Analysis

+----+-------------+-------+------+---------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows    | Extra                                        |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------------------------------------+
|  1 | SIMPLE      | m     | ALL  | NULL          | NULL | NULL    | NULL | 100000  | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | o     | ALL  | NULL          | NULL | NULL    | NULL | 5000000 | Using where; Using join buffer               |
|  1 | SIMPLE      | oi    | ALL  | NULL          | NULL | NULL    | NULL |20000000 | Using where; Using join buffer               |
+----+-------------+-------+------+---------------+------+---------+------+---------+----------------------------------------------+

All three tables performed full scans (type=ALL).

No indexes were used (key=NULL).

Temporary tables and filesort were created.

Resulted in a Cartesian‑product‑like row count of 100 k × 5 M × 20 M.

2. Slow‑Query Log

# Time: 2024-03-15T03:25:41.123456Z
# Query_time: 20.342387  Lock_time: 0.000123  Rows_sent: 100  Rows_examined: 25438921

Only 100 rows were returned after scanning 25 M rows, a 254 389× inefficiency.

3. Profiling

SET profiling = 1;
-- run the query
SHOW PROFILE;

Sending data: 18.5 s (90%).

Creating sort index: 1.2 s (6%).

Copying to temporary table: 0.6 s (3%).

Optimization Strategy (Four‑Step Plan)

Step 1 – Add Essential Indexes

-- Merchant table index
ALTER TABLE merchants ADD INDEX idx_city_level (city, level);

-- Orders composite index (order matters)
ALTER TABLE orders ADD INDEX idx_merchant_status_created (merchant_id, status, created_at);

-- Order items index
ALTER TABLE order_items ADD INDEX idx_order_id (order_id);

Index design principles:

Follow the left‑most prefix rule.

Place highly selective columns first.

Consider both WHERE and JOIN conditions.

Step 2 – Rewrite SQL (Filter First, Join Later)

SELECT
    m.id,
    m.name,
    t.order_count,
    t.user_count,
    t.total_amount,
    t.avg_amount,
    t.total_items
FROM merchants m
INNER JOIN (
    SELECT
        o.merchant_id,
        COUNT(DISTINCT o.id)   AS order_count,
        COUNT(DISTINCT o.user_id) AS user_count,
        SUM(o.amount)          AS total_amount,
        AVG(o.amount)          AS avg_amount,
        SUM(items.item_count) AS total_items
    FROM orders o
    LEFT JOIN (
        SELECT order_id, SUM(quantity) AS item_count
        FROM order_items
        GROUP BY order_id
    ) items ON o.id = items.order_id
    WHERE o.status = 1
      AND o.created_at >= DATE_SUB(NOW(), INTERVAL 30 DAY)
    GROUP BY o.merchant_id
) t ON m.id = t.merchant_id
WHERE m.city = '北京'
  AND m.level = 3
ORDER BY t.total_amount DESC
LIMIT 100;

This reduces the intermediate result set dramatically by applying the date and status filters before the joins.

Step 3 – Use Covering Index

-- Covering index that includes all selected columns
ALTER TABLE orders ADD INDEX idx_covering (merchant_id, status, created_at, id, user_id, amount);

The query can now be satisfied entirely from the index without a table lookup.

Step 4 – Materialized View (Pre‑Aggregated Summary Table)

-- Summary table
CREATE TABLE merchant_order_summary (
    merchant_id INT,
    summary_date DATE,
    order_count INT,
    user_count INT,
    total_amount DECIMAL(10,2),
    avg_amount DECIMAL(10,2),
    total_items INT,
    PRIMARY KEY (merchant_id, summary_date),
    INDEX idx_date (summary_date)
) ENGINE=InnoDB;

-- Hourly refresh (example)
INSERT INTO merchant_order_summary
SELECT
    merchant_id,
    DATE(created_at) AS summary_date,
    COUNT(DISTINCT id) AS order_count,
    COUNT(DISTINCT user_id) AS user_count,
    SUM(amount) AS total_amount,
    AVG(amount) AS avg_amount,
    (SELECT SUM(quantity) FROM order_items WHERE order_id IN (SELECT id FROM orders WHERE merchant_id = o.merchant_id)) AS total_items
FROM orders o
WHERE status = 1 AND created_at >= CURDATE()
GROUP BY merchant_id, DATE(created_at)
ON DUPLICATE KEY UPDATE
    order_count = VALUES(order_count),
    user_count = VALUES(user_count),
    total_amount = VALUES(total_amount),
    avg_amount = VALUES(avg_amount),
    total_items = VALUES(total_items);

-- Query the summary
SELECT
    m.id,
    m.name,
    SUM(s.order_count)   AS order_count,
    SUM(s.user_count)    AS user_count,
    SUM(s.total_amount)  AS total_amount,
    AVG(s.avg_amount)    AS avg_amount,
    SUM(s.total_items)  AS total_items
FROM merchants m
INNER JOIN merchant_order_summary s ON m.id = s.merchant_id
WHERE m.city = '北京' AND m.level = 3 AND s.summary_date >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
GROUP BY m.id, m.name
ORDER BY total_amount DESC
LIMIT 100;

Performance Gains

Original query: 20.34 s (no index).

After adding indexes: 8.5 s (≈2.4× faster).

After SQL rewrite: 2.3 s (≈8.8× faster).

After covering index: 0.8 s (≈25× faster).

After materialized view: 0.2 s (≈100× faster).

General Optimization Methodology

Diagnostic Three‑Step

EXPLAIN analysis – check type, key, and Extra fields.

Slow‑query log – enable and examine query_time, rows_examined, etc.

Profiling – use SET profiling=1 and SHOW PROFILE to locate bottlenecks.

Six‑Step Optimization Framework

Index optimization – create selective, left‑most, covering indexes.

SQL rewrite – filter before join, avoid SELECT *, use sub‑queries wisely.

Table structure – consider denormalization, proper data types, partitioning.

Cache strategy – use Redis or application‑level cache for hot data.

Read/write splitting – master‑slave replication and load balancing.

Data archiving – move historical data to cold storage.

Common Pitfalls

Over‑indexing – creates write overhead; prefer composite indexes.

Ignoring write performance – balance read‑optimizing indexes with write cost.

Cache stampede – randomize TTL to avoid massive simultaneous expirations.

Monitoring and Prevention

Slow‑Query Monitoring View

CREATE VIEW slow_query_monitor AS
SELECT
    DATE(start_time) AS query_date,
    LEFT(sql_text,100) AS query_sample,
    COUNT(*) AS exec_count,
    AVG(query_time) AS avg_time,
    MAX(query_time) AS max_time,
    SUM(rows_examined) AS total_rows_examined
FROM mysql.slow_log
GROUP BY DATE(start_time), LEFT(sql_text,100)
ORDER BY avg_time DESC;

Automated Alert Script (Python)

#!/usr/bin/env python3
import MySQLdb, smtplib
from email.mime.text import MIMEText

def check_slow_queries():
    db = MySQLdb.connect(host="localhost", user="monitor", passwd="password", db="mysql")
    cur = db.cursor()
    cur.execute("""
        SELECT COUNT(*) AS slow_count, AVG(query_time) AS avg_time
        FROM mysql.slow_log
        WHERE start_time >= DATE_SUB(NOW(), INTERVAL 1 HOUR)
    """)
    slow_count, avg_time = cur.fetchone()
    if slow_count > 100 or avg_time > 5:
        send_alert(f"Slow query alert: count={slow_count}, avg_time={avg_time}s")
    cur.close(); db.close()

def send_alert(message):
    msg = MIMEText(message)
    msg['Subject'] = 'MySQL Slow Query Alert'
    msg['From'] = '[email protected]'
    msg['To'] = '[email protected]'
    s = smtplib.SMTP('localhost')
    s.send_message(msg)
    s.quit()

if __name__ == "__main__":
    check_slow_queries()

Quick Diagnostic Bash Script

#!/bin/bash
echo "=== MySQL Performance Quick Check ==="

echo -e "
[1] Slow Query Settings:"
mysql -e "SHOW VARIABLES LIKE '%slow%';"

echo -e "
[2] Recent Slow Queries:"
mysql -e "SELECT * FROM mysql.slow_log ORDER BY start_time DESC LIMIT 5\G"

echo -e "
[3] Index Usage Stats:"
mysql -e "SELECT table_schema, table_name, index_name, cardinality FROM information_schema.STATISTICS WHERE table_schema NOT IN ('mysql','information_schema','performance_schema') ORDER BY cardinality DESC LIMIT 10;"

echo -e "
[4] Table Sizes:"
mysql -e "SELECT table_schema, table_name, ROUND(data_length/1024/1024,2) AS Data_MB, ROUND(index_length/1024/1024,2) AS Index_MB, ROUND((data_length+index_length)/1024/1024,2) AS Total_MB FROM information_schema.TABLES WHERE table_schema NOT IN ('mysql','information_schema','performance_schema') ORDER BY data_length+index_length DESC LIMIT 10;"

echo -e "
[5] Current Connections:"
mysql -e "SHOW STATUS LIKE '%connect%';"

echo "=== Check Complete ==="

Toolbox

MySQL built‑in tools: EXPLAIN, SHOW PROFILE, Performance Schema.

Third‑party: Percona Toolkit pt‑query‑digest, MySQLTuner, MySQL Workbench, Prometheus + Grafana.

Online analysis: mysqladmin processlist, INFORMATION_SCHEMA.INNODB_LOCKS, INFORMATION_SCHEMA.INNODB_TRX.

Best‑Practice Takeaways

Measure before you change – use logs and profiling to establish baselines.

Apply the 80/20 rule – focus on the few queries that consume most resources.

Iterative optimization – change one thing at a time, record impact, keep rollback plans.

Team Collaboration Advice

SQL Review Process – checklist for indexes, full scans, join count, SELECT *, sub‑query vs JOIN, growth considerations.

Development Standards – naming conventions, index naming, SQL style guide.

Knowledge Sharing – regular tech talks, maintain an internal case‑library, document patterns in a wiki.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

mysqlslow-querySQL Tuning
Raymond Ops
Written by

Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.