How to Turn Slow MySQL Queries into Millisecond Responses: Real‑World Optimization Case
This article walks through a real e‑commerce MySQL performance crisis, showing how to pinpoint bottlenecks, analyze slow‑query logs, use EXPLAIN, add composite indexes, rewrite SQL, apply partitioning, read/write splitting and caching, and achieve sub‑second response times with 99% CPU reduction.
Case Background: E‑commerce performance crisis
During a Double‑11 sale the order‑query API responded in 15‑30 seconds, CPU usage stayed above 90%, slow‑query log showed >300 entries per minute and user complaints surged 500%.
Step 1 – Identify bottlenecks
1.1 System monitoring
# Show process list
mysql> SHOW PROCESSLIST;
# Show slow‑query variables
mysql> SHOW VARIABLES LIKE 'slow_query%';
mysql> SHOW VARIABLES LIKE 'long_query_time';
# InnoDB status
mysql> SHOW ENGINE INNODB STATUS\GKey findings :
Active connections 512/800 (near limit)
Average query time 12.5 s
Frequent lock waits
1.2 Slow‑query log analysis
# Top 10 slowest queries
mysqldumpslow -s t -t 10 /var/log/mysql/mysql-slow.log
# Most frequent queries
mysqldumpslow -s c -t 10 /var/log/mysql/mysql-slow.logThe main offending query (redacted) performed a full‑table scan on orders and lacked suitable indexes, causing a filesort and scanning nearly 3 million rows.
Step 2 – EXPLAIN deep dive
EXPLAIN SELECT ... FROM orders o LEFT JOIN users u ON o.user_id=u.id ... WHERE o.create_time>='2023-11-01' AND o.status IN (1,2,3,4,5) ORDER BY o.create_time DESC LIMIT 20;Problems discovered : orders scanned with type=ALL
No covering index for the WHERE clause
Filesort used
~2.8 million rows examined
2.2 Index status
Existing indexes:
PRIMARY KEY (id)
KEY idx_user_id (user_id)
Missing indexes on create_time, status, and a composite index covering the query.
Step 3 – SQL optimisation
3.1 Composite index
# Create composite index (order matters)
ALTER TABLE orders ADD INDEX idx_status_createtime_id (status, create_time, id);
-- Reasoning:
-- 1. status used in WHERE (low cardinality)
-- 2. create_time is a range filter
-- 3. id allows ORDER BY to use index order, avoiding filesort3.2 Rewritten SQL (pagination optimisation)
-- Version 1: pagination with index‑aware subquery
SELECT o.*, u.username, p.product_name, p.price
FROM (
SELECT id FROM orders
WHERE create_time>='2023-11-01' AND status IN (1,2,3,4,5)
ORDER BY create_time DESC, id DESC
LIMIT 20
) o
LEFT JOIN users u ON o.user_id=u.id
LEFT JOIN order_items oi ON o.id=oi.order_id
LEFT JOIN products p ON oi.product_id=p.id
ORDER BY o.create_time DESC, o.id DESC;3.3 Delayed join version
SELECT o.id, o.user_id, o.total_amount, o.status, o.create_time,
u.username, p.product_name, p.price
FROM (
SELECT id, user_id, total_amount, status, create_time
FROM orders
WHERE create_time>='2023-11-01' AND status IN (1,2,3,4,5)
ORDER BY create_time DESC, id DESC
LIMIT 20
) o
LEFT JOIN users u ON o.user_id=u.id
LEFT JOIN order_items oi ON o.id=oi.order_id
LEFT JOIN products p ON oi.product_id=p.id;Performance tests showed the original query took 18.5 s, scanning 2.8 M rows with 85 % CPU. Adding the composite index reduced execution to 2.1 s (24 k rows, 45 % CPU). The delayed‑join version achieved 0.08 s, 20 rows scanned and only 15 % CPU – a 230‑fold speedup.
Step 4 – Advanced strategies
4.1 Partitioned table
CREATE TABLE orders_partitioned (
id BIGINT PRIMARY KEY,
user_id INT NOT NULL,
total_amount DECIMAL(10,2),
status TINYINT,
create_time DATETIME
) PARTITION BY RANGE (YEAR(create_time)*100 + MONTH(create_time)) (
PARTITION p202310 VALUES LESS THAN (202311),
PARTITION p202311 VALUES LESS THAN (202312),
PARTITION p202312 VALUES LESS THAN (202401),
PARTITION p_future VALUES LESS THAN MAXVALUE
);4.2 Read‑write splitting (Python example)
class DatabaseRouter:
def __init__(self):
self.master = get_master_connection()
self.slaves = get_slave_connections()
def execute_query(self, sql, is_write=False):
if is_write or self.is_write_operation(sql):
return self.master.execute(sql)
else:
slave = random.choice(self.slaves)
return slave.execute(sql)
def is_write_operation(self, sql):
write_keywords = ['INSERT', 'UPDATE', 'DELETE', 'ALTER']
return any(keyword in sql.upper() for keyword in write_keywords)4.3 Redis cache for order lists
class OrderCacheManager:
def __init__(self):
self.redis_client = redis.Redis(host='localhost', port=6379, db=0)
self.cache_ttl = 300 # 5 minutes
def get_orders(self, user_id, page=1, size=20):
cache_key = f"orders:{user_id}:{page}:{size}"
cached = self.redis_client.get(cache_key)
if cached:
return json.loads(cached)
orders = self.query_from_database(user_id, page, size)
self.redis_client.setex(cache_key, self.cache_ttl, json.dumps(orders, default=str))
return ordersStep 5 – Monitoring & alerting
# Prometheus + Grafana metrics (mysql_exporter)
mysql_global_status_slow_queries
mysql_global_status_threads_connected / mysql_global_variables_max_connections
rate(mysql_global_status_queries[5m])
mysql_info_schema_innodb_metrics_lock_timeoutsA Bash script can automatically alert when slow‑query count exceeds a threshold:
#!/bin/bash
slow_queries=$(mysql -e "SHOW GLOBAL STATUS LIKE 'Slow_queries';" | awk 'NR==2{print $2}')
if [ $slow_queries -gt 100 ]; then
echo "大量慢查询,开始分析..."
mysqldumpslow -s t -t 5 /var/log/mysql/mysql-slow.log > /tmp/slow_analysis.log
mail -s "数据库慢查询告警" [email protected] < /tmp/slow_analysis.log
fiResult summary
After applying the index, partitioning, read/write splitting and caching, average response time dropped from 18.5 s to 0.08 s (99.6 % reduction), CPU usage fell from >90 % to ~15 %, and slow‑query rate fell below 5 per minute. User satisfaction rose from 60 % to 95 %.
All repository links mentioned are technical references: GitHub: https://github.com/raymond999999 Gitee: https://gitee.com/raymond9
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
Raymond Ops
Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
