Databases 14 min read

Root Cause Analysis and Optimization of Slow SQL in Meituan‑Dianping MySQL Thread‑Pool

This article presents a detailed investigation of intermittent slow SQL queries observed in Meituan‑Dianping’s services, identifies the MySQL 5.6 thread‑pool as the primary cause, and demonstrates how adjusting thread‑pool parameters dramatically reduced slow‑query counts and improved overall service availability.

Qunar Tech Salon
Qunar Tech Salon
Qunar Tech Salon
Root Cause Analysis and Optimization of Slow SQL in Meituan‑Dianping MySQL Thread‑Pool

Author Tian Dongxue, a database architect at Meituan‑Dianping, shares a real‑world case where occasional slow SQL queries caused timeouts and degraded QoS for a large internet‑plus life‑service platform.

Problem description : Developers reported high SQL latency (99th‑percentile exceeding 100 ms) despite the queries being simple primary‑key lookups on small tables, with no entries in the MySQL slow‑log.

Monitoring data from the CAT system showed many queries with response times over 100 ms, some reaching 929 ms, and a total of 3,788 slow queries on a single day.

Analysis : By capturing timestamps at four points (APP server send, MySQL receive, MySQL send, APP receive) using TcpDump, the majority of latency was traced to the MySQL server itself (≈606 ms). Further investigation revealed that the slow queries were not logged, indicating the delay occurred before the storage engine phase, likely in the thread‑handling layer.

MySQL 5.6 introduced a thread‑pool mechanism. Examination of thread‑related parameters (thread_cache_size, thread_pool_* etc.) suggested the thread‑pool was the culprit.

Solution iterations :

Disable the thread‑pool (thread_handling=one‑connection‑per‑thread) – reduced slow queries from 3,788 to 818.

Fine‑tune thread‑pool parameters, especially thread_pool_stall_limit , lowering it to 10 ms – further reduced slow queries to 63.

These adjustments eliminated over 98 % of the slow SQL incidents and restored service latency to sub‑10 ms levels.

Results :

State

Slow SQL count

Remarks

Before optimization

3788

After disabling thread‑pool

818

After setting thread_pool_stall_limit=10

63

The findings demonstrate a systematic approach to diagnosing end‑to‑end database performance issues: isolate the latency source, use exclusion methods, and adjust low‑level MySQL parameters.

Takeaways : Properly configuring MySQL thread‑pool settings is critical for high‑concurrency environments; monitoring tools and packet captures are essential for pinpointing bottlenecks.

References :

https://my.oschina.net/andylucc/blog/820624

https://yq.aliyun.com/articles/41078

http://blog.chinaunix.net/uid-28364803-id-3427833.html

http://blog.chinaunix.net/uid-28364803-id-3431242.html

https://www.percona.com/doc/percona-server/5.6/performance/threadpool.html

https://mariadb.com/kb/en/mariadb/thread-pool-in-mariadb/

https://www.safaribooksonline.com/library/view/high-performance-mysql/9781449332471/ch01.html#mysqlas_logical_architecture

performance tuningMySQLDatabase Optimizationthread poolSlow SQL
Qunar Tech Salon
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.