Databases 12 min read

Investigation of MySQL Slave SQL Thread IO Spikes Caused by Concurrent InnoDB Tablespace Extensions

This article analyzes intermittent IO write spikes in a MySQL slave instance, tracing them to concurrent InnoDB tablespace extensions triggered by large BLOB inserts across thousands of tables, and proposes mitigation strategies.

Tencent Database Technology
Tencent Database Technology
Tencent Database Technology
Investigation of MySQL Slave SQL Thread IO Spikes Caused by Concurrent InnoDB Tablespace Extensions

Problem description: A business CDB instance exhibits daily IO write spikes between 00:07:00 and 00:08:00 on both master and slave machines, confirmed by monitoring.

Analysis steps: Using iotop on the slave, the heavy write thread is identified as thread 145378, belonging to the mysqld process and specifically the SQL thread responsible for single‑threaded relay log replay.

Thread 85 (Thread 0x7f68c4c4c700 (LWP 145378)): #0  0x00007fa2badd3945 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0 #4  exec_relay_log_event (rli=0x1771c43c8, thd=0x7f68b0000990)   #5  handle_slave_sql (arg=0x1771c3010)

Further investigation rules out binlog writes (log_slave_updates=OFF), redo/undo logging (by disabling redo log), relay log purge, and user‑level page flush; the IO persists.

Sys schema output shows massive redo log file write counts but low per‑time‑unit values, indicating the spike is not from redo logging.

/data1/mysql_root/log/20120/ib_logfile0 5       4.00 KiB        819 bytes       7439952   57.47 GiB       8.10 KiB        57.47 GiB       100.00 /data1/mysql_root/log/20120/ib_logfile1 2       64.50 KiB       32.25 KiB       3025254   23.39 GiB       8.11 KiB        23.39 GiB       100.00

Disabling redo log writes and observing unchanged IO confirms redo is not the cause; undo and ibdata growth are also negligible.

Examining the table schema reveals over 10,000 tables, each with a mediumblob or blob column, no secondary indexes, and insert‑heavy workload.

The hypothesis: large BLOB inserts cause page splits, triggering InnoDB to extend the tablespace (ibd file) by allocating new extents (each extent = 1 MB, 4 extents = 4 MB per extend). With thousands of tables extending concurrently, the cumulative write volume spikes.

#0  os_aio_func (type=..., mode=mode@entry=24,    name=0x7ee4493f97b8 "./DB_xxx/xxx_36.ibd", file=..., buf=buf@entry=0x7ee453ed4000, offset=offset@entry=122683392, n=n@entry=1048576, read_only=read_only@entry=false, m1=m1@entry=0x0, m2=m2@entry=0x0) #1  0x0000000001187922 in fil_write_zeros (node=0x7ee4493f5338, node=0x7ee4493f5338, read_only_mode=<optimized out>,    len=4194304, start=121634816, page_size=<optimized out>) #2  fil_space_extend (space=space@entry=0x7ee448d8dbd8, size=<optimized out>) #3  0x00000000007701d1 in fsp_try_extend_data_file (space=space@entry=0x7ee448d8dbd8, header=header@entry=0x7ef4a4ff0026 "", #4  0x000000000118f0df in fsp_reserve_free_extents (n_reserved=n_reserved@entry=0x7f20868f4fa0, space_id=9054, n_ext=3,    alloc_type=alloc_type@entry=FSP_NORMAL, mtr=mtr@entry=0x7f20868f5890, n_pages=n_pages@entry=2) #5  0x00000000010e6394 in btr_cur_pessimistic_insert (flags=flags@entry=0, cursor=cursor@entry=0x7f20868f5150,    offsets=offsets@entry=0x7f20868f50b0, heap=heap@entry=0x7f20868f50a0, entry=entry@entry=0x7ee453143488,    rec=rec@entry=0x7f20868f5570, big_rec=big_rec@entry=0x7f20868f5090, n_ext=n_ext@entry=0, thr=thr@entry=0x7ee4544623f0,    mtr=mtr@entry=0x7f20868f5890)

Each extend operation writes 4 MB; if ~500 tables extend in the same second, ~2 GB of write IO is generated, matching the observed spikes.

Verification: Adding a custom InnoDB log that warns when 100 MB of extension occurs within one second shows frequent extensions of 0.1‑0.2 seconds to reach 100 MB, correlating with the IO spikes captured by iotop.

Conclusion: The IO write spikes are caused by concurrent InnoDB tablespace extensions due to large BLOB inserts across many tables, not by traditional logging or flush mechanisms.

Mitigation suggestions: (1) Reduce the number of tables or merge them to lower concurrent extension pressure; (2) Improve MySQL’s tablespace allocation algorithm to reduce concurrent extends; (3) Allow users to pre‑allocate or preset initial tablespace sizes to avoid dynamic extensions during peak loads.

Performance TuningInnoDBmysqlblobIO spikeslave SQL threadtablespace extension
Tencent Database Technology
Written by

Tencent Database Technology

Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.