
Understanding MySQL 5.7 Crash Recovery and Table‑Space Validation Overhead

This article investigates why MySQL 5.7 crash recovery can take hours on a development instance with thousands of tables, detailing stack‑frame analysis, GDB debugging, InnoDB source inspection of validation logic, performance testing, and methods to bypass the table‑space verification step.

Aikesheng Open Source Community

When a MySQL 5.7 development instance with about 1,500 databases and tens of thousands of tables crashes, the recovery process can stall for two hours, prompting the question of what MySQL is doing during that time.

Stack‑frame analysis using pstack and pt‑pmp revealed a call chain that runs from pread64 at the bottom, up through InnoDB functions such as os_file_io and Datafile::validate_first_page, to innobase_start_or_create_for_mysql at the top. The pattern suggested that MySQL validates each tablespace file during crash recovery.

GDB debugging compared a normal startup with a crash‑recovery startup. In a normal start the function Datafile::validate_to_dd is skipped, while during crash recovery the debugger stops repeatedly inside this function, indicating a loop that iterates over all tablespaces.

Source inspection showed that the fil_ibd_open function calls Datafile::validate_to_dd only when a validate flag is true. This flag is set in innobase_start_or_create_for_mysql based on two variables: recv_needed_recovery and srv_force_recovery.

The flag is computed as:

bool validate = recv_needed_recovery && srv_force_recovery == 0;

After a clean shutdown, recv_needed_recovery stays false, so validate is false and the tablespace check is skipped. After an abnormal termination the log sequence numbers no longer match at startup, recv_needed_recovery is set to true, and, provided innodb_force_recovery is 0, the validation runs.
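The decision can be exercised in isolation. The following is a minimal sketch, not InnoDB code: should_validate is a hypothetical helper that mirrors the expression above, with the two server globals passed in as plain arguments.

```cpp
// Hypothetical stand-in for the expression computed in
// innobase_start_or_create_for_mysql. In the server, recv_needed_recovery
// and srv_force_recovery are globals; here they are parameters so the
// logic can be checked on its own.
bool should_validate(bool recv_needed_recovery,
                     unsigned long srv_force_recovery) {
    // Validation runs only when crash recovery is needed AND the operator
    // has not set innodb_force_recovery to a non-zero value.
    return recv_needed_recovery && srv_force_recovery == 0;
}
```

Of the four input combinations, only should_validate(true, 0) returns true: a crash‑recovery start with innodb_force_recovery left at 0.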

The function dict_check_tablespaces_and_store_max_id then walks through every tablespace recorded in the data dictionary, opening each .ibd file and reading its first page to verify consistency.

Performance testing, with sysbench used to create 500,000 empty tables, confirmed that recovery time grows linearly with the number of tables and also depends on disk IOPS. MySQL 8 introduces multithreaded scanning when there are more than 50,000 tables, and a parameter, innodb_validate_tablespace_paths, to disable the validation on normal startup.

The validation can be skipped by setting innodb_force_recovery to a non‑zero value (which makes srv_force_recovery non‑zero and therefore validate false), or by using a shared tablespace instead of per‑table tablespaces, which removes the need to open thousands of .ibd files. A temporary GDB hack that flips the validate variable works only in debug builds and is not practical for production.

References:
https://dev.mysql.com/worklog/task/?id=7142
http://blog.symedia.pl/2015/11/mysql-56-and-57-crash-recovery.html
https://www.percona.com/community-blog/2019/07/23/impact-of-innodb_file_per_table-option-on-crash-recovery-time/
https://jira.mariadb.org/browse/MDEV-18733
