
Optimizing Xtrabackup Recovery Process for InnoDB Databases

Xtrabackup is an open-source hot backup tool for InnoDB and XtraDB databases, offering non-blocking backups with features like fast backup speeds, reliable physical backups, and efficient disk space usage. The recovery process involves complex log parsing and page flushing mechanisms, which can be optimized to improve performance, especially for large datasets.

Tencent Database Technology

May 9, 2018


Xtrabackup is an open-source hot backup tool developed by Percona for InnoDB and XtraDB databases. It takes non-blocking backups and offers fast backup speeds, reliable physical backups, and efficient disk space usage through compression. A backup starts with a background thread that watches the MySQL redo log and copies new entries as they are written, while the main process copies the InnoDB data files and system tablespace files. Once that copy finishes, Xtrabackup takes a global read lock (FLUSH TABLES WITH READ LOCK), copies the remaining files such as MyISAM tables, releases the lock, and stops the background log copying.

Recovery starts an embedded InnoDB instance that replays the redo log captured by Xtrabackup, applies the changes of committed transactions to the InnoDB data files and tablespaces, and rolls back uncommitted transactions. This is essentially the same procedure as InnoDB crash recovery. Incremental backups follow the same flow as full backups, but they are incremental only for InnoDB; MyISAM and other storage engines are copied in full every time.
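The replay-then-rollback idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `LogRecord` type, its fields, and the dictionary standing in for data pages are hypothetical and bear no relation to InnoDB's real redo and undo formats.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    """Hypothetical, simplified log record (not InnoDB's real format)."""
    txn_id: int
    kind: str                  # "update" or "commit"
    key: str = ""
    old_value: object = None   # before-image, stands in for undo information
    new_value: object = None


def recover(pages: dict, log: list) -> dict:
    """Replay every logged change, then undo the uncommitted ones."""
    committed = set()
    applied = []  # updates in the order they were replayed

    # Phase 1: roll forward through the whole log.
    for rec in log:
        if rec.kind == "commit":
            committed.add(rec.txn_id)
        elif rec.kind == "update":
            pages[rec.key] = rec.new_value
            applied.append(rec)

    # Phase 2: roll back, newest first, anything whose transaction
    # never reached a commit record.
    for rec in reversed(applied):
        if rec.txn_id not in committed:
            pages[rec.key] = rec.old_value
    return pages


log = [
    LogRecord(1, "update", "a", old_value=1, new_value=10),
    LogRecord(2, "update", "b", old_value=2, new_value=20),
    LogRecord(1, "commit"),
    # Transaction 2 never commits, so its update must be undone.
]
state = recover({"a": 1, "b": 2}, log)
print(state)
```

In this run transaction 1 keeps its change to `a`, while the change transaction 2 made to `b` is reverted to its before-image.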

The recovery process can be optimized in several ways. Log parsing can be improved by adding length information to log record headers and by introducing a metadata cache, both of which reduce the number of malloc and free calls made while parsing. Parsing can be sped up further by dividing the log into segments that each contain only complete records and processing those segments concurrently.
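A minimal sketch of why a length field helps with segmenting: once every header carries the body length, record boundaries can be found by reading headers alone, so the log can be cut only at points where a record ends. The three-byte header layout (`type`, `length`) and all function names here are assumptions made for illustration, not the InnoDB redo format or the authors' implementation. Python threads are used only to show the structure; because of CPython's global interpreter lock they would not actually speed up this CPU-bound loop.

```python
import struct
from concurrent.futures import ThreadPoolExecutor

# Hypothetical header: 1-byte record type, 2-byte body length.
HEADER = struct.Struct("<BH")


def encode(rec_type: int, body: bytes) -> bytes:
    return HEADER.pack(rec_type, len(body)) + body


def record_spans(buf: bytes) -> list:
    """Return (start, end) of each complete record, reading headers only."""
    spans, pos = [], 0
    while pos + HEADER.size <= len(buf):
        _, length = HEADER.unpack_from(buf, pos)
        end = pos + HEADER.size + length
        if end > len(buf):
            break  # truncated trailing record: stop at the last complete one
        spans.append((pos, end))
        pos = end
    return spans


def split_segments(buf: bytes, n_segments: int) -> list:
    """Cut the log into at most n_segments pieces on record boundaries."""
    spans = record_spans(buf)
    if not spans:
        return []
    per = -(-len(spans) // n_segments)  # ceiling division
    segments = []
    for i in range(0, len(spans), per):
        last = min(i + per, len(spans)) - 1
        segments.append(buf[spans[i][0]:spans[last][1]])
    return segments


def parse_segment(segment: bytes) -> list:
    """Fully decode one segment into (type, body) tuples."""
    out = []
    for start, end in record_spans(segment):
        rec_type, _ = HEADER.unpack_from(segment, start)
        out.append((rec_type, segment[start + HEADER.size:end]))
    return out


def parse_parallel(buf: bytes, workers: int = 4) -> list:
    segments = split_segments(buf, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parsed = list(pool.map(parse_segment, segments))  # keeps log order
    return [rec for seg in parsed for rec in seg]


log = b"".join(encode(i % 3, bytes([i]) * i) for i in range(10))
records = parse_parallel(log, workers=3)
print(len(records))
```

Because each segment starts and ends on a record boundary, the workers never need to coordinate over a record that straddles two segments, and concatenating their results in segment order reproduces the sequential parse.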

Page flushing during recovery can be optimized by writing dirty pages to the operating system's file cache without calling fsync, leaving the operating system to batch and schedule the actual disk writes. This removes the bottleneck caused by single-page evictions and improves recovery speed.
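The difference between syncing every page and leaving writes in the file cache can be sketched as follows. The `flush_pages` helper and the demo file name are hypothetical; the single fsync at the end of the deferred path is an assumption added here so the data is durable once recovery finishes, and is not something the text above specifies.

```python
import os
import tempfile

PAGE_SIZE = 16 * 1024  # InnoDB's default page size


def flush_pages(path: str, pages: dict, sync_each_page: bool) -> int:
    """Write {page_no: bytes} into path and return the number of fsync calls."""
    fsync_calls = 0
    with open(path, "r+b") as f:
        for page_no, data in sorted(pages.items()):
            f.seek(page_no * PAGE_SIZE)
            f.write(data)
            if sync_each_page:
                # Eager path: force every page to disk before continuing.
                f.flush()
                os.fsync(f.fileno())
                fsync_calls += 1
        if not sync_each_page:
            # Deferred path: the writes above only reached the OS file cache,
            # so the kernel was free to batch them. Sync once at the end
            # (an assumption of this sketch).
            f.flush()
            os.fsync(f.fileno())
            fsync_calls += 1
    return fsync_calls


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.ibd")  # hypothetical file name
    open(path, "wb").close()
    pages = {n: bytes([n]) * PAGE_SIZE for n in range(8)}
    eager = flush_pages(path, pages, sync_each_page=True)
    deferred = flush_pages(path, pages, sync_each_page=False)
    size = os.path.getsize(path)

print(eager, deferred, size)
```

Both paths leave the same bytes in the file; the deferred path simply replaces one synchronous wait per page with a single wait at the end.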

Parallel log parsing and replay can be achieved by treating log parsing as a producer and log replay as a consumer, with memory management adjusted to handle concurrent operations. This involves modifying InnoDB's internal mechanisms to support parallel processing without conflicts.
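The producer and consumer arrangement can be sketched with a bounded queue between a parsing thread and a replay thread. Everything here is illustrative: the `"page:value"` record strings, the function names, and the `max_pending` limit (standing in for the memory budget for parsed but not yet applied records) are assumptions, not InnoDB internals.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the log


def parse_log(raw_records, out_q):
    """Producer: parse raw records and hand them to the replay thread."""
    for raw in raw_records:
        page_no, value = raw.split(":")
        # put() blocks while the queue is full, which caps the memory held
        # by parsed records that have not been applied yet.
        out_q.put((int(page_no), value))
    out_q.put(DONE)


def apply_log(in_q, pages):
    """Consumer: apply parsed records to pages in log order."""
    while True:
        item = in_q.get()
        if item is DONE:
            break
        page_no, value = item
        pages[page_no] = value


def replay_pipeline(raw_records, max_pending=4):
    q = queue.Queue(maxsize=max_pending)
    pages = {}
    producer = threading.Thread(target=parse_log, args=(raw_records, q))
    consumer = threading.Thread(target=apply_log, args=(q, pages))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return pages


result = replay_pipeline(["1:a", "2:b", "1:c", "3:d"])
print(result)
```

With a single consumer the records are applied in log order, so the later write to page 1 wins. Running several consumers would additionally require partitioning records by page so that two threads never touch the same page at once.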

Testing of these optimizations shows significant improvements in recovery time, especially for large datasets. For example, for a 2TB instance with a 20GB log file, recovery time dropped from 4 hours to 10 minutes, roughly a 24-fold speedup.




Written by

Tencent Database Technology

Tencent's Database R&D team supports internal services such as WeChat Pay, WeChat Red Packets, Tencent Advertising, and Tencent Music, and provides external support on Tencent Cloud for TencentDB products like CynosDB, CDB, and TDSQL. This public account aims to promote and share professional database knowledge, growing together with database enthusiasts.
