Databases 26 min read

Overview of MySQL and InnoDB Storage Engine Architecture

This article provides a comprehensive overview of MySQL, detailing its configuration file search order, component architecture, various storage engines such as MyISAM, NDB, Memory, and an in‑depth examination of InnoDB’s internal structures, memory management, background threads, LRU handling, redo log buffering, and checkpoint mechanisms.

政采云技术
政采云技术
政采云技术
Overview of MySQL and InnoDB Storage Engine Architecture

MySQL Overview

When a MySQL instance starts, it reads configuration files in a defined order; if none are found, default compile‑time parameters are used. The command mysql --help | grep my.cnf shows the locations searched.

Typical configuration file paths examined in order are /etc/my.cnf , /etc/mysql/my.cnf , /usr/local/mysql/etc/my.cnf , and ~/my.cnf . If the same option appears in multiple files, the last one read takes precedence.

MySQL consists of several components: connection‑pool, management services and tools, SQL interface, query parser, optimizer, cache, pluggable storage engines, and physical files.

The most distinctive feature of MySQL is its pluggable storage‑engine architecture, where the engine is associated with a table rather than the whole database.

Pluggable Engines

MyISAM

MyISAM does not support transactions, uses table‑level locking, and supports full‑text indexes. Prior to MySQL 5.5.8 it was the default engine (except on Windows). Its cache stores only index files.

MyISAM tables consist of .MYD (data) and .MYI (index) files. The myisampack utility can compress data files using static Huffman coding, producing read‑only tables.

Before MySQL 5.0 the default maximum table size was 4 GB; from 5.0 onward the default limit is 256 TB.

NDB Cluster Engine

NDB implements a “share‑nothing” cluster architecture, providing high availability. Data resides primarily in memory (non‑index data can be stored on disk from MySQL 5.1), giving extremely fast primary‑key lookups. Adding NDB data nodes scales performance linearly.

Joins are performed in the MySQL server layer rather than in the engine, which can cause high network overhead and slower query performance for complex joins.

Memory Engine

Formerly called HEAP, the Memory engine stores table data in RAM; data is lost on server restart. It is suitable for temporary tables and dimension tables in data warehouses. It uses hash indexes by default.

Limitations include table‑level locking, poor concurrency, and lack of support for TEXT and BLOB columns. Variable‑length columns are stored as fixed‑length, wasting memory.

MySQL uses the Memory engine for internal temporary tables; if a temporary table exceeds the Memory engine’s capacity or contains TEXT/BLOB columns, it is automatically converted to a MyISAM table on disk.

Comparing Storage Engines

The SHOW ENGINES statement or the information_schema.ENGINES table can be used to list supported engines.

InnoDB Storage Engine

InnoDB Architecture

InnoDB maintains several memory pools that together form a large buffer pool, responsible for internal data structures, disk‑page caching, redo‑log buffering, and other tasks.

Background Threads

Master Thread

The Master Thread asynchronously flushes dirty pages, merges the insert buffer, recycles UNDO pages, and performs other housekeeping.

IO Thread

IO Threads handle asynchronous I/O callbacks. MySQL 5.0 and later provide configurable numbers of read and write IO threads via innodb_read_io_threads and innodb_write_io_threads .

Purge Thread

After a transaction commits, its UNDO log may be reclaimed by a Purge Thread. Since MySQL 1.1, purge work can run in dedicated threads, improving CPU utilization.

Enabling a dedicated purge thread can be done with:

[mysqld]
innodb_purge_threads=1

Page Cleaner Thread

Introduced in InnoDB 1.2.x, this thread isolates dirty‑page flushing from the Master Thread, reducing query‑thread blocking.

Memory Management

Buffer Pool

The buffer pool caches disk pages in RAM to bridge the CPU‑disk speed gap. Pages are “fixed” in the pool when read, and subsequent reads hit the pool if the page is present.

Modifications are made to pages in the buffer pool and later flushed to disk via a checkpoint mechanism. The pool size is set with innodb_buffer_pool_size ; an example shows a 15 GB pool.

Pages cached include index, data, UNDO, insert buffer, adaptive hash index, lock information, and dictionary data.

Multiple buffer‑pool instances can be created (default 1) and are configured with innodb_buffer_pool_instances .

LRU, Free, and Flush Lists

InnoDB uses an LRU algorithm with a midpoint insertion strategy to manage page recency. The innodb_old_blocks_pct variable controls the midpoint position (default 37%).

The innodb_old_blocks_time variable defines how long a page must stay after the midpoint before becoming “young”. Adjusting these values can reduce hot‑page eviction during large scans.

When the buffer pool is empty, pages are allocated from the Free list; otherwise, the LRU tail is evicted. Dirty pages reside in both the LRU and Flush lists; the Flush list tracks pages that must be written back.

Various SHOW ENGINE INNODB STATUS and information_schema tables (e.g., INNODB_BUFFER_POOL_STATS , INNODB_BUFFER_PAGE_LRU ) expose buffer‑pool statistics.

Compressed Pages and Unzip LRU

Since InnoDB 1.0, pages can be compressed to 1 KB, 2 KB, 4 KB, or 8 KB. Compressed pages are managed in an unzip_LRU list, with allocation performed via a buddy‑system algorithm.

Redo Log Buffer

The redo‑log buffer temporarily holds log records before they are flushed to the redo‑log files. Its size is controlled by innodb_log_buffer_size (default 8 MB) and is typically flushed every second, on transaction commit, or when the buffer is half full.

Additional Memory Pool

An extra memory pool supplies memory for internal control structures (e.g., buffer‑control blocks). Large buffer‑pool configurations may require increasing this pool.

Checkpoint Technology

Checkpoints periodically flush dirty pages from the buffer pool to disk, reducing recovery time after a crash. Two checkpoint types exist: Sharp (full flush on shutdown) and Fuzzy (partial flush during normal operation).

Fuzzy checkpoints are triggered by the Master Thread, the FLUSH_LRU_List, async/sync flush conditions, or when the dirty‑page percentage exceeds innodb_max_dirty_pages_pct (default 75%).

Parameters such as innodb_fast_shutdown , innodb_lru_scan_depth , and innodb_max_dirty_pages_pct control checkpoint behavior.

References

MySQL Technical Insights – InnoDB Storage Engine, 2nd Edition

Storage Enginedatabase architectureInnoDBMySQLCheckpointbuffer pool
政采云技术
Written by

政采云技术

ZCY Technology Team (Zero), based in Hangzhou, is a growth-oriented team passionate about technology and craftsmanship. With around 500 members, we are building comprehensive engineering, project management, and talent development systems. We are committed to innovation and creating a cloud service ecosystem for government and enterprise procurement. We look forward to your joining us.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.