Databases 12 min read

Unlock MySQL’s Pluggable Storage Engine: Architecture, APIs, and DIY Guide

This article explains MySQL’s logical architecture, details the storage‑engine API and its key components such as connectors, parsers, optimizer, caches, and file system interaction, and provides step‑by‑step guidance on creating, opening, locking, scanning tables and building a custom storage engine.

Raymond Ops
Raymond Ops
Raymond Ops
Unlock MySQL’s Pluggable Storage Engine: Architecture, APIs, and DIY Guide

Supporting multiple storage engines is a well‑known MySQL feature and a key architectural advantage. Understanding how MySQL Server interacts with storage engines via the API greatly helps grasp MySQL’s core infrastructure.

MySQL Logical Architecture

MySQL is a large network program and data‑management system with a complex architecture. The diagram below shows its logical layers.

Connectors

MySQL operates as a network program that defines its own application‑layer protocol on top of TCP. Clients can use native C API, JDBC, PHP connectors, ODBC, or other SDKs, which ultimately communicate with the server via the MySQL protocol.

Connection Management

Each TCP connection is bound to a thread; MySQL caches threads or uses a thread pool to avoid the overhead of frequent thread creation and destruction.

After a client connects, authentication based on username, host, and password is performed, optionally using SSL/TLS certificates.

SQL Interface

MySQL supports DML, DDL, stored procedures, views, triggers, and user‑defined functions.

Parser

The parser builds a syntax tree for each query, enriches it with metadata from the data dictionary, checks permissions, and performs query rewrite.

Optimizer

Based on the syntax tree and statistics, the optimizer decides table access order, chooses indexes, and generates an execution plan. The plan’s actual data operations are carried out through the pluggable storage‑engine API, independent of the engine implementation.

Caches & Buffers

MySQL maintains caches such as the Query Cache, which stores the result of a SELECT statement to avoid re‑parsing, optimizing, and executing the query again.

Pluggable Storage Engine

Storage engines implement the MySQL‑defined storage‑engine API, either partially or fully. MySQL can dynamically install or remove engines, run multiple engines simultaneously, and assign different engines to individual tables. Engines manage table data, indexes, runtime caches, buffers, transactions, and logs on top of the file system.

MySQL 5.7.11 supports the following engines by default:

<code>+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| Engine             | Support | Comment                                                        | Transactions | XA   | Savepoints |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
| InnoDB             | DEFAULT | Supports transactions, row-level locking, and foreign keys   | YES          | YES  | YES        |
| MRG_MYISAM         | YES     | Collection of identical MyISAM tables                         | NO           | NO   | NO         |
| MEMORY             | YES     | Hash based, stored in memory, useful for temporary tables      | NO           | NO   | NO         |
| BLACKHOLE          | YES     | /dev/null storage engine (anything you write to it disappears) | NO           | NO   | NO         |
| MyISAM             | YES     | MyISAM storage engine                                          | NO           | NO   | NO         |
| CSV                | YES     | CSV storage engine                                            | NO           | NO   | NO         |
| ARCHIVE            | YES     | Archive storage engine                                        | NO           | NO   | NO         |
| PERFORMANCE_SCHEMA | YES     | Performance Schema                                            | NO           | NO   | NO         |
| FEDERATED          | NO      | Federated MySQL storage engine                                 | NULL         | NULL | NULL       |
+--------------------+---------+----------------------------------------------------------------+--------------+------+------------+
</code>

File System

All data—databases, table definitions, rows, indexes—are stored as files in the underlying file system. Some engines like InnoDB can operate directly on raw devices, but modern file systems make this unnecessary. Under the file system, local disks, DAS, NAS, SAN, etc., can be used.

Storage Engine API

MySQL defines a set of storage‑engine APIs that enable a plugin architecture. The API consists of virtual functions in the

handler

class, described in

sql/handler.h

.

<code>/**
  The handler class is the interface for dynamically loadable
  storage engines. Do not add ifdefs and take care when adding or
  changing virtual functions to avoid vtable confusion

  Functions in this class accept and return table columns data. Two data
  representation formats are used:
  1. TableRecordFormat - Used to pass [partial] table records to/from
     storage engine

  2. KeyTupleFormat - used to pass index search tuples (aka "keys") to
     storage engine. See opt_range.cc for description of this format.

  TableRecordFormat
  =================
  [Warning: this description is work in progress and may be incomplete]
  The table record is stored in a fixed-size buffer:

      record: null_bytes, column1_data, column2_data, ...

  // (content omitted for brevity)
*/
class handler : public Sql_alloc {
    // (implementation omitted; see ./sql/handler.h for details)
}
</code>

The following sections describe selected API groups.

Create, Open, and Close Tables

Creating a table uses the

create

function:

<code>int create(const char *name, TABLE *form, HA_CREATE_INFO *info);
</code>

Opening a table uses

open

:

<code>int open(const char *name, int mode, int test_if_locked);
</code>

Closing a table uses

close

:

<code>int close(void);
</code>

Table Locking

When a client issues

LOCK TABLE

, the storage engine locks the table via

external_lock

:

<code>int ha_example::external_lock(THD *thd, int lock_type);
</code>

Full Table Scan

Initialize a full scan with

rnd_init

and fetch the next row with

rnd_next

:

<code>virtual int rnd_init(bool scan);
virtual int rnd_next(byte *buf);
</code>

Index Access

Before using an index, call

index_init

; after finishing, call

index_end

. The engine provides functions to read the first, next, previous, last rows, and to read a specific key:

<code>int index_init(uint keynr, bool sorted);
int index_end(uint keynr, bool sorted);
int ha_index_first(uchar *buf);
int ha_index_next(uchar *buf);
int ha_index_prev(uchar *buf);
int ha_index_last(uchar *buf);
int index_read(uchar *buf, const uchar *key, uint key_len, enum ha_rkey_function find_flag);
</code>

Transaction Handling

Key transaction functions are

start_stmt

,

rollback

, and

commit

:

<code>int my_handler::start_stmt(THD *thd, thr_lock_type lock_type);
int (*rollback)(THD *thd, bool all);
int (*commit)(THD *thd, bool all);
</code>

How to Write Your Own Storage Engine

The official MySQL documentation provides a guide for implementing a custom storage engine. As a starting point, you can copy the source of the example engine (

ha_example

) and rename it:

<code>sed -e s/EXAMPLE/FOO/g -e s/example/foo/g ha_example.h > ha_foo.h
sed -e s/EXAMPLE/FOO/g -e s/example/foo/g ha_example.cc > ha_foo.cc
</code>
Storage EnginepluginDatabase ArchitectureMySQLAPI
Raymond Ops
Written by

Raymond Ops

Linux ops automation, cloud-native, Kubernetes, SRE, DevOps, Python, Golang and related tech discussions.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.