MySQL Index Storage: Clustered vs Secondary Indexes in MyISAM and InnoDB
This article explains how MySQL stores indexes for the MyISAM and InnoDB engines, comparing clustered (primary) indexes and secondary indexes, illustrating their B+‑tree structures, showing SQL examples, and discussing the advantages and disadvantages of each storage method for database optimization.
The previous chapter introduced B+‑tree index organization and why most databases use it for disk storage, range scans, and efficient lookups. This article focuses on the two common MySQL storage engines—MyISAM and InnoDB—and compares their index storage methods.
Clustered Index (Primary Index) – The physical order of table rows matches the logical order of the index. Only one clustered index can exist per table. In MySQL, only InnoDB tables support clustered indexes; the table itself is an index‑organized table (IOT) where leaf nodes store the primary key and the full row data, enabling fast full‑table scans in primary‑key order.
Secondary Index (Non‑Clustered Index) – Leaf nodes store the indexed column value plus the primary‑key value. All InnoDB tables have secondary indexes for columns other than the primary key, while MyISAM, MEMORY, etc., store only non‑clustered indexes. In MyISAM, indexes and row data are stored separately.
MyISAM Example
Assume a MyISAM table t1 with columns ID (primary key), name, gender, age, phone_number, and an index on age . The primary‑key index and the age secondary index are both stored as separate B+‑trees.
Because MyISAM stores indexes separately from rows, both primary‑key and secondary‑index lookups require an extra sorting step.
Example without explicit ordering (SQL 1) returns rows in insertion order, while ordering by id (SQL 2) forces a sort:
mysql
# SQL 1
mysql> select * from t1;
+-------+----------+--------+------+--------------+
| id | username | gender | age | phone_number |
+-------+----------+--------+------+--------------+
| 10001 | 小花 | 女 | 18 | 18501877098 |
| 10005 | 小李 | 女 | 21 | 15827654555 |
| 10006 | 小白 | 男 | 38 | 19929933000 |
| 10009 | 小何 | 男 | 35 | 19012378676 |
| 10002 | 小王 | 男 | 20 | 17760500293 |
| 10003 | 小赵 | 女 | 29 | 13581386000 |
| 10004 | 小青 | 女 | 25 | 13456712000 |
| 10007 | 小米 | 男 | 23 | 19800092354 |
| 10008 | 小徐 | 女 | 22 | 18953209331 |
+-------+----------+--------+------+--------------+
9 rows in set (0.00 sec)
# SQL 2
mysql> select * from t1 order by id;
+-------+----------+--------+------+--------------+
| id | username | gender | age | phone_number |
+-------+----------+--------+------+--------------+
| 10001 | 小花 | 女 | 18 | 18501877098 |
| 10002 | 小王 | 男 | 20 | 17760500293 |
| 10003 | 小赵 | 女 | 29 | 13581386000 |
| 10004 | 小青 | 女 | 25 | 13456712000 |
| 10005 | 小李 | 女 | 21 | 15827654555 |
| 10006 | 小白 | 男 | 38 | 19929933000 |
| 10007 | 小米 | 男 | 23 | 19800092354 |
| 10008 | 小徐 | 女 | 22 | 18953209331 |
| 10009 | 小何 | 男 | 35 | 19012378676 |
+-------+----------+--------+------+--------------+
9 rows in set (0.00 sec)InnoDB Indexes
InnoDB tables are index‑organized; the clustered index stores the full row data. Leaf nodes contain the primary‑key value and the row, while non‑leaf nodes store only the primary‑key values. This design yields two main benefits:
Rows are stored in primary‑key order, eliminating the need for an extra sort when scanning the table.
Leaf pages contain pointers to the previous and next leaf, making page splits and inserts cheap.
Example of a full table scan on InnoDB (rows already appear in primary‑key order):
mysql
mysql> select * from t1;
+-------+----------+--------+------+--------------+
| id | username | gender | age | phone_number |
+-------+----------+--------+------+--------------+
| 10001 | 小花 | 女 | 18 | 18501877098 |
| 10002 | 小王 | 男 | 20 | 17760500293 |
| 10003 | 小赵 | 女 | 29 | 13581386000 |
| 10004 | 小青 | 女 | 25 | 13456712000 |
| 10005 | 小李 | 女 | 21 | 15827654555 |
| 10006 | 小白 | 男 | 38 | 19929933000 |
| 10007 | 小米 | 男 | 23 | 19800092354 |
| 10008 | 小徐 | 女 | 22 | 18953209331 |
| 10009 | 小何 | 男 | 35 | 19012378676 |
+-------+----------+--------+------+--------------+
9 rows in set (0.00 sec)InnoDB secondary indexes store the indexed column value plus the primary‑key value. The non‑leaf nodes hold the indexed column (e.g., age ), and leaf nodes hold age and the corresponding primary‑key.
Advantages of this design include reduced maintenance when rows move or pages split, because only the clustered index needs updating.
Drawbacks:
The secondary index size grows because it also stores the primary‑key value, which can be problematic with wide or UUID primary keys.
Retrieving a row via a secondary index requires two lookups: first the secondary index to get the primary key, then the clustered index to fetch the row (a “back‑lookup”).
Example query to find rows where age = 23 :
mysql
select * from t1 where age = 23;This is executed as two steps:
mysql
select id from t1 where age=23; -- finds primary key 10005
mysql
select * from t1 where id = 10005; -- fetches the full row (may require a back‑lookup)MySQL optimizes this process with data pre‑warming and other internal mechanisms.
In summary, MyISAM uses separate non‑clustered indexes that require extra sorting, while InnoDB’s clustered index stores rows in primary‑key order and secondary indexes provide efficient lookups at the cost of additional storage and a potential double‑lookup.
The next article will discuss best practices for designing primary keys in MySQL.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.