
Understanding MySQL Indexes: Types, B+Tree Structure, and Clustered vs. Non‑Clustered Indexes

This article explains what MySQL indexes are, how they work, their advantages and drawbacks, the different index types, the B+Tree storage mechanism compared with B‑Tree, and the distinction between clustered and non‑clustered (auxiliary) indexes, providing practical insights for database performance optimization.

Architect's Guide

MySQL indexes are data structures that improve query speed by allowing fast lookup of rows, similar to a book's table of contents.

When a query such as SELECT * FROM user WHERE id = 40 runs without an index on id, MySQL must scan the whole table. With an index, it descends the index tree, using a search within each page that is comparable to a binary search, and locates the row after reading only a few pages.
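The difference can be illustrated with a small sketch. This is not how MySQL is implemented: a sorted Python list searched with `bisect` stands in for the index, and the `user` rows, the helper names `full_scan` and `indexed_lookup`, and the table size are all assumptions made for the example.

```python
from bisect import bisect_left

def full_scan(rows, target_id):
    """Check rows one by one in storage order, as a table scan would."""
    examined = 0
    for row in rows:
        examined += 1
        if row["id"] == target_id:
            return row, examined
    return None, examined

def indexed_lookup(sorted_ids, rows_by_id, target_id):
    """Binary-search a sorted key list, then fetch the matching row directly."""
    pos = bisect_left(sorted_ids, target_id)
    if pos < len(sorted_ids) and sorted_ids[pos] == target_id:
        return rows_by_id[target_id]
    return None

# Hypothetical `user` table with ids 1..1000, stored in no useful order.
rows = [{"id": i, "name": f"user{i}"} for i in range(1000, 0, -1)]

# The "index": the keys kept in sorted order, plus a way to reach each row.
sorted_ids = sorted(r["id"] for r in rows)
rows_by_id = {r["id"]: r for r in rows}

row, examined = full_scan(rows, 40)
print("full scan examined", examined, "rows")
print("indexed lookup found", indexed_lookup(sorted_ids, rows_by_id, 40))
```

The scan touches rows until it happens to reach id 40, while a binary search over 1000 sorted keys needs roughly 10 comparisons (log2 of 1000).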

Indexes greatly accelerate data retrieval but incur costs: they consume disk space, require resources to maintain, and can slow down insert, update, and delete operations.

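Advantages of indexes:
1. They greatly speed up data queries.
Disadvantages of indexes:
1. Maintaining indexes consumes database resources.
2. Indexes take up disk space.
3. Inserts, updates, and deletes become slower, because the indexes must be maintained along with the table data.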

MySQL provides several index categories:

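1. Primary key index: once a column is defined as the primary key, the database automatically creates a primary key index for it. In InnoDB this is the clustered index.
2. Normal index: an index built on an ordinary column, with no uniqueness restriction, used to speed up queries.
3. Composite index: an index built on several columns together. Only primary key columns must be NOT NULL; the columns of an ordinary composite index may contain NULL values.
4. Full-text index (before MySQL 5.6, available only in MyISAM; InnoDB supports it from 5.6 onward): an index built on large text columns, used mainly to search for keywords in the text.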

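The underlying storage mechanism for InnoDB indexes is a B+Tree. Internal nodes store only key values and pointers to child pages, while the leaf pages store the data (for the clustered index, the full rows). Because internal pages carry no row data, each one holds many more keys, which reduces tree depth and I/O operations compared with a classic B‑Tree, where data resides in every node.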

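In a B‑Tree, every node stores data. An InnoDB page is 16 KB, so when row data takes up space in every page, each page can hold only a few keys and the tree becomes deeper. Each page MySQL reads from disk costs one I/O operation, so a deeper tree means more I/O per lookup, and most query optimization ultimately comes down to reducing I/O. In a B+Tree the data lives only in the leaf nodes, so the tree stays shallow (typically around three levels, even for tables with millions of rows).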
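The depth argument above can be checked with rough arithmetic. The sketch below is a back-of-the-envelope estimate, not InnoDB's actual page layout: the 8-byte key, 6-byte child pointer, and 1 KB average row are assumptions, and page headers and fill factor are ignored.

```python
PAGE_SIZE = 16 * 1024   # InnoDB page size from the text: 16 KB
KEY_BYTES = 8           # assumption: BIGINT primary key
POINTER_BYTES = 6       # assumption: size of a child-page pointer
ROW_BYTES = 1024        # assumption: average row size of 1 KB

# B+Tree: internal pages hold only (key, pointer) pairs; leaves hold rows.
fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)
rows_per_leaf = PAGE_SIZE // ROW_BYTES

def bplus_levels(n_rows):
    """Levels (leaf level included) a B+Tree needs to address n_rows."""
    levels, capacity = 1, rows_per_leaf
    while capacity < n_rows:
        capacity *= fanout          # one more internal level on top
        levels += 1
    return levels

def btree_levels(n_rows):
    """Levels needed when every node stores full rows next to its keys."""
    entries_per_node = PAGE_SIZE // (ROW_BYTES + KEY_BYTES + POINTER_BYTES)
    children = entries_per_node + 1
    levels, capacity, nodes_at_level = 1, entries_per_node, 1
    while capacity < n_rows:
        nodes_at_level *= children
        capacity += nodes_at_level * entries_per_node
        levels += 1
    return levels

print("keys per internal page:", fanout)
print("B+Tree levels for 20M rows:", bplus_levels(20_000_000))
print("B-Tree levels for 20M rows:", btree_levels(20_000_000))
```

Under these assumptions an internal page holds 1170 keys, so three B+Tree levels address about 1170 × 1170 × 16 ≈ 21.9 million rows, while the row-in-every-node layout needs seven levels for the same 20 million rows.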

In a B+Tree, only leaf nodes contain the actual rows, making range scans and primary‑key lookups very fast. InnoDB uses this structure for its clustered index, which stores the table data itself in the leaf pages.

Clustered (primary) indexes combine the index and the row data in the same B+Tree, while non‑clustered (auxiliary) indexes store only the indexed columns together with the primary key value of each row. To retrieve a row via an auxiliary index, MySQL first finds the primary key value, then looks up the row in the clustered index, a process known as a "back‑to‑table" lookup. When the auxiliary index already contains every column the query needs, this second step is skipped; such an index is called a covering index.

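In short, the indexes we normally define ourselves are auxiliary indexes. A query that goes through an ordinary index first searches the auxiliary index to obtain the primary key value, then searches the primary key index to fetch the actual row.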
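The two-step lookup can be modeled in a few lines. This is only a conceptual sketch: plain dictionaries stand in for the two B+Trees, and the `clustered` table, the `idx_name` index, and both helper functions are hypothetical names invented for the example.

```python
# Clustered index: primary key -> full row (the table data itself).
clustered = {
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
    3: {"id": 3, "name": "carol", "age": 41},
}

# Auxiliary index on `name`: indexed value -> primary key values only.
idx_name = {"alice": [1], "bob": [2], "carol": [3]}

def find_rows_by_name(name):
    """Full rows by name: auxiliary index first, then the clustered index."""
    rows = []
    for pk in idx_name.get(name, []):   # step 1: name -> primary key
        rows.append(clustered[pk])      # step 2: primary key -> row
    return rows

def find_ids_by_name(name):
    """Only the id is needed, so the auxiliary index alone answers it."""
    return list(idx_name.get(name, []))

print(find_rows_by_name("bob"))   # two lookups per matching row
print(find_ids_by_name("bob"))    # one lookup, no visit to the clustered index
```

The second function shows why selecting only indexed columns (plus the primary key) is cheaper: the trip back to the clustered index is never made.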

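Choosing an auto‑incrementing integer as the primary key is recommended because each new row is appended at the end of the clustered index in insertion order, which minimizes page splits and keeps I/O efficient. Keys that arrive in random order land in the middle of already full pages and force them to split.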
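A toy simulation makes the effect of key order visible. It is a simplified model under stated assumptions, not InnoDB's real algorithm: pages hold only 4 keys, a full page splits in half, and a key larger than everything in the last full page simply starts a new page without moving existing rows. The name `count_splits` is hypothetical.

```python
import random
from bisect import bisect_right, insort

PAGE_CAPACITY = 4   # assumption: tiny pages so that splits are easy to see

def count_splits(keys):
    """Insert keys into sorted fixed-size pages; return (splits, page count)."""
    pages = [[]]    # pages are kept in key order
    splits = 0
    for key in keys:
        # Locate the page whose key range should receive this key.
        first_keys = [p[0] for p in pages[1:]]
        i = bisect_right(first_keys, key)
        page = pages[i]
        if len(page) == PAGE_CAPACITY:
            if i == len(pages) - 1 and key > page[-1]:
                # Appending past the end: open a fresh page, move nothing.
                pages.append([key])
                continue
            # Inserting into a full page: move the upper half to a new page.
            mid = PAGE_CAPACITY // 2
            right = page[mid:]
            del page[mid:]
            pages.insert(i + 1, right)
            splits += 1
            if key >= right[0]:
                page = right
        insort(page, key)
    return splits, len(pages)

sequential = list(range(1, 1001))           # auto-increment style keys
shuffled = sequential[:]
random.Random(42).shuffle(shuffled)         # random-order keys, fixed seed

print("sequential (splits, pages):", count_splits(sequential))
print("random     (splits, pages):", count_splits(shuffled))
```

In this model the sequential run fills 250 pages completely with no splits, while the shuffled run splits repeatedly and leaves many pages partly empty, so it needs more pages for the same 1000 keys.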

Tags: performance, Database, MySQL, Index, B-Tree, clustered index
Written by

Architect's Guide

Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
