Understanding MySQL Indexes: Types, Implementation, and Best Practices
This article explains what MySQL indexes are, the different categories such as ordinary, unique, composite, clustered and non‑clustered, how B‑Tree, B+Tree and hash indexes are implemented in InnoDB and MyISAM, and why auto‑increment primary keys are recommended for optimal performance.
What Is an Index?
An index is a data structure that helps MySQL retrieve rows efficiently.
What Can an Index Do?
It improves the speed of data queries.
1. Classification of Indexes
By storage structure: B‑Tree (B‑Tree or B+Tree), Hash, full‑text, and R‑Tree indexes.
By application level: ordinary index, unique index, composite index.
By physical vs. logical order: clustered index and non‑clustered index.
Ordinary index contains a single column; a table can have many of them. Unique index requires column values to be unique (NULLs are allowed). Composite index includes multiple columns.
Clustered index is not a separate type but a storage method; InnoDB stores the B+Tree and the row data together. Non‑clustered index is any index that is not clustered.
2. Underlying Implementation
InnoDB (the default engine) explicitly supports B‑Tree (technically B+Tree) indexes. For frequently accessed tables it creates an adaptive hash index on top of the B‑Tree, which is transparent to the client.
Hash Index
Implemented with a hash table; it is only useful for queries that match all indexed columns exactly. Each row’s indexed columns are hashed, and the hash code together with a pointer to the row is stored in the index.
B‑Tree / B+Tree Index
B‑Tree speeds up data access by avoiding full table scans; data are distributed across nodes. B+Tree, an improvement over B‑Tree, stores all data in leaf nodes and adds sequential pointers, allowing efficient range scans with only two leaf lookups.
In MySQL, B+Tree is the default index structure because it provides ordered access, balanced height, and low I/O cost for both point and range queries.
3. Engine‑Specific Implementations
MyISAM: Uses a two‑level index structure similar to secondary indexes.
InnoDB: Stores the primary key as a clustered B+Tree; secondary indexes contain the primary key as a pointer.
4. Common Questions
Why not use hash, binary tree, or red‑black tree as the default index? Hash indexes lack ordering and incur high I/O for range queries. Binary trees are not self‑balancing, leading to variable height and poor I/O performance. Red‑black trees grow in height with data size, also increasing I/O cost.
Why is an auto‑increment primary key recommended? Because it produces sequential values, minimizing page splits and data movement during inserts; each new row is appended to the end of the index.
Author & Source
Author: 浪人~ (Làngrén). Original article: cnblogs.com/liqiangchn/p/9060521.html
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.