Understanding MySQL Storage Engines and Index Types (B‑Tree, B+Tree)
This article explains MySQL storage engines, compares MyISAM and InnoDB, introduces the main index types—B‑Tree, Hash, Full‑text, R‑Tree—details the structures and characteristics of B‑Tree, B‑Tree and B+Tree, and outlines key principles for creating effective indexes.
1. Comparison of Storage Engines
MySQL offers several storage engines; the default since MySQL 5.5.5 is InnoDB, while older versions used MyISAM. Different engines have distinct features and performance characteristics.
Note: The B‑tree index mentioned above does not specify whether it is a B‑Tree or B+Tree, which have different definitions.
MySQL supports four main index types: B‑Tree, Hash, Full‑text, and R‑Tree. B‑Tree indexes are the most widely used, supported by all storage engines except Archive (which added limited support in MySQL 5.1).
B‑Tree indexes store keys in a balanced tree structure, ensuring that all leaf nodes are at the same depth and that the shortest path to any leaf node has the same length.
InnoDB actually implements B+Tree for its primary and secondary indexes, adding pointers between leaf nodes to speed up sequential access.
2. Concepts of B‑Tree and B+Tree
B‑Tree
A multi‑way search tree where each non‑leaf node can have up to M children (M > 2). Key properties include:
All keys are stored in the internal nodes.
Non‑leaf nodes contain M/2‑1 to M‑1 keys and M pointers.
All leaf nodes reside at the same level.
Search proceeds by binary searching the sorted keys within a node and descending to the appropriate child pointer.
Characteristics of B‑Tree:
Keys are distributed throughout the tree.
Each key appears in exactly one node.
Search may finish at a non‑leaf node.
Performance is equivalent to a binary search over the entire key set.
Automatic level control ensures balanced height.
B+Tree
B+Tree is a variant of B‑Tree with the following differences:
Non‑leaf nodes have the same number of pointers as keys.
Pointers in non‑leaf nodes point to ranges [K[i], K[i+1]).
All leaf nodes are linked together in a sequential chain.
All keys appear only in leaf nodes (dense index).
Search in a B+Tree always reaches a leaf node, making its performance also equivalent to a binary search over the full key set.
Characteristics of B+Tree:
All keys are stored in the leaf‑node linked list in sorted order.
Non‑leaf nodes act as a sparse index.
Better suited for file‑system indexing.
3. Key Principles for Building Indexes
1. Left‑most prefix rule : MySQL can use an index only up to the first range condition (>, <, BETWEEN, LIKE). Ordering of columns matters for optimal use.
2. Equality and IN conditions can be reordered; MySQL’s optimizer will rearrange them to match the index.
3. High cardinality columns should be chosen as index columns; the cardinality is calculated as count(distinct)/count(*). Values above 0.1 are generally good for join columns.
4. Avoid functions on indexed columns : Expressions like from_unixtime(create_time) prevent index usage because the stored values must be transformed before comparison.
5. Prefer extending existing indexes over creating new ones; for example, modify an existing index on column a to include column b instead of adding a separate (a,b) index.
Source: blog.csdn.net/u013142781/article/details/51706790
Selected Java Interview Questions
A professional Java tech channel sharing common knowledge to help developers fill gaps. Follow us!
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.