Master MySQL Indexes: BTREE, Hash, and Query Optimization Secrets

This article explains MySQL's indexing mechanisms, including BTREE and hash structures, page layout, index types, and practical query patterns. It also shows how to interpret EXPLAIN output and avoid common pitfalls that hurt database performance.

ITPUB

Dec 21, 2015

MySQL is a free, high-performance relational database. Row-level locking, transaction isolation, MVCC, and its storage engines all contribute to that performance, but effective use of indexes remains essential for fast query execution.

MySQL Query Execution Flow

The client sends a SQL statement to the server. In MySQL 5.7 and earlier, the server first checks the query cache (the query cache was removed in MySQL 8.0). On a cache miss, the server parses and preprocesses the statement, and the optimizer decides whether an index (e.g., a BTREE index) can be used and generates a query plan. The server then invokes the storage engine API, retrieves the data, and returns the result.

BTREE Index Structure

MySQL's default index type is BTREE, which InnoDB implements as a B+-tree. Nodes are stored in pages (16 KB by default in InnoDB). Each page holds keys in sorted order (key1 < key2 < … < keyN) together with pointers to child pages, so a lookup can binary-search within a page and a range scan can read keys in order.
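Why sorted keys matter can be shown with a minimal sketch. It models a single run of sorted keys as a Python list, which is an assumption made for illustration; it is not how InnoDB lays out a page. The names `keys`, `point_lookup`, and `range_scan` are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical keys of one index level, kept in sorted order as in a B+-tree.
keys = [3, 8, 15, 21, 34, 55, 89]

def point_lookup(sorted_keys, key):
    """Binary-search for key; return its position, or None if it is absent."""
    i = bisect_left(sorted_keys, key)
    if i < len(sorted_keys) and sorted_keys[i] == key:
        return i
    return None

def range_scan(sorted_keys, low, high):
    """Return every key with low <= key <= high as one contiguous slice."""
    start = bisect_left(sorted_keys, low)
    end = bisect_right(sorted_keys, high)
    return sorted_keys[start:end]

print(point_lookup(keys, 21))
print(range_scan(keys, 10, 40))
```

Because the keys are ordered, the range scan only needs to locate its two endpoints; everything in between is already adjacent.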

In InnoDB, the leaf nodes of the primary key index store the actual row data, which makes it a clustered index: the value stored with each leaf key is the row itself. In MyISAM, by contrast, the index stores a pointer to the row's location in a separate data file.

Pages and Page Splits

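Each page is a fixed-size block (16 KB by default in InnoDB). When rows are inserted in key order, InnoDB fills each page to about 15/16 of its capacity and then starts a new page, leaving the remainder free for later changes. When a row must go into the middle of a page that is already full, InnoDB performs a page split: it allocates a new page and moves part of the existing rows into it. Frequent splits degrade performance and fragment the index, which is why keys that arrive out of order (e.g., UUIDs or MD5 strings) cost more than auto-increment keys.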
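The effect of insert order on page splits can be illustrated with a toy simulation. This is a sketch under strong assumptions: pages hold only four keys, there is a single level of pages, and the 15/16 fill behaviour is ignored. `PAGE_CAPACITY` and `insert_all` are hypothetical names, not part of MySQL.

```python
import random
from bisect import insort

PAGE_CAPACITY = 4  # hypothetical tiny page; a real 16 KB page holds many rows

def insert_all(keys, capacity=PAGE_CAPACITY):
    """Insert unique keys into sorted, fixed-capacity pages.

    Returns (pages, splits), where splits counts mid-page splits only.
    """
    pages = [[]]
    splits = 0
    for key in keys:
        # Target the first page whose largest key is >= key, else the last page.
        idx = next((i for i, p in enumerate(pages) if p and p[-1] >= key),
                   len(pages) - 1)
        page = pages[idx]
        if len(page) < capacity:
            insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Append at the right edge: start a fresh page, nothing is moved.
            pages.append([key])
        else:
            # Insert into a full page: move the upper half to a new page.
            mid = capacity // 2
            new_page = page[mid:]
            del page[mid:]
            pages.insert(idx + 1, new_page)
            insort(page if key < new_page[0] else new_page, key)
            splits += 1
    return pages, splits

sequential = list(range(1, 21))
shuffled = sequential[:]
random.seed(7)  # fixed seed so repeated runs insert in the same order
random.shuffle(shuffled)

print("sequential splits:", insert_all(sequential)[1])
print("shuffled splits:  ", insert_all(shuffled)[1])
```

In this model, ascending keys always land at the right edge and never force existing rows to move, while out-of-order keys regularly hit full pages in the middle of the key range.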

Index Types in InnoDB

Primary key index – automatically created, clustered, unique, and non‑null.

Secondary (auxiliary) index – stores the indexed column(s) plus the primary key value; during a lookup MySQL first finds the secondary index entry, then uses the stored primary key to fetch the row.

Composite index – covers multiple columns; column order matters, because a query can use the index only through a leftmost prefix of its columns.

Prefix index – indexes only the first N characters of a string column (N bytes for binary types), useful for long VARCHAR/TEXT columns to save space.

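Hash index – stores hash values of the indexed columns; works only for equality lookups (=, <=>, IN) and cannot support range scans or ORDER BY. Explicit hash indexes are available in the MEMORY engine; InnoDB does not let you create one and instead maintains an adaptive hash index internally.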
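The two-step lookup through a secondary index, described in the list above, can be sketched as follows. The table contents and the names `clustered`, `name_index`, and `find_by_name` are hypothetical; a dict and a sorted list stand in for the real B+-trees.

```python
from bisect import bisect_left

# Hypothetical rows keyed by primary key, standing in for the clustered index.
clustered = {
    1: {"id": 1, "name": "July", "age": 23},
    2: {"id": 2, "name": "Mary", "age": 31},
    3: {"id": 3, "name": "July", "age": 40},
}

# Secondary index on name: sorted (name, primary key) pairs, no other columns.
name_index = sorted((row["name"], pk) for pk, row in clustered.items())

def find_by_name(name):
    """Step 1: read matching primary keys from the secondary index.
    Step 2: fetch each full row from the clustered index by primary key."""
    pks = []
    i = bisect_left(name_index, (name,))
    while i < len(name_index) and name_index[i][0] == name:
        pks.append(name_index[i][1])
        i += 1
    return [clustered[pk] for pk in pks]

print(find_by_name("July"))
```

The second step is the extra cost of a secondary index lookup; it disappears when the secondary index already contains every column the query needs.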

How Queries Use Indexes

Full‑value match : All indexed columns are specified, e.g.,

SELECT * FROM staffs WHERE name='July' AND age='23' AND pos='dev';

Left-most prefix : Only the leading columns of a composite index are specified, e.g.,

SELECT * FROM staffs WHERE name='July' AND age='23';

Column-prefix (LIKE 'J%') : The index can be used for left-anchored patterns; MySQL cannot use a B-tree for patterns with a leading wildcard, such as LIKE '%y'.

Range queries : The index supports range conditions on string, numeric, and timestamp columns alike, e.g.,

SELECT * FROM staffs WHERE name > 'Mary';

Mixed exact + range : e.g.,

SELECT * FROM staffs WHERE name='July' AND age>25;

Covering index : All columns needed by the query are present in the index, so MySQL can satisfy the query using only the index (EXPLAIN shows Using index in the Extra column).

Prefix index : Indexes a fixed number of leading characters; its usefulness depends on the column’s selectivity (unique‑value ratio).
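One way to judge a prefix length is to compare the ratio of distinct prefixes to total rows for several lengths. The sketch below does this over an in-memory list; the sample values and the name `prefix_selectivity` are hypothetical. In SQL, a comparable figure could be computed with COUNT(DISTINCT LEFT(col, n)) / COUNT(*).

```python
def prefix_selectivity(values, prefix_len):
    """Ratio of distinct prefixes to total rows; 1.0 means every prefix is unique."""
    prefixes = {v[:prefix_len] for v in values}
    return len(prefixes) / len(values)

# Hypothetical column values.
emails = ["alice@example.com", "alina@example.com",
          "bob@example.org", "bobby@example.org"]

for n in (1, 3, 4):
    print(n, prefix_selectivity(emails, n))
```

A reasonable choice is the shortest prefix whose selectivity comes close to that of the full column.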

Index Limitations and Bad Practices

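Skipping a column in a composite index on (a, b, c) (e.g., WHERE a=1 AND c=3) limits the lookup to the columns before the gap: the index is still used for a, but c cannot narrow the search.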

A range condition on one index column (e.g., WHERE a>1 AND b=2) prevents the columns after it from narrowing the lookup.

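The order in which conditions are written does not matter: WHERE c=3 AND b=2 AND a=1 uses an index on (a, b, c) just as WHERE a=1 AND b=2 AND c=3 does, because the optimizer matches predicates to index columns. What defeats the left-most rule is omitting the leading column, e.g., WHERE b=2 AND c=3.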

Applying functions or expressions to indexed columns (e.g., WHERE SUBSTR(a,1,3)='hhh') disables index usage.

For LIKE patterns, keep the wildcard at the end (col LIKE 'abc%'); a leading wildcard (col LIKE '%abc') prevents a B-tree range lookup.
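The left-most prefix rule can be expressed as a short walk over the index columns. This is a simplified model, not the optimizer's actual algorithm: it ignores features such as index condition pushdown and skip scan, and `usable_index_parts` is a hypothetical helper.

```python
def usable_index_parts(index_columns, equality_cols, range_cols=()):
    """Return the leading index columns that can narrow a B-tree lookup.

    Walk the index left to right: an equality column is used and the walk
    continues; a range column is used and the walk stops; a column with no
    condition stops the walk immediately.
    """
    used = []
    for col in index_columns:
        if col in equality_cols:
            used.append(col)
        elif col in range_cols:
            used.append(col)
            break
        else:
            break
    return used

index = ("a", "b", "c")
print(usable_index_parts(index, {"a", "b", "c"}))       # all three conditions
print(usable_index_parts(index, {"a", "c"}))            # b is skipped
print(usable_index_parts(index, {"a", "c"}, {"b"}))     # range on b
print(usable_index_parts(index, {"b", "c"}))            # leading column missing
```

The conditions are passed as sets on purpose: the order in which they appear in the WHERE clause plays no role in the result.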

ORDER BY and Index Interaction

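ORDER BY can use an index to avoid a sort when the sort columns form a left-most prefix of the index (leading columns fixed by equality conditions in WHERE may be skipped) and all sort in the same direction. Queries that order by a non-leftmost column without fixing the columns before it, mix ASC and DESC, or order by columns not in the index fall back to a filesort. MySQL 8.0 adds descending indexes, so a mixed-direction ORDER BY can use an index whose column directions match it.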
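These ORDER BY rules can be captured in a small checker. It is a sketch that assumes an index with all columns ascending and treats leading columns pinned by an equality condition as skippable; `order_by_can_use_index` and its parameters are hypothetical, and the real optimizer considers more cases.

```python
def order_by_can_use_index(index_columns, order_by, constant_cols=()):
    """order_by is a list of (column, direction) pairs, e.g. [("name", "ASC")].

    constant_cols are columns fixed to a single value by equality in WHERE.
    """
    sort_cols = [col for col, _ in order_by]
    directions = {direction.upper() for _, direction in order_by}
    if len(directions) > 1:
        return False  # mixed ASC/DESC cannot follow an all-ascending index
    remaining = list(index_columns)
    # A leading column pinned to one value does not affect the row order.
    while remaining and remaining[0] in constant_cols and remaining[0] not in sort_cols:
        remaining.pop(0)
    return remaining[:len(sort_cols)] == sort_cols

index = ("name", "age", "pos")
print(order_by_can_use_index(index, [("name", "ASC"), ("age", "ASC")]))
print(order_by_can_use_index(index, [("age", "ASC")]))
print(order_by_can_use_index(index, [("age", "ASC")], constant_cols={"name"}))
print(order_by_can_use_index(index, [("name", "ASC"), ("age", "DESC")]))
```

A uniform DESC ordering passes the check because an ascending index can be read backward.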

Clustered vs. Covering vs. Hash Indexes

InnoDB’s primary key is a clustered index: the leaf nodes store the full row, so there can be only one per table. A covering index contains all columns required by the query, allowing MySQL to return results directly from the index without touching the table. Hash indexes store only hash values and support only equality comparisons.
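The equality-only nature of a hash index follows from its structure, as this minimal sketch suggests. A Python dict stands in for the hash index, and the data and function names are hypothetical.

```python
# Hypothetical mapping of primary key -> age.
ages = {101: 23, 102: 31, 103: 23, 104: 45}

# Hash-style index: indexed value -> list of primary keys.
hash_index = {}
for pk, age in ages.items():
    hash_index.setdefault(age, []).append(pk)

def hash_equal(age):
    """Equality lookup: one hash probe finds the bucket."""
    return hash_index.get(age, [])

def hash_greater_than(age):
    """Range lookup: buckets are unordered, so every one must be examined."""
    return sorted(pk for a, pks in hash_index.items() if a > age for pk in pks)

print(hash_equal(23))
print(hash_greater_than(25))
```

The range function still returns the right rows, but only by visiting every bucket, which is the work an index is supposed to avoid.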

Understanding EXPLAIN’s key_len

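The key_len column shows the total byte length of the index parts used by the query, which tells you how many columns of a composite index are actually in play. It depends on the column definitions: an INT takes 4 bytes; a CHAR(N) takes N times the maximum bytes per character of its character set (3 for utf8, 4 for utf8mb4); a VARCHAR(N) takes the same plus 2 bytes for the length prefix; and any column that allows NULL adds 1 byte. By creating test tables and running EXPLAIN, you can see how MySQL reports these lengths.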
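The arithmetic behind key_len can be reproduced with a small calculator. It is a sketch covering only a few column types, under these assumptions: integer types use their fixed storage size, CHAR(N) uses N times the maximum bytes per character, VARCHAR(N) adds 2 length bytes, and a nullable column adds 1 byte. The function `key_len` and its dict-based column description are hypothetical.

```python
FIXED_SIZES = {"TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8}

def key_len(parts, bytes_per_char=3):
    """Estimate EXPLAIN's key_len for the index parts a query uses.

    parts: list of dicts with "type", optional "length" (characters, for
    CHAR/VARCHAR) and optional "nullable". bytes_per_char is 3 for utf8
    and 4 for utf8mb4.
    """
    total = 0
    for part in parts:
        col_type = part["type"]
        if col_type in FIXED_SIZES:
            size = FIXED_SIZES[col_type]
        elif col_type == "CHAR":
            size = part["length"] * bytes_per_char
        elif col_type == "VARCHAR":
            size = part["length"] * bytes_per_char + 2  # 2-byte length prefix
        else:
            raise ValueError(f"unsupported column type: {col_type}")
        if part.get("nullable", False):
            size += 1  # NULL flag
        total += size
    return total

# Hypothetical index on (name VARCHAR(24) NOT NULL, age INT NOT NULL), utf8.
name = {"type": "VARCHAR", "length": 24}
age = {"type": "INT"}
print(key_len([name]))        # only the first index part is used
print(key_len([name, age]))   # both parts are used
```

Comparing the reported key_len against these per-column sizes shows how far into a composite index a given query reaches.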

Conclusion

Effective MySQL indexing requires understanding BTREE page layout, page splits, the distinction between primary, secondary, composite, prefix, and hash indexes, and how queries map to these structures. Using EXPLAIN, checking key_len, and avoiding common pitfalls (function calls on indexed columns, skipped leading columns, leading wildcards) will help you achieve high-performance data access.
