Using MySQL 8.0 Histograms to Improve Query Performance
MySQL 8.0's histogram feature provides column value distribution statistics that help the optimizer choose better execution plans, and the article demonstrates how to create, view, and leverage histograms to significantly reduce query costs for range predicates.
MySQL 8.0 introduces histograms (also called column statistics) that provide distribution information for column values, helping the optimizer choose better execution plans.
Histograms can be either equal‑width or equal‑height; MySQL automatically selects the type. They are stored in the meta table column_statistics and configured via the histogram_generation_max_mem_size parameter.
Limitations: not supported for geometry or JSON types, encrypted or temporary tables, columns with completely unique values, and require manual collection.
Example: a table t2 with columns rank1 , rank2 , rank3 , and log_date is cloned to t3 . After inserting data, histograms are created with:
ANALYZE TABLE t3 UPDATE HISTOGRAM ON rank1, log_date;
Querying the histogram data shows bucket boundaries and frequencies, e.g., JSON output from information_schema.column_statistics .
A sample query filtering on rank1 and log_date returns 2269 rows. The EXPLAIN plan without histogram shows a cost of 2796.11, while with histogram the cost drops to 0.71, demonstrating significant performance improvement.
Thus, using MySQL histograms can greatly reduce query cost for range predicates on columns with skewed distributions.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.