Using CRC32/MD5 Hashes and Prefix Indexes for Efficient Long String Indexing in MySQL
The article explains how to improve indexing of long string columns in MySQL by using generated columns with CRC32/MD5 hashes, functional indexes introduced in MySQL 8.0.13, and prefix indexes, providing code examples and performance considerations.
When indexing long string columns in MySQL, the index can become large and inefficient; a common solution is to store a CRC32 or MD5 hash of the string in a generated column and index that column.
create table website(
id int unsigned not null,
web varchar(100) not null,
webcrc int unsigned generated always as (crc32(web)) not null,
primary key (id)
);Inserting a row shows the automatically generated CRC value, which can be indexed to provide a compact and fast index.
insert into website(id,web) values (1,"https://www.scutech.com");
select * from website;From MySQL 8.0.13 onward, functional indexes can be created directly without a generated column.
create table website8(
id int unsigned not null,
web varchar(100) not null,
primary key (id),
index ((crc32(web)))
);Showing the indexes confirms the primary key and the functional index on crc32(web).
Another technique for long strings is a prefix index, where only the first N characters are indexed; the optimal prefix length balances selectivity and storage.
alter table sbtest2 add index (c(9));Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.