
How a Faster CRC-64 Boosted Redis Performance: The CRCSpeed Story

An in‑depth look at how mattsta’s CRCSpeed implementation replaced Redis’s original CRC algorithm, delivering up to four‑fold speed gains, the history of its development from 2014 to its 2020 integration, and the performance impact on RDB generation and cluster slot hashing.

macrozheng

Showtime

I discovered an old 2014 article by mattsta claiming that Redis’s CRC implementation was overly simplistic and that a faster version, CRCSpeed, could improve performance fourfold.

A look at the Redis 5.0 source showed that the original CRC code was still in use, but Redis 6.0 adopted the CRCSpeed implementation (commit by antirez on 2020‑04‑28).

Fancy CRCing You Here

The article’s title, “Fancy CRCing You Here”, hints at a deep dive into CRC improvements.

Many projects have copied Redis’s CRC‑64 and CRC‑16 implementations, yet the author felt they deserved a better algorithm.

What’s Wrong

CRC is inherently sequential: each step depends on the checksum state left by the previous byte, which limits parallelism. In Redis, CRC‑64 is used in three places: during cross‑instance key migration (with checksum verification), as an optional checksum for RDB output (used for replication and persistence), and for memory testing. CRC‑16 is used as the hash function for assigning keys to cluster slots.
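To make the cluster slot use concrete, here is a minimal sketch of key-to-slot hashing. It assumes the CRC-16/XMODEM variant (polynomial 0x1021, initial value 0) and the 16384-slot layout described in the Redis Cluster specification. The names `crc16_xmodem` and `key_slot` are hypothetical, the CRC is computed bit by bit for clarity rather than speed, and hash tags (`{...}`) are deliberately ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16/XMODEM: polynomial 0x1021, initial value 0, no bit
 * reflection. Slow but easy to follow; not the code Redis ships. */
uint16_t crc16_xmodem(const unsigned char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        /* Fold the next input byte into the top 8 bits of the state. */
        crc ^= (uint16_t)(buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical helper: map a whole key to one of 16384 slots.
 * Assumption: hash tags are not handled in this sketch. */
unsigned int key_slot(const char *key, size_t len) {
    return crc16_xmodem((const unsigned char *)key, len) & 16383u;
}
```

Because the whole key is fed through the CRC, the cost of picking a slot grows with key length, which is why a faster CRC‑16 matters for large keys.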

What’s Better

Mattsta compared Redis’s CRC‑64 with a high‑speed version from a StackOverflow user named Mark and found Mark’s implementation to be about four times as fast (1.6 GB/s vs 400 MB/s). However, CRC‑64 has many variants, so direct substitution isn’t trivial.

What’s Improved

Redis’s original CRC loops over the input one byte at a time, doing one table lookup per byte. The fast version processes eight bytes per iteration using the “slicing‑by‑8” technique (published by Intel in 2006), which uses eight lookup tables whose results are independent of each other and are combined with XOR, cutting the number of loop iterations by a factor of eight.

For a 500 MB input, the original needs 500 million loops, while the fast version needs only 62.5 million.
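The difference between the two loops can be sketched as follows. This is an illustration rather than the CRCSpeed source: it assumes the Redis CRC‑64 variant (polynomial 0xad93d23594c935a9, bit-reflected, initial value 0, no final XOR), and the names `crc64_init`, `crc64_bytewise`, and `crc64_slice8` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-reflected form of 0xad93d23594c935a9 (assumed Redis variant). */
#define CRC64_POLY_REFLECTED UINT64_C(0x95ac9329ac4bc9b5)

static uint64_t crc64_table[8][256];

/* Build the tables once at startup. crc64_table[0] is the classic
 * byte-at-a-time table; crc64_table[k][n] is the CRC state after byte n
 * followed by k zero bytes. */
void crc64_init(void) {
    for (int n = 0; n < 256; n++) {
        uint64_t crc = (uint64_t)n;
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1) ? (crc >> 1) ^ CRC64_POLY_REFLECTED : crc >> 1;
        crc64_table[0][n] = crc;
    }
    for (int n = 0; n < 256; n++) {
        uint64_t crc = crc64_table[0][n];
        for (int k = 1; k < 8; k++) {
            crc = crc64_table[0][crc & 0xff] ^ (crc >> 8);
            crc64_table[k][n] = crc;
        }
    }
}

/* One loop iteration and one table lookup per input byte. */
uint64_t crc64_bytewise(uint64_t crc, const unsigned char *buf, size_t len) {
    while (len--)
        crc = crc64_table[0][(crc ^ *buf++) & 0xff] ^ (crc >> 8);
    return crc;
}

/* One loop iteration per eight input bytes, eight lookups per iteration. */
uint64_t crc64_slice8(uint64_t crc, const unsigned char *buf, size_t len) {
    while (len >= 8) {
        /* Assemble the next 8 bytes as a little-endian word, portably. */
        uint64_t word = 0;
        for (int i = 0; i < 8; i++)
            word |= (uint64_t)buf[i] << (8 * i);
        crc ^= word;
        crc = crc64_table[7][crc & 0xff] ^
              crc64_table[6][(crc >> 8) & 0xff] ^
              crc64_table[5][(crc >> 16) & 0xff] ^
              crc64_table[4][(crc >> 24) & 0xff] ^
              crc64_table[3][(crc >> 32) & 0xff] ^
              crc64_table[2][(crc >> 40) & 0xff] ^
              crc64_table[1][(crc >> 48) & 0xff] ^
              crc64_table[0][crc >> 56];
        buf += 8;
        len -= 8;
    }
    /* Fewer than 8 bytes remain: finish with the byte-at-a-time loop. */
    return crc64_bytewise(crc, buf, len);
}
```

Call `crc64_init()` once before either function. Both must return the same value for the same input. The eight lookups inside one iteration do not depend on each other, so the CPU can overlap them, whereas each step of the byte-at-a-time loop has to wait for the previous one.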

Result

After a year of work, mattsta produced a CRCSpeed implementation that matches Redis’s CRC‑64 API, runs faster, and can also be used for CRC‑16, eliminating large static lookup tables in the source.
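One way to drop a large static table from the source is to compute it once at startup. The sketch below shows the idea for CRC‑16, assuming the CRC-16/XMODEM parameters (polynomial 0x1021, initial value 0); `crc16_table_init` and `crc16_lookup` are hypothetical names and not the CRCSpeed API.

```c
#include <stddef.h>
#include <stdint.h>

/* Filled in at runtime instead of being written out as 256 literals. */
static uint16_t crc16_table[256];

/* Compute each entry by running the bitwise algorithm on one byte value. */
void crc16_table_init(void) {
    for (int n = 0; n < 256; n++) {
        uint16_t crc = (uint16_t)(n << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
        crc16_table[n] = crc;
    }
}

/* Table-driven CRC-16: one lookup per input byte. */
uint16_t crc16_lookup(uint16_t crc, const unsigned char *buf, size_t len) {
    while (len--)
        crc = (uint16_t)((crc << 8) ^ crc16_table[((crc >> 8) ^ *buf++) & 0xff]);
    return crc;
}
```

The trade-off is a small one-time initialization cost and a required call to `crc16_table_init()` before first use, in exchange for far less generated data in the source tree.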

Real‑World Impact

Faster CRC‑64 speeds up RDB generation, reducing the time the forked child process runs and thus lowering copy‑on‑write memory usage. A quicker CRC‑16 can also cut cluster slot allocation overhead when handling large keys.

Minor Notes

Mattsta didn’t want to reinvent the wheel, but since no suitable one existed, he built a new “wheel”: the fast CRC implementation.

Resources Consulted

The author thanks “A Painless Guide to CRC Error Detection Algorithms” and other references.

Tracking the Timeline

The article was first written on 2014‑12‑22. Mattsta began contributing to redis.io in Dec 2013, opened an issue on 2014‑04‑01, created the CRCSpeed library on 2014‑11‑23, and submitted a PR on the same day as the article. After months of discussion, antirez responded on 2015‑02‑25, requesting reproducible performance tests before merging.

Looking Back

Mattsta’s effort shows a thorough investigation from issue filing to PR submission, highlighting the challenges of getting performance‑critical changes merged into a core project.

Written by

macrozheng

Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
