
Is Dragonfly Really the Fastest Redis-Compatible Cache? Benchmark Insights

This article examines Dragonfly, an open-source in-memory cache that claims to be the world's fastest Redis-compatible system; summarizes the Redis team's detailed response and benchmark methodology; and presents performance comparisons showing that Redis often outperforms Dragonfly across a range of workloads and configurations.


Dragonfly: The Fastest Memory Cache?

Earlier this year a former Google and Amazon engineer released Dragonfly, an open-source in-memory data cache written in C/C++ and distributed under the Business Source License. Based on its published benchmarks, Dragonfly is touted as possibly the world's fastest in-memory storage system, supporting both the Memcached and Redis protocols while offering higher query throughput and lower runtime memory consumption.

Compared with Redis, Dragonfly claims a 25× performance boost on typical workloads, the ability to handle millions of requests per second on a single server, and 30% lower memory use in a 5 GB storage test. Within two months of release it earned 28K GitHub stars and 1.1K forks.

Redis Response

To counter Dragonfly’s emergence, Redis co‑founder and CTO Yiftach Shoolman, along with Redis Labs chief architect Yossi Gottlieb and performance engineer Filipe Oliveira, published an article titled “After 13 Years, Does Redis Need a New Architecture?”. They argue that Redis’s architecture remains the best for real‑time in‑memory data storage, despite acknowledging some of Dragonfly’s innovative ideas.

Community reactions to the rebuttal were mixed. One reader pushed back on Redis's framing of the original benchmark as unrepresentative:

"It absolutely represents how ordinary users run Redis in the real world. Running a cluster on a single machine just to use more than one core adds unnecessary complexity; if a competitor works on any number of cores with a ‘just works’ approach, it should be easier to set up."

Another saw the response itself as telling:

"Redis’s effort to write this article is a huge compliment to Dragonfly. I’m glad Redis published it, so I must explore Dragonfly—it looks impressive."

The Redis team's post emphasizes that while new architectures periodically appear, Redis's multi-process, shared-nothing design still offers the best performance, scalability, and elasticity for a real-time in-memory data platform.

Performance Comparison

Initial Dragonfly benchmarks compared a single-process Redis instance (one core) with a multi-threaded Dragonfly instance using all available cores. To make the comparison fair, the Redis team tested a 40-shard Redis 7.0 cluster on an AWS c6gn.16xlarge instance, the same maximum instance type the Dragonfly authors reported.

In this adjusted test, Redis’s throughput exceeded Dragonfly’s by 18%–40% while utilizing only 40 of the 64 vCores.

[Figure: Redis vs. Dragonfly performance comparison chart]

Architecture Differences

Redis achieves horizontal scalability by running multiple processes (Redis Cluster) and, in Redis Enterprise, adds management, high availability, persistence, and backup. Running multiple Redis instances per VM provides linear vertical and horizontal scaling, faster replication, and rapid recovery from VM failures.

Redis limits each process to a reasonable size (≤25 GB, ≤50 GB with Redis on Flash) to keep fork overhead low and to simplify shard migration, rebalancing, and scaling.
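As a sizing exercise (the 25 GB cap is from the text above; the dataset size is an invented example, not a figure from the article), the per-process limit translates directly into a minimum shard count:

```python
import math

MAX_SHARD_GB = 25  # per-process cap cited in the text (50 GB with Redis on Flash)

def min_shards(dataset_gb: float, max_shard_gb: float = MAX_SHARD_GB) -> int:
    """Smallest number of Redis processes that keeps every shard under the cap."""
    return math.ceil(dataset_gb / max_shard_gb)

# A hypothetical 1 TB dataset under the 25 GB cap needs at least 40 shards --
# incidentally the same shard count used in the benchmark cluster below.
print(min_shards(1000))  # 40
```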

"Run multiple Redis instances on each virtual machine."

Key design principles include avoiding shared state, leveraging multi‑process concurrency, and prioritizing horizontal scaling, which the authors argue is the most important factor for memory data stores.
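The shared-nothing design works because each key maps deterministically to exactly one shard. Redis Cluster does this with a CRC16 hash modulo 16384 slots; a minimal sketch of that mapping, following the Redis Cluster specification:

```python
def crc16(data: bytes) -> int:
    """CRC16-CCITT (XModem), the checksum Redis Cluster uses for key slots."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    """Map a key to one of Redis Cluster's 16384 hash slots."""
    return crc16(key) % 16384

# The cluster spec's reference check value: crc16("123456789") == 0x31C3.
print(key_slot(b"123456789"))  # 12739
```

Because no state is shared, each process serves only the slots assigned to it, which is what makes the shard migration and rebalancing described above possible.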

Test Details

Version: Redis 7.0.0 built from source; Dragonfly built from the June 3 source snapshot.

Target: Verify Dragonfly’s published results and determine the best OSS Redis 7.0 performance on AWS c6gn.16xlarge.

Client configuration: OSS Redis required many open connections for cluster shards; the best results used two memtier_benchmark processes on the same client VM.

Results Overview

GET, no pipelining (sub-millisecond latency)

OSS Redis: 4.43 M ops/s, average latency 0.383 ms.

Dragonfly claimed 4 M ops/s; our reproduction achieved 3.8 M ops/s, average latency 0.390 ms.

Redis outperformed Dragonfly by 10%–40%.

GET, 30-deep pipeline

OSS Redis: 22.9 M ops/s, avg latency 2.239 ms.

Dragonfly claimed 15 M ops/s; we reproduced 15.9 M ops/s, avg latency 3.99 ms.

Redis beat Dragonfly by 43%–52%.

SET, no pipelining (sub-millisecond latency)

OSS Redis: 4.74 M ops/s, avg latency 0.391 ms.

Dragonfly claimed 4 M ops/s; we reproduced 4 M ops/s, avg latency 0.500 ms.

Redis outperformed Dragonfly by 19%.

SET, 30-deep pipeline

OSS Redis: 19.85 M ops/s, avg latency 2.879 ms.

Dragonfly claimed 10 M ops/s; we reproduced 14 M ops/s, avg latency 4.203 ms.

Redis beat Dragonfly by 42%–99%.
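The percentage ranges above follow directly from the reported throughput numbers; for example, the pipelined SET figures reproduce the 42%–99% range:

```python
def advantage_pct(redis_mops: float, dragonfly_mops: float) -> float:
    """Redis throughput advantage over Dragonfly, as a percentage."""
    return (redis_mops / dragonfly_mops - 1) * 100

# Pipelined SET test: Redis measured 19.85M ops/s; Dragonfly was reproduced
# at 14M ops/s and originally claimed 10M ops/s.
print(f"{advantage_pct(19.85, 14.0):.0f}%")  # 42%  (vs the reproduced result)
print(f"{advantage_pct(19.85, 10.0):.1f}%")  # 98.5%  (vs the original claim)
```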

Benchmark Commands

Examples of the memtier_benchmark commands used for each test variant (GET/SET, with and without 30-deep pipelining) are listed in the original article.
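For readers unfamiliar with the tool, a sketch of what such a command looks like; every value below is an illustrative placeholder, not one of the article's published parameters:

```shell
# Illustrative only: host, counts, and sizes are assumptions.
# --ratio 0:1 makes the run GET-only; --pipeline 30 matches the 30-deep
# pipelining variant; --cluster-mode targets the 40-shard OSS cluster.
memtier_benchmark \
  --server 10.0.0.1 --port 6379 --protocol redis \
  --cluster-mode \
  --threads 32 --clients 8 \
  --ratio 0:1 --pipeline 30 \
  --data-size 100 --key-maximum 1000000 \
  --hide-histogram --test-time 180
```

As noted above, the best results came from running two such processes concurrently on the same client VM.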

Test Environment

VM: AWS c6gn.16xlarge (aarch64, Arm Neoverse‑N1, 64 cores, 126 GB RAM, 1 NUMA node, Kernel 5.10).

Tags: performance, architecture, redis, benchmark, Dragonfly, in-memory cache
Written by

Architect's Tech Stack

Java backend, microservices, distributed systems, containerized programming, and more.
