Why Is Redis Slowing Down? Diagnose and Fix Common Latency Issues
This article explains the typical reasons behind Redis latency spikes—such as complex commands, big keys, concentrated expirations, memory limits, fork overhead, CPU binding, AOF settings, swap usage, and network overload—and provides practical steps and monitoring techniques to identify and resolve each problem.
Redis, as an in‑memory database, can serve on the order of 100,000 QPS per instance, yet latency can spike suddenly if its internal mechanisms are not well understood.
High‑Complexity Commands
When latency spikes, first check the slow‑log. Set the threshold and length:
<code># Record commands slower than 5 ms
CONFIG SET slowlog-log-slower-than 5000
# Keep the latest 1000 entries
CONFIG SET slowlog-max-len 1000</code>
Query recent entries:
<code>SLOWLOG GET 5</code>
Commands with O(N) complexity (e.g., SORT, SUNION, ZUNIONSTORE) or commands touching large data volumes can cause high CPU usage and latency.
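As a sketch of working with the slow log programmatically, the helper below flags entries over a threshold. It assumes the documented SLOWLOG GET reply layout (entry id, unix timestamp, duration in microseconds, then the command arguments); the sample data is illustrative:

```python
# Sketch: summarize SLOWLOG GET-style entries (id, timestamp, duration_us, args).
# Client address/name fields present in newer Redis versions are ignored here.

def slow_entries(entries, threshold_us=5000):
    """Return (command, duration_ms) for entries at or above threshold_us."""
    report = []
    for entry in entries:
        _id, _ts, duration_us, args = entry[:4]
        if duration_us >= threshold_us:
            report.append((" ".join(args), duration_us / 1000.0))
    return report

sample = [
    (12, 1700000000, 8200, ["SORT", "mylist", "ALPHA"]),
    (11, 1699999990, 900, ["GET", "counter"]),
]
print(slow_entries(sample))  # only the SORT entry exceeds 5 ms
```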
Bigkey Storage
If the slow log shows only simple SET or DEL commands, investigate big keys. Scan for them with:
<code>redis-cli -h $host -p $port --bigkeys -i 0.01</code>
Scanning may temporarily increase QPS; control the scan interval with -i. Redis 4.0+ offers lazy-free (e.g., the UNLINK command) to release big keys asynchronously, but avoiding big keys in the first place is still recommended.
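A minimal sketch of the idea behind a big-key scan, assuming you can obtain per-key sizes (in a real deployment, via MEMORY USAGE per key); the key names and the 10 KB threshold are illustrative:

```python
def find_big_keys(key_sizes, threshold_bytes=10 * 1024):
    """key_sizes: iterable of (key, size_in_bytes).
    Returns keys at or over the threshold, largest first,
    mirroring the per-type report of redis-cli --bigkeys."""
    big = [(k, s) for k, s in key_sizes if s >= threshold_bytes]
    return sorted(big, key=lambda pair: pair[1], reverse=True)

sizes = [("session:1", 512), ("feed:hot", 4 * 1024 * 1024), ("cfg", 128)]
print(find_big_keys(sizes))  # only feed:hot crosses the threshold
```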
Concentrated Expiration
Mass expiration at a fixed time can cause latency spikes that are not recorded in the slow‑log because expiration runs before command execution.
Active expiration: a timer samples up to 20 keys with TTLs every 100 ms and deletes the expired ones, repeating until the expired‑key ratio in a sample drops below 25 % or the task exceeds 25 ms.
Lazy expiration: keys are removed only when accessed.
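The active-expiration loop described above can be simulated in a few lines. This is a toy model of the sampling rule (20 keys per round, stop below a 25 % hit ratio), not Redis's actual C implementation:

```python
import random

def active_expire_cycle(keys_with_ttl, now, sample_size=20, stop_ratio=0.25):
    """Toy model: keys_with_ttl maps key -> expiry timestamp.
    Repeatedly sample keys and delete expired ones until the fraction
    of expired keys in a sample drops below stop_ratio."""
    removed = 0
    while keys_with_ttl:
        sample = random.sample(list(keys_with_ttl), min(sample_size, len(keys_with_ttl)))
        expired = [k for k in sample if keys_with_ttl[k] <= now]
        for k in expired:
            del keys_with_ttl[k]
            removed += 1
        if len(expired) / len(sample) < stop_ratio:
            break
    return removed

# If every key has already expired, every round stays above the stop
# ratio and the cycle keeps running -- the source of the latency spike.
store = {f"user:{i}": 0 for i in range(50)}
print(active_expire_cycle(store, now=1))  # all 50 keys are removed
```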
To mitigate, randomize expiration times (e.g., add a small random offset to each TTL) and watch the expired_keys metric from INFO for sudden jumps.
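A common mitigation is to add random jitter to each TTL so that keys written together do not expire together; a sketch (the base TTL and jitter window are arbitrary choices):

```python
import random

def ttl_with_jitter(base_ttl_seconds, max_jitter_seconds=300):
    """Spread expirations over a window instead of a single instant,
    e.g. use EXPIRE key ttl_with_jitter(3600) instead of EXPIRE key 3600."""
    return base_ttl_seconds + random.randint(0, max_jitter_seconds)

print(ttl_with_jitter(3600))  # somewhere between 3600 and 3900 seconds
```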
Instance Memory Limit
When maxmemory is reached, Redis evicts keys according to the configured policy, which can add latency. Common policies include:
allkeys‑lru
volatile‑lru
allkeys‑random
volatile‑random
volatile‑ttl
noeviction
allkeys‑lfu (Redis 4.0+)
volatile‑lfu (Redis 4.0+)
Choosing a policy depends on the workload; allkeys-lru or volatile-lru is often preferred.
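To build intuition for LRU eviction, here is a toy exact-LRU cache built on OrderedDict. Note that real Redis uses an approximate, sampling-based LRU rather than a full recency list, so this is an illustration of the policy, not of Redis internals:

```python
from collections import OrderedDict

class ToyLRUCache:
    """Exact LRU: evicts the least-recently-used key once maxsize is hit."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.data = OrderedDict()

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # mark as most recently used
        return self.data[key]

    def set(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # evict the LRU entry

cache = ToyLRUCache(2)
cache.set("a", 1); cache.set("b", 2)
cache.get("a")           # "a" is now most recently used
cache.set("c", 3)        # evicts "b"
print(list(cache.data))  # ['a', 'c']
```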
Fork Overhead
RDB snapshots, AOF rewrites, and replication full syncs all trigger a fork of the main process. For a large instance, the fork itself (copying page tables) can block the main thread for hundreds of milliseconds or more, especially on virtual machines, and copy-on-write under heavy writes then drives up memory usage. Check the last fork duration with:
<code>INFO</code>
Look at latest_fork_usec (duration of the most recent fork, in microseconds). Schedule backups during low-traffic periods, consider disabling AOF if some data loss is acceptable, and run persistence on replica nodes.
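INFO returns a flat block of key:value lines; a sketch that pulls out latest_fork_usec and converts it to milliseconds (the sample payload below is illustrative):

```python
def latest_fork_ms(info_text):
    """Parse Redis INFO output and return latest_fork_usec in milliseconds."""
    for line in info_text.splitlines():
        if line.startswith("latest_fork_usec:"):
            return int(line.split(":", 1)[1]) / 1000.0
    return None  # field absent from the payload

sample = "rdb_bgsave_in_progress:0\r\nlatest_fork_usec:84213\r\n"
print(latest_fork_ms(sample))  # 84.213 ms -- worth investigating if this grows
```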
CPU Binding
Pinning the Redis process to a single CPU core forces it to compete with its forked child (RDB save or AOF rewrite) for that core, increasing latency. Avoid binding Redis to one core when persistence is enabled.
Improper AOF Configuration
AOF offers three fsync policies:
appendfsync always: highest durability, highest latency.
appendfsync everysec: balances durability and performance (recommended).
appendfsync no: leaves flushing to the OS; lowest latency, lowest durability.
For most workloads, everysec is the best compromise.
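The trade-off between the three policies can be seen by counting fsync calls; a toy model (write timestamps in seconds, not a real AOF):

```python
def fsync_count(write_times, policy):
    """Toy model of AOF fsync policies: 'always' syncs once per write,
    'everysec' at most once per second, 'no' never (the OS decides when)."""
    if policy == "always":
        return len(write_times)
    if policy == "everysec":
        return len({int(t) for t in write_times})  # one sync per distinct second
    if policy == "no":
        return 0
    raise ValueError(policy)

writes = [0.1, 0.2, 0.9, 1.3, 1.4, 2.0]  # six writes in ~2 seconds
print(fsync_count(writes, "always"),   # pays one fsync per write
      fsync_count(writes, "everysec"), # collapses them to roughly 1/sec
      fsync_count(writes, "no"))
```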
Swap Usage
If the host starts swapping, Redis latency can jump to hundreds of milliseconds. Monitor memory and swap usage; once Redis memory has been swapped out, free memory on the host or fail over and restart the instance.
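One way to check whether a Redis process is being swapped is to sum the Swap: lines in its /proc/&lt;pid&gt;/smaps file; a sketch that parses that format (the sample text is illustrative):

```python
def swapped_kb(smaps_text):
    """Sum the Swap: entries (in kB) from /proc/<pid>/smaps-style text."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith("Swap:"):
            total += int(line.split()[1])
    return total

sample = "Size: 4 kB\nSwap: 0 kB\nSize: 2048 kB\nSwap: 512 kB\n"
print(swapped_kb(sample))  # 512 kB of these mappings are swapped out
```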
Network Card Overload
High network traffic can cause packet loss and increased latency. Monitor NIC utilization, set alerts, and scale or migrate instances when bandwidth is saturated.
Summary
The article outlines common Redis latency sources—from command complexity and big keys to memory limits, eviction policies, fork overhead, CPU binding, AOF settings, swap, and network saturation—and provides diagnostic commands and best‑practice recommendations for developers and DBAs.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.