How to Diagnose and Fix Redis Slow Execution: Practical Tips and Tools
This article explains how to detect Redis performance degradation and how to monitor latency and slow logs, then walks through concrete optimization techniques: network latency reduction, command tuning, RDB/AOF handling, swap management, expiration strategies, and big-key mitigation to keep Redis fast and reliable.
1 Introduction
Redis is a crucial auxiliary component in many business systems, dramatically improving efficiency and reducing pressure on backend storage services. However, if Redis itself experiences request latency, it can cause a catastrophic cascade affecting the entire business chain. This article analyzes various reasons for Redis slowdown and offers detailed solutions.
2 Detecting and Monitoring Redis Slow Execution
2.1 How to Determine Redis Slowness
Benchmark tests show that on high-end hardware (e.g., 8 cores, 16 GB RAM) Redis can handle up to 120,000 QPS with up to 10,000 connections, and still maintain 50,000 QPS with over 60,000 connections. When latency jumps from the microsecond range into milliseconds or seconds, Redis performance has degraded and needs optimization.
2.2 Baseline Latency Test
The redis-cli option --intrinsic-latency measures the maximum latency over a specified period, providing a baseline that is unaffected by network factors. Example command:
<code>redis-cli --intrinsic-latency 120</code>
Running this command reports the highest latency observed during the 120-second window (e.g., 1665 µs ≈ 1.6 ms). Run it directly on the Redis server itself; connecting from a remote client with -h host -p port would mix network latency into the measurement.
2.3 Monitoring Slow Commands
The ideal command complexity is O(1) or O(log N). Most Redis commands achieve this, but some (e.g., HGETALL, SMEMBERS, SORT, LREM, ZREMRANGEBYSCORE) are O(N). Detect slow commands with the latency monitoring feature (the LATENCY commands, enabled by setting latency-monitor-threshold) or by inspecting the Redis slow log.
2.4 Monitoring the Slow Log
The slow log records operations whose execution time exceeds a configurable threshold (slowlog-log-slower-than, in microseconds; default 10,000 µs = 10 ms). Adjust the threshold with:
<code>redis-cli CONFIG SET slowlog-log-slower-than 3330</code>
Each entry records the command ID, timestamp, execution time (µs), and the command itself. Use SLOWLOG GET [count] to retrieve recent entries.
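A small helper (hypothetical, but assuming the entry layout described above: ID, timestamp, duration in microseconds, then the command and its arguments) can make raw slow-log entries readable:

```python
def format_slowlog_entry(entry):
    """Render one slow-log entry: [id, unix_timestamp, micros, command_args]."""
    entry_id, ts, micros, args = entry[0], entry[1], entry[2], entry[3]
    cmd = " ".join(a.decode() if isinstance(a, bytes) else str(a) for a in args)
    return f"#{entry_id} at {ts}: {micros} us -> {cmd}"

# Example entry shaped like a SLOWLOG GET reply (values are illustrative)
sample = [14, 1309448221, 15388, [b"SORT", b"mylist"]]
print(format_slowlog_entry(sample))  # #14 at 1309448221: 15388 us -> SORT mylist
```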
3 Solving Redis Slow Execution Issues
Redis processes all read/write operations on a single thread; blocking this thread severely degrades performance. The following sections address common causes.
3.1 Network Communication Latency
High network latency (e.g., across data centers) increases round-trip time (RTT). Over TCP/IP on a 1 Gbps link a single round trip typically costs around 200 µs, while Unix domain sockets on the same host are considerably cheaper. Batching commands (pipelining, or multi-key commands such as MGET and MSET) amortizes the RTT across many operations and can significantly improve throughput.
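One simple way to apply this is to batch a large key list so each MGET is one round trip of bounded size rather than thousands of individual GET round trips. A minimal sketch (pure Python; the key names and batch size are illustrative):

```python
def chunk_keys(keys, batch_size=500):
    """Split a key list into batches; each batch becomes one MGET call,
    so N keys cost ceil(N / batch_size) round trips instead of N."""
    return [keys[i:i + batch_size] for i in range(0, len(keys), batch_size)]

keys = [f"user:{i}" for i in range(1200)]
batches = chunk_keys(keys)
print(len(batches), [len(b) for b in batches])  # 3 [500, 500, 200]
```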
3.2 Slow Command‑Induced Latency
Replace high-complexity commands with more efficient alternatives, avoid pulling large result sets in a single query, and iterate incrementally with SCAN, SSCAN, HSCAN, and ZSCAN. Disable the KEYS command in production.
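The incremental pattern is the same for all the SCAN-family commands: keep calling with the returned cursor until the server hands back cursor 0. A sketch with the server stubbed out (the stub's pages and cursor values are made up for illustration):

```python
def scan_all(scan_fn, match=None, count=100):
    """Drain a cursor-based SCAN. scan_fn mimics a client scan call:
    (cursor, match, count) -> (next_cursor, keys)."""
    cursor, keys = 0, []
    while True:
        cursor, batch = scan_fn(cursor, match, count)
        keys.extend(batch)
        if cursor == 0:  # cursor 0 means the iteration is complete
            return keys

# Stub standing in for a Redis server: two pages of keys, then cursor 0.
pages = {0: (7, ["k1", "k2"]), 7: (0, ["k3"])}
def fake_scan(cursor, match, count):
    return pages[cursor]

print(scan_all(fake_scan))  # ['k1', 'k2', 'k3']
```

Each call does a small, bounded amount of work, so the main thread is never blocked the way a single KEYS or SMEMBERS over a huge collection would block it.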
3.3 Fork‑Generated RDB Delay
Generating an RDB snapshot requires forking a child process. The fork itself blocks the main thread, and the pause grows with instance memory because the page tables must be copied; after the fork, copy-on-write (COW) means heavy write traffic can sharply increase memory usage. Keep individual instances modest in size (roughly 2-4 GB) so forks stay fast and replica resynchronization stays cheap.
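A common mitigation is to make automatic snapshots less frequent. A redis.conf sketch (the numbers are illustrative, not a recommendation; tune them to your durability needs):

```
# Snapshot only when at least 1000 keys changed within 900 seconds,
# instead of several more aggressive save points, so forks happen less often.
save 900 1000
```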
3.4 AOF File System or Large Memory Page Issues
Linux transparent huge pages (2 MB pages) amplify copy-on-write costs after a fork (each modified page copies 2 MB instead of 4 KB) and can slow AOF rewrites. Disable huge pages with:
<code>echo never > /sys/kernel/mm/transparent_hugepage/enabled</code>
3.5 Operating System Swap Problems
Swap is used when physical memory is insufficient, dramatically slowing Redis due to disk I/O. Monitor swap usage via /proc/[pid]/smaps and mitigate by adding memory, isolating Redis on dedicated hosts, or scaling the cluster.
<code># Get Redis process ID
redis-cli info | grep process_id
# Navigate to /proc directory
cd /proc/12893
# Check swap usage
cat smaps | egrep '^(Swap|Size)'
</code>
3.6 AOF Write-Back Strategy "always"
When AOF is set to "always", every write is synchronously flushed to disk, severely impacting performance. Consider switching to "everysec" or "no" if occasional data loss is acceptable, or use RDB snapshots for less intrusive persistence.
Adjust AOF write‑back policy
Use RDB persistence
Upgrade hardware (e.g., faster disks)
Introduce additional caching layers
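The first two options above can be expressed directly in redis.conf; a sketch (values shown are the common middle-ground choice, not the only valid one):

```
appendonly yes
# Flush the AOF buffer once per second: bounds data loss to roughly one
# second while avoiding the per-write fsync cost of "always".
appendfsync everysec
```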
3.7 Expiration‑Based Eviction
Redis removes expired keys lazily (when a key is accessed) and actively via a periodic cycle whose frequency is controlled by the hz setting (default 10, i.e., the cycle runs every 100 ms). When a large proportion of keys expires at the same moment, a cycle can spend excessive time deleting them and block the main thread. Add random jitter to expiration times to spread deletions out.
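The jitter idea is trivial to implement client-side; a minimal sketch (the base TTL and spread are arbitrary example values):

```python
import random

def ttl_with_jitter(base_ttl, spread=300):
    """Return base_ttl plus a random offset (in seconds) so keys written
    together do not all expire in the same 100 ms expiration cycle."""
    return base_ttl + random.randint(0, spread)

ttl = ttl_with_jitter(3600)
print(3600 <= ttl <= 3900)  # True
```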
3.8 Optimizing Big Keys
Big keys (huge strings, long lists, massive sorted sets or hashes) can cause out-of-memory errors, uneven memory distribution across cluster nodes, and long blocking deletions. Mitigations include splitting big keys into smaller ones, deleting asynchronously with UNLINK, choosing appropriate data structures, capping the number of elements per key, enabling compression, and adding random expiration offsets.
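Splitting is usually done by hashing each field into one of N smaller sub-keys. A sketch of the idea (the key names, CRC32 choice, and bucket count are illustrative assumptions, not a prescribed scheme):

```python
import zlib

def bucket_key(big_key, field, buckets=1024):
    """Map one field of an oversized hash to a smaller sub-hash, e.g.
    "user:profile" -> "user:profile:<n>", spreading data (and, in a
    cluster, slots) across many small keys. Deterministic per field."""
    n = zlib.crc32(field.encode()) % buckets
    return f"{big_key}:{n}"

k = bucket_key("user:profile", "alice")
print(k.startswith("user:profile:"))  # True
```

Reads and writes for a field then go to HGET/HSET on the bucketed key, so no single key ever grows large enough to block deletion or skew memory.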
3.9 Other Causes
High‑complexity commands or full scans
Memory reaching maxmemory
Short‑lived client connections
Large RDB/AOF rewrite times
Insufficient host memory leading to swap
Improper CPU binding
Transparent huge pages enabled
Network interface overload
Inter‑instance or internal data transfer bottlenecks
Multi‑CPU/core architecture issues
SQL‑like blocking queries
4 Summary
Optimization steps:
Obtain Redis baseline latency
Enable slow‑command monitoring and analyze problematic commands
Activate slow‑log for timeout analysis
Address common Redis slow-execution causes one by one using the techniques in Section 3.