Understanding Redis Data Skew and Hotkey Detection with JD Open‑Source hotkey Solution

This article explains Redis data skew, its causes, and its impact. It classifies skew into data-volume skew and access skew, presents mitigation strategies, and walks through the source code of JD's open‑source hotkey framework (client, worker, and dashboard components), which detects and handles hot keys in distributed cache clusters.

JD Tech

Dec 1, 2022

The article introduces Redis data skew, defining it as the uneven distribution of cached data across nodes caused by poor load‑balancing, which can lead to increased latency, memory exhaustion, and node crashes.

It classifies data skew into volume skew (write‑skew) and access skew (read‑skew or hot‑key problems), describing scenarios such as big‑key accumulation, uneven slot allocation, and hash‑tag misuse that exacerbate the issue.
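To see why hash‑tag misuse concentrates data, it helps to look at how Redis Cluster maps a key to one of its 16384 slots: it takes CRC16 of the key, or of only the text inside the first non‑empty `{...}` if the key has one. The following is a minimal sketch of that mapping; the class and method names (`KeySlot`, `hashedPart`, `slot`) are illustrative and not taken from the article.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of Redis Cluster key-to-slot mapping: CRC16 (XMODEM) mod 16384, with hash tags. */
final class KeySlot {
    private KeySlot() {}

    /** CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, no bit reflection. */
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    /** Returns the text that gets hashed: the first non-empty {...} section, else the whole key. */
    static String hashedPart(String key) {
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                return key.substring(open + 1, close);
            }
        }
        return key;
    }

    /** Slot in the range 0..16383 that the key belongs to. */
    static int slot(String key) {
        return crc16(hashedPart(key).getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

Every key that shares a tag, such as `{user1000}.following` and `{user1000}.followers`, lands in the same slot and therefore on the same node. That is useful for multi‑key operations, but a tag shared by too many keys piles their data onto one instance.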

Mitigation methods are detailed, including avoiding large values in a single key, splitting big collections across several keys, using hash tags sparingly, and rebalancing hash slots: CLUSTER SLOTS shows the current slot allocation, while CLUSTER SETSLOT, CLUSTER GETKEYSINSLOT, and MIGRATE move a slot and its keys to another node.
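Splitting a big collection can be sketched as follows: instead of storing every field under one hash key, each field is routed to one of a fixed number of smaller keys. This is only an illustration under assumed names (`BigKeySharding`, `shardKey`, the `:` separator) and an assumed choice of CRC32 as the routing hash; the article does not prescribe a particular scheme.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Sketch: route each field of a big hash to one of shardCount smaller keys. */
final class BigKeySharding {
    private BigKeySharding() {}

    /**
     * Returns the key of the shard that should hold the given field,
     * for example "cart:42:7". The same field always maps to the same shard.
     */
    static String shardKey(String baseKey, String field, int shardCount) {
        if (shardCount <= 0) {
            throw new IllegalArgumentException("shardCount must be positive");
        }
        CRC32 crc = new CRC32();
        crc.update(field.getBytes(StandardCharsets.UTF_8));
        long shard = crc.getValue() % shardCount; // getValue() is non-negative
        return baseKey + ":" + shard;
    }
}
```

A caller would then issue, for instance, `HSET shardKey("cart:42", "sku-1", 16) sku-1 <value>` instead of writing to `cart:42` directly. Because the shard keys differ, a cluster can place them in different slots, which spreads both memory and traffic.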

The hot‑key problem is examined, outlining causes such as sudden traffic spikes on popular items, and consequences such as an overloaded cache node and, once the cache can no longer absorb the traffic, an overloaded backend database.

Two primary solutions are presented: duplicating hot keys with random suffixes to spread load, and a local cache with dynamic detection that routes requests through an SLB to proxies which maintain per‑node LRU caches.
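The first approach, duplicating a hot key under several suffixed names, might look like the sketch below. The `#` separator, the numeric suffix range, and the names `HotKeyReplicas`, `allCopies`, and `pickCopy` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Sketch: spread reads of one hot key across several suffixed copies. */
final class HotKeyReplicas {
    private HotKeyReplicas() {}

    /** All copy names for a key; a writer must update every one of them. */
    static List<String> allCopies(String key, int copies) {
        List<String> names = new ArrayList<>(copies);
        for (int i = 0; i < copies; i++) {
            names.add(key + "#" + i);
        }
        return names;
    }

    /** A reader picks one copy at random, so load spreads over the copies. */
    static String pickCopy(String key, int copies) {
        return key + "#" + ThreadLocalRandom.current().nextInt(copies);
    }
}
```

The trade‑off is on the write path: every update has to reach all copies, so this fits keys that are read far more often than they are written.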

JD's open‑source hotkey framework is then dissected. The architecture consists of a client library, a worker cluster, and a dashboard. The client counts key accesses and reports candidate keys to workers via Netty; workers aggregate the counts from all clients, decide which keys are hot, and push those keys to both the clients' local caches and the dashboard.

Key client components include ClientStarter (builder pattern), NettyKeyPusher, and the TurnKeyCollector / TurnCountCollector which use two ConcurrentHashMap instances and an AtomicLong to achieve lock‑free reads and writes.
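The double‑map idea can be sketched as follows. This is not the framework's actual code: the class `DoubleBufferCounter` and its methods are hypothetical, and only the ingredients named above (two `ConcurrentHashMap` instances and an `AtomicLong` that selects the active one) come from the article.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: two maps plus a turn counter, so writers and the reporter never share a lock. */
final class DoubleBufferCounter {
    private final ConcurrentHashMap<String, Long> map0 = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Long> map1 = new ConcurrentHashMap<>();
    private final AtomicLong turn = new AtomicLong(0);

    /** Called on every key access: counts into whichever map is currently active. */
    void collect(String key) {
        ConcurrentHashMap<String, Long> active = (turn.get() % 2 == 0) ? map0 : map1;
        active.merge(key, 1L, Long::sum);
    }

    /**
     * Called by the periodic reporter: flips the turn so new writes go to the
     * other map, then copies and clears the map that was active until now.
     */
    Map<String, Long> swapAndDrain() {
        long previous = turn.getAndIncrement();
        ConcurrentHashMap<String, Long> drained = (previous % 2 == 0) ? map0 : map1;
        Map<String, Long> snapshot = new HashMap<>(drained);
        drained.clear();
        return snapshot;
    }
}
```

A writer that read the turn just before the flip can still touch the map being drained, so a count may occasionally slip into a later batch or be dropped. In this sketch that is accepted as the price of avoiding explicit locks, on the assumption that hot‑key detection only needs approximate counts.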

Clients run scheduled tasks to push collected key batches to workers, retry worker connections, and listen for rule changes via Etcd. Workers employ a producer‑consumer model with a LinkedBlockingQueue for incoming key events and a sliding‑window algorithm to decide when a key becomes hot.
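The combination of a queue and a sliding window can be sketched like this. Everything here is illustrative rather than the worker's real code: the class names, the slice layout, and the threshold parameter are assumptions, and timestamps are passed in explicitly so the behavior is easy to follow.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Sketch: a window of fixed-length time slices; a key is hot when the window sum reaches the threshold. */
final class SlidingWindowCounter {
    private final long[] counts;
    private final long[] sliceStart; // start time (ms) of the slice each slot currently holds
    private final long sliceMillis;
    private final long threshold;

    SlidingWindowCounter(int slices, long sliceMillis, long threshold) {
        this.counts = new long[slices];
        this.sliceStart = new long[slices];
        Arrays.fill(sliceStart, -1L);
        this.sliceMillis = sliceMillis;
        this.threshold = threshold;
    }

    /** Adds delta at time nowMillis (non-negative) and reports whether the window sum reached the threshold. */
    synchronized boolean addAndCheck(long nowMillis, long delta) {
        long start = (nowMillis / sliceMillis) * sliceMillis;
        int slot = (int) ((nowMillis / sliceMillis) % counts.length);
        if (sliceStart[slot] != start) { // slot still holds an older slice: reuse it
            sliceStart[slot] = start;
            counts[slot] = 0;
        }
        counts[slot] += delta;
        long windowStart = start - (counts.length - 1) * sliceMillis;
        long sum = 0;
        for (int i = 0; i < counts.length; i++) {
            if (sliceStart[i] >= windowStart) {
                sum += counts[i]; // only slices inside the current window count
            }
        }
        return sum >= threshold;
    }
}

/** Sketch: producers enqueue key events; a consumer drains them into per-key windows. */
final class HotKeyDetector {
    private static final class KeyEvent {
        final String key; final long count; final long timestampMillis;
        KeyEvent(String key, long count, long timestampMillis) {
            this.key = key; this.count = count; this.timestampMillis = timestampMillis;
        }
    }

    private final BlockingQueue<KeyEvent> queue = new LinkedBlockingQueue<>();
    private final Map<String, SlidingWindowCounter> windows = new HashMap<>();
    private final int slices;
    private final long sliceMillis;
    private final long threshold;

    HotKeyDetector(int slices, long sliceMillis, long threshold) {
        this.slices = slices; this.sliceMillis = sliceMillis; this.threshold = threshold;
    }

    /** Producer side: called when a batch of counts for a key arrives. */
    boolean offer(String key, long count, long timestampMillis) {
        return queue.offer(new KeyEvent(key, count, timestampMillis));
    }

    /** Consumer side: processes everything queued so far and returns keys that became hot. */
    List<String> drainOnce() {
        List<String> hot = new ArrayList<>();
        KeyEvent e;
        while ((e = queue.poll()) != null) {
            SlidingWindowCounter w = windows.computeIfAbsent(
                    e.key, k -> new SlidingWindowCounter(slices, sliceMillis, threshold));
            if (w.addAndCheck(e.timestampMillis, e.count) && !hot.contains(e.key)) {
                hot.add(e.key);
            }
        }
        return hot;
    }
}
```

In a long‑running worker the consumer would typically block on `queue.take()` in its own thread; `drainOnce()` polls instead so the sketch stays deterministic.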

Configuration and coordination are handled through Etcd paths for rules, hot‑key entries, worker registration, and dashboard discovery. Periodic tasks keep workers registered as alive, upload client counts, and fetch dashboard IPs.

Code snippets illustrate the initialization of the client starter, the Netty pipeline setup, the hot‑key push logic, and the worker's event handling.

In summary, the article provides both theoretical insight into Redis data skew and practical guidance through a fully featured hot‑key detection system, complete with architectural diagrams, implementation details, and best‑practice recommendations.
