
Transparent Multilevel Cache (TMC): Architecture, Hotspot Detection, and Local Cache Implementation

This article introduces Transparent Multilevel Cache (TMC), a caching solution that adds hotspot detection and local caching on top of an existing distributed cache. It covers the three‑layer architecture, the transparent integration with Java clients, the real‑time hotspot discovery process, and the gains observed in high‑traffic e‑commerce scenarios.

IT Architects Alliance

TMC (Transparent Multilevel Cache) is a cache solution developed by Youzan PaaS to address hotspot access problems in large‑scale e‑commerce applications. It builds on an existing distributed cache (e.g., CodisProxy + Redis, zanKV) and adds three capabilities: application‑level hotspot detection, a local cache, and cache‑hit statistics.

Why TMC is needed: During marketing events such as flash sales, a small number of hot keys generate most of the cache traffic, which can overwhelm the distributed cache layer and threaten system stability. TMC discovers these hot keys automatically and serves requests for them from a local cache in the application layer, so they never reach the cache cluster.

Architecture: TMC consists of three layers: Storage (key‑value stores such as Codis, zanKV, and Aerospike), Proxy (unified cache entry point and routing), and Application (client SDK with hotspot detection and local cache). This article focuses on the Application layer.

Transparent Integration: Java services can use either spring.data.redis with RedisTemplate or youzan.framework.redis with RedisClient. Both ultimately obtain a Jedis instance from a JedisPool. TMC modifies the native JedisPool and Jedis classes so that each cache call goes through Hermes‑SDK first. Hermes‑SDK supplies hotspot detection and local caching, so applications gain both without changing their own code.
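As a rough illustration of the read path this integration implies, the sketch below wraps a remote lookup with a hot‑key check. It is not TMC's actual code: `HotKeyAwareClient`, `markHot`, and the `remoteGet` function (standing in for a call such as `jedis.get(key)`) are hypothetical names introduced here.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical stand-in for a patched client: it consults a hot-key set
// before falling through to the remote cache. Not TMC's real implementation.
class HotKeyAwareClient {
    private final Set<String> hotKeys = ConcurrentHashMap.newKeySet();
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    // Assumed remote lookup, e.g. key -> jedis.get(key) in a real service.
    private final Function<String, String> remoteGet;

    HotKeyAwareClient(Function<String, String> remoteGet) {
        this.remoteGet = remoteGet;
    }

    // Called when the server side pushes a newly detected hot key.
    void markHot(String key) {
        hotKeys.add(key);
    }

    String get(String key) {
        if (!hotKeys.contains(key)) {
            return remoteGet.apply(key); // cold key: always go to the cache cluster
        }
        String cached = localCache.get(key);
        if (cached != null) {
            return cached; // hot key already held locally
        }
        String value = remoteGet.apply(key);
        if (value != null) {
            localCache.put(key, value); // first read of a hot key fills the local cache
        }
        return value;
    }
}
```

Because callers only see `get`, the hot‑key logic stays invisible to business code, which is the property the article calls transparent.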

Application‑Layer Modules: The application layer is made up of Jedis‑Client, which exposes the standard Jedis interface; Hermes‑SDK, which implements hotspot detection and the local cache; the Hermes Server Cluster, which receives access events, performs hotspot analysis, and pushes hot keys; the Cache Cluster (proxy + storage); and infrastructure components (etcd, Apollo configuration).

Basic Workflow: When a key is requested, Jedis‑Client asks Hermes‑SDK whether the key is hot. If it is, the value is returned from the local cache; otherwise the request goes to the cache cluster.

Key updates (set/del/expire) trigger Hermes‑SDK.invalid(), which invalidates the local entry and broadcasts the event via etcd to other nodes.

Hotspot discovery: Hermes‑SDK reports access events via rsyslog to Kafka. The server cluster consumes these events, maintains a 30‑second sliding window (10 time slices of 3 seconds each) per key, aggregates heat, stores the top‑N hot keys in Redis, and pushes the list back to the SDK.

Configuration (e.g., thresholds, black/white lists, etcd addresses) is read from Apollo.
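The invalidation step of the workflow above can be sketched as follows. This is a simplified in‑memory model, assuming a broadcast channel in place of etcd: `InvalidationBus` and `CacheNode` are hypothetical names, and only `invalid` mirrors a method named in the article.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// In-memory stand-in for the etcd channel (assumption for illustration only).
class InvalidationBus {
    private final List<CacheNode> nodes = new CopyOnWriteArrayList<>();

    void register(CacheNode node) {
        nodes.add(node);
    }

    // Deliver the invalidation to every node except the one that raised it.
    void broadcast(CacheNode source, String key) {
        for (CacheNode node : nodes) {
            if (node != source) {
                node.evictLocal(key);
            }
        }
    }
}

// One application instance holding its own local cache.
class CacheNode {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final InvalidationBus bus;

    CacheNode(InvalidationBus bus) {
        this.bus = bus;
    }

    void putLocal(String key, String value) {
        local.put(key, value);
    }

    String getLocal(String key) {
        return local.get(key);
    }

    void evictLocal(String key) {
        local.remove(key);
    }

    // Called on set/del/expire: drop the local copy first, then notify peers.
    void invalid(String key) {
        evictLocal(key);
        bus.broadcast(this, key);
    }
}
```

Evicting locally before broadcasting means the node that performed the write never serves its own stale copy, while peers converge once the event reaches them.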

Stability and Consistency: Access events are reported asynchronously through rsyslog, communication runs on isolated threads, and the local cache is LRU‑bounded (≤64 MB), so the impact on business threads stays low. When a hot key is updated, the updating node invalidates its local entry immediately, which keeps that node strongly consistent; the invalidation then propagates via etcd, so the cluster as a whole is eventually consistent.
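A minimal sketch of an LRU‑bounded local cache, using the JDK's `LinkedHashMap` in access order. It makes one simplifying assumption: the bound is a maximum entry count rather than the 64 MB byte budget mentioned above, since tracking byte size needs per‑value accounting. `LruCache` is a name chosen here, not a TMC class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache bounded by entry count (assumption: TMC bounds by memory size instead).
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Invoked by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

`LinkedHashMap` is not thread‑safe, so a shared instance would need external synchronization, for example `Collections.synchronizedMap(new LruCache<>(1000))`.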

Hotspot Detection Process: Detection runs in four steps.

Data collection: the SDK sends key‑access events (appName, uniqueKey, sendTime, weight) to Kafka.

Heat sliding window: each key has a 10‑slot time wheel. Each slot counts accesses in a 3‑second interval, so the wheel represents a 30‑second window.

Heat aggregation: the sum of the slots gives the key's total heat, which is stored in a sorted set in Redis.

Hotspot detection: the server periodically selects the top‑N keys whose heat exceeds a threshold and pushes them to the SDK.
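The time wheel and the top‑N selection described above might look like the sketch below. It keeps everything in memory, whereas the article stores aggregated heat in a Redis sorted set. `HeatWindow`, `HotKeySelector`, and their methods are hypothetical names; the slot count and slot length follow the article's 10 × 3 seconds, and timestamps are assumed to be non‑negative milliseconds.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Per-key time wheel: 10 slots of 3 seconds, covering a 30-second window.
class HeatWindow {
    static final int SLOTS = 10;
    static final long SLOT_MILLIS = 3_000L;

    private final long[] counts = new long[SLOTS];
    // Absolute slice number currently held by each slot; -1 means never used.
    private final long[] sliceOfSlot = new long[SLOTS];

    HeatWindow() {
        Arrays.fill(sliceOfSlot, -1L);
    }

    void record(long timestampMillis, long weight) {
        long slice = timestampMillis / SLOT_MILLIS;
        int idx = (int) (slice % SLOTS);
        if (slice < sliceOfSlot[idx]) {
            return; // event older than the slot's current slice: drop it
        }
        if (sliceOfSlot[idx] != slice) {
            sliceOfSlot[idx] = slice; // slot is being reused for a newer slice
            counts[idx] = 0;
        }
        counts[idx] += weight;
    }

    // Sum of all slots whose slice still falls inside the window ending at nowMillis.
    long heat(long nowMillis) {
        long current = nowMillis / SLOT_MILLIS;
        long sum = 0;
        for (int i = 0; i < SLOTS; i++) {
            long slice = sliceOfSlot[i];
            if (slice >= 0 && slice <= current && current - slice < SLOTS) {
                sum += counts[i];
            }
        }
        return sum;
    }
}

class HotKeySelector {
    // Keys whose heat exceeds the threshold, hottest first, at most n of them.
    static List<String> topHotKeys(Map<String, Long> heatByKey, long threshold, int n) {
        return heatByKey.entrySet().stream()
                .filter(e -> e.getValue() > threshold)
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(n)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

Tagging each slot with its slice number avoids a background thread that clears expired slots: stale slots are skipped when heat is read and reset when the wheel wraps around.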

Real‑World Results: In a Kuaishou live‑stream marketing event, TMC achieved a local cache hit rate of roughly 80%, significantly reducing load on the cache cluster and improving request latency. Similar improvements were observed during Double‑11 promotions across multiple core services.

Future Outlook: TMC already serves product, logistics, inventory, marketing, user, and gateway services, with more applications being onboarded. Configurable hotspot thresholds and black/white lists allow tuning per business need.
