Transparent Multilevel Cache (TMC): Architecture, Hotspot Detection, and Local Cache for High‑Performance Applications
This article introduces Transparent Multilevel Cache (TMC), a Youzan PaaS solution that adds application‑level hotspot detection, local caching, and hit‑rate statistics to distributed caches. It explains TMC's architecture, workflow, and consistency guarantees, and reports the performance improvements observed during high‑traffic events.
What is TMC
Transparent Multilevel Cache (TMC) is a comprehensive caching solution developed by the Youzan PaaS team. It provides application‑level hotspot detection, local caching, and hit‑rate statistics on top of a generic distributed cache such as CodisProxy+Redis or the self‑developed zanKV.
Why TMC is needed
During marketing activities such as flash sales, a small number of hot keys generate massive request traffic that can overwhelm the distributed cache and saturate network bandwidth, threatening service stability. TMC automatically discovers these hotspots and serves requests for them from an application‑level local cache, before they reach the distributed cache.
Problems with traditional multilevel cache
Key challenges include hotspot detection, data consistency, effectiveness verification, and transparent integration.
Overall Architecture
The architecture consists of three layers: Storage layer (kv stores such as Codis, zanKV, Aerospike), Proxy layer (unified cache entry and routing), and Application layer (client library with built‑in hotspot detection and local cache).
Application‑layer Local Cache
Transparent Integration
Java services can use either spring.data.redis (RedisTemplate) or youzan.framework.redis (RedisClient). Regardless of the choice, JedisPool creates a Jedis object that talks to the proxy layer. TMC modifies JedisPool to initialise the Hermes‑SDK, which performs hotspot detection and local caching before contacting the cache cluster, requiring only a specific jedis‑jar version and no code changes.
Module Breakdown
Jedis‑Client: entry point for Java applications, with an API identical to native Jedis.
Hermes‑SDK: SDK that implements hotspot detection and local caching.
Hermes Server Cluster: collects access events, performs hotspot detection, and pushes hot keys to SDKs.
Cache Cluster: distributed cache service composed of proxy and storage layers.
Infrastructure: etcd cluster and Apollo configuration center for push and unified configuration.
Basic Workflow
Key Retrieval
Jedis‑Client asks Hermes‑SDK whether a key is hot.
If hot, value is returned from the local cache.
If not hot, the request is forwarded to the cache cluster.
Each access event is asynchronously reported to Hermes‑Server for hotspot analysis.
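The retrieval steps above can be sketched in Java as follows. This is a minimal illustration, not TMC's actual code: the class `HotAwareReader` and its members are hypothetical names, the cache cluster is represented by a plain function, and the reporter is a plain callback (the source reports events asynchronously). Loading a hot key from the cluster on a local miss is also an assumption of this sketch.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical sketch of the read path; none of these names come from TMC itself.
final class HotAwareReader {
    private final Set<String> hotKeys = ConcurrentHashMap.newKeySet();
    private final Map<String, String> localCache = new ConcurrentHashMap<>();
    private final Function<String, String> remoteGet;  // stands in for the cache cluster
    private final Consumer<String> accessReporter;     // stands in for event reporting

    HotAwareReader(Function<String, String> remoteGet, Consumer<String> accessReporter) {
        this.remoteGet = remoteGet;
        this.accessReporter = accessReporter;
    }

    // In TMC the hot-key list is pushed by the server; here it is set by hand.
    void markHot(String key) {
        hotKeys.add(key);
    }

    String get(String key) {
        try {
            if (!hotKeys.contains(key)) {
                // Not hot: always go to the cache cluster.
                return remoteGet.apply(key);
            }
            String cached = localCache.get(key);
            if (cached != null) {
                return cached; // hot and present locally
            }
            // Assumption: a hot key missing locally is loaded once and kept.
            String value = remoteGet.apply(key);
            if (value != null) {
                localCache.put(key, value);
            }
            return value;
        } finally {
            // Every access is reported, whether it hit the local cache or not.
            accessReporter.accept(key);
        }
    }
}
```

In a real client the reporter would hand the event to a queue or log pipeline so that the read path never waits on it.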
Key Expiration
When set/del/expire is called, Jedis‑Client notifies Hermes‑SDK.
For hot keys, the local cache entry is invalidated immediately and the invalidation event is broadcast via etcd to other SDK instances.
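The invalidation flow can be illustrated with a small in-memory sketch. The names `InvalidationBus` and `LocalHotCache` are hypothetical, and the bus is only a stand-in for the etcd broadcast: a real deployment would deliver the event over the network, so other instances would see it with some delay.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// In-memory stand-in for the etcd broadcast channel (hypothetical).
final class InvalidationBus {
    private final List<LocalHotCache> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(LocalHotCache cache) {
        subscribers.add(cache);
    }

    // Deliver the invalidation to every instance except the one that sent it.
    void publish(String key, LocalHotCache origin) {
        for (LocalHotCache cache : subscribers) {
            if (cache != origin) {
                cache.evict(key);
            }
        }
    }
}

// One SDK instance's local cache (hypothetical).
final class LocalHotCache {
    private final Map<String, String> entries = new ConcurrentHashMap<>();
    private final InvalidationBus bus;
    private final Set<String> hotKeys; // hot-key list, assumed to be shared by all instances

    LocalHotCache(InvalidationBus bus, Set<String> hotKeys) {
        this.bus = bus;
        this.hotKeys = hotKeys;
    }

    void put(String key, String value) {
        entries.put(key, value);
    }

    String get(String key) {
        return entries.get(key);
    }

    void evict(String key) {
        entries.remove(key);
    }

    // Called when the client executes set/del/expire on a key.
    void onKeyChanged(String key) {
        if (!hotKeys.contains(key)) {
            return; // non-hot keys are never cached locally, nothing to invalidate
        }
        entries.remove(key);    // invalidate locally first
        bus.publish(key, this); // then notify the other instances
    }
}
```

Invalidating the local entry before publishing means the instance that performed the write never serves its own stale value.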
Hotspot Discovery
Hermes‑Server collects access events and maintains, for each key, a sliding window of 10 slots of 3 s each, so a key's heat reflects roughly its last 30 s of traffic.
Every 3 s a mapping task aggregates heat, stores Map<appName,Map<uniqueKey,heat>> in memory, and writes aggregated scores to Redis.
Top‑N hot keys exceeding a threshold are pushed to SDKs.
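A sliding window of this shape can be sketched as a small ring of counters. The code below is an illustration under stated assumptions, not the Hermes implementation: `HeatWindow` and `HotspotDetector` are hypothetical names, heat is taken to be the plain sum of the 10 slot counts, time is passed in explicitly, and the classes are not thread-safe.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Per-key ring of 10 counters, each covering 3 s (hypothetical sketch).
final class HeatWindow {
    static final int SLOTS = 10;
    static final long SLOT_MILLIS = 3_000;

    private final long[] slotNumber = new long[SLOTS]; // absolute slot stored in each cell
    private final int[] counts = new int[SLOTS];

    HeatWindow() {
        Arrays.fill(slotNumber, -1);
    }

    void record(long nowMillis) {
        long slot = nowMillis / SLOT_MILLIS;
        int i = (int) (slot % SLOTS);
        if (slotNumber[i] != slot) { // cell holds an older slot: reuse it
            slotNumber[i] = slot;
            counts[i] = 0;
        }
        counts[i]++;
    }

    // Sum of all slots that still fall inside the last 10 slots.
    int heat(long nowMillis) {
        long current = nowMillis / SLOT_MILLIS;
        int sum = 0;
        for (int i = 0; i < SLOTS; i++) {
            long age = current - slotNumber[i];
            if (slotNumber[i] >= 0 && age >= 0 && age < SLOTS) {
                sum += counts[i];
            }
        }
        return sum;
    }
}

final class HotspotDetector {
    private final Map<String, HeatWindow> windows = new HashMap<>();

    void record(String key, long nowMillis) {
        windows.computeIfAbsent(key, k -> new HeatWindow()).record(nowMillis);
    }

    // Keys whose heat exceeds the threshold, hottest first, at most n of them.
    List<String> topHotKeys(long nowMillis, int n, int threshold) {
        List<Map.Entry<String, Integer>> scored = new ArrayList<>();
        for (Map.Entry<String, HeatWindow> e : windows.entrySet()) {
            int heat = e.getValue().heat(nowMillis);
            if (heat > threshold) {
                scored.add(new AbstractMap.SimpleEntry<>(e.getKey(), heat));
            }
        }
        scored.sort((x, y) -> Integer.compare(y.getValue(), x.getValue()));
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, scored.size()); i++) {
            result.add(scored.get(i).getKey());
        }
        return result;
    }
}
```

Passing the clock in as a parameter keeps the window logic deterministic, which makes the expiry behaviour easy to check without sleeping.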
Configuration
Both SDK and server read runtime parameters (feature toggles, black/white lists, etcd address, thresholds) from Apollo.
Stability and Consistency
Asynchronous reporting via rsyslog + Kafka prevents blocking.
Communication module runs in an isolated thread pool.
Local cache size is limited to 64 MB with LRU eviction.
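Only hot keys are cached locally; non‑hot keys are always read from the cache cluster. A change to a hot key invalidates the local entry immediately on the instance that made the change, which gives strong consistency within that instance, while other instances converge once they receive the etcd broadcast, which gives eventual consistency across nodes.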
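A size-capped LRU cache like the one described above can be approximated with `LinkedHashMap` in access order. This is a sketch under assumptions rather than TMC's actual cache: `ByteBoundedLruCache` is a hypothetical name, and only value bytes are counted toward the limit (keys and object overhead are ignored).

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache bounded by total value size in bytes (hypothetical sketch).
final class ByteBoundedLruCache {
    private final long maxBytes;
    private long usedBytes;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, byte[]> entries = new LinkedHashMap<>(16, 0.75f, true);

    ByteBoundedLruCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized byte[] get(String key) {
        return entries.get(key); // also marks the entry as most recently used
    }

    synchronized void put(String key, byte[] value) {
        byte[] old = entries.remove(key);
        if (old != null) {
            usedBytes -= old.length;
        }
        if (value.length > maxBytes) {
            return; // a single value larger than the whole budget is not cached
        }
        entries.put(key, value);
        usedBytes += value.length;
        // Evict least recently used entries until the budget is respected.
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && it.hasNext()) {
            usedBytes -= it.next().getValue().length;
            it.remove();
        }
    }

    synchronized long usedBytes() {
        return usedBytes;
    }
}
```

With the 64 MB limit from the text, an instance would be created as `new ByteBoundedLruCache(64L * 1024 * 1024)`.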
Performance Results
During a flash‑sale event on Kuaishou, TMC raised the local‑cache hit rate to ~80 % and reduced request latency even as QPS increased. Results from Double‑11 show similar improvements across product, activity, and other core services.
Future Outlook
TMC is already serving product, logistics, inventory, marketing, user, gateway and messaging services, with more applications being onboarded. Users can tune hotspot thresholds, hot‑key count, and black/white lists to fit their workloads.