Understanding Redis Internal Data Structures, Hash Tables, and Progressive Rehash
This article explains why Redis is fast, describes its underlying data structures and how keys and values are stored using hash tables and hash buckets, discusses hash collisions and chain hashing, and details the progressive rehash mechanism that maintains performance as data grows.
When people hear Redis , they immediately think of speed and a single‑threaded model, but this article examines whether Redis truly lives up to that reputation by exploring its internal "slow actions".
Redis is an in‑memory database where a key can be read in microseconds, a speed largely attributed to its efficient data structures. Beyond the familiar String , List , Hash , Set , and Sorted Set , Redis actually implements several underlying structures, with only String having a single structure while the other types map to two structures each, forming what we call collection types where "one key corresponds to a collection of data".
To achieve fast key/value access, Redis stores keys in a hash table and values in hash buckets. A hash table is essentially an array whose elements are hash buckets; each bucket holds a pointer to the actual value. This design allows both simple strings and collection types to reference their values via pointers.
When many keys map to the same hash bucket, a hash collision occurs. Redis resolves collisions using chained hashing : multiple entries in the same bucket are linked together via a *next pointer, forming a hash‑collision chain. However, traversing a long chain can be slow.
To avoid the performance penalty of copying all entries during a resize, Redis employs a clever progressive rehash strategy. It maintains two global hash tables (hash table 1 and hash table 2). When the data volume grows, Redis allocates a larger hash table 2, gradually migrates entries from table 1 to table 2 a few at a time during normal requests, and finally switches to table 2, leaving table 1 ready for the next rehash.
If there are no incoming requests, a background timer still performs incremental rehash steps, ensuring the system never stalls.
For collection types, operation efficiency depends on the underlying structure. Redis uses five structures for collections: integer array, doubly linked list, hash table, ziplist (compressed list), and skiplist. Their time complexities are:
Data Structure
Time Complexity
Hash table
O(1)
Skiplist
O(log N)
Doubly linked list
O(N)
Ziplist
O(N)
Integer array
O(N)
Single‑element operations (e.g., HGET , HSET , SADD ) inherit the complexity of their underlying structure, often O(1) for hash‑based collections. Multi‑element commands such as HMGET or HMSET are O(N). Range operations like HGETALL or SMEMBERS are O(N) and can block the server, but Redis provides incremental SCAN families ( HSCAN , SSCAN , ZSCAN ) to mitigate this.
Statistical commands (e.g., LLEN , SCARD ) are O(1) because the collection structures keep element counts.
In summary, Redis achieves high performance not only because it operates entirely in memory but also due to its sophisticated internal data structures and mechanisms such as chained hashing and progressive rehash, which together ensure fast access while handling the inevitable "slow" cases gracefully.
Code Ape Tech Column
Former Ant Group P8 engineer, pure technologist, sharing full‑stack Java, job interview and career advice through a column. Site: java-family.cn
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.