Consistent Hashing Algorithm: Principles, Advantages, and Applications

Consistent hashing maps both servers and keys onto a virtual hash ring and assigns each key to the first server found clockwise from its position. This minimizes data movement when nodes are added or removed. Virtual nodes improve load distribution, and the technique is widely used in distributed caching and load balancing systems.

IT Services Circle

Mar 25, 2022

Scenario Construction

Assume three cache servers named node-0, node-1, and node-2 and 30 million key objects that need to be evenly distributed across the three machines.

The simplest solution is the modulo algorithm hash(key) % N, where N is the number of servers. This directly maps each key to one of the three nodes, but it suffers when the number of servers changes.

Problems with Simple Modulo

When a server is added or removed, the expression hash(key) % N yields different results for most keys, so they are remapped to other servers. For example, if a server fails and N changes from 3 to 2, roughly two thirds of the cached keys (assuming a uniform hash) map to a different server. The resulting wave of cache misses hits the backend all at once, which can lead to cache avalanche and service disruption.
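The effect is easy to measure. The following is a minimal sketch, not a production routine: the MD5-based `stable_hash` helper and the synthetic `key-<i>` names are assumptions made for illustration, since the article does not specify a hash function.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic hash of a string (MD5 is an arbitrary choice for this sketch)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def node_for(key: str, n: int) -> int:
    """Simple modulo placement: index of the server that stores the key."""
    return stable_hash(key) % n


# Synthetic keys standing in for the cached objects.
keys = [f"key-{i}" for i in range(100_000)]

# Count the keys whose server index changes when N drops from 3 to 2.
moved = sum(1 for k in keys if node_for(k, 3) != node_for(k, 2))
moved_fraction = moved / len(keys)
print(f"{moved_fraction:.1%} of keys change servers when N goes from 3 to 2")
```

With a uniform hash, a key keeps its server only when hash(key) % 6 is 0 or 1, so about one third of the keys stay put and the rest move.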

Consistent Hashing

Consistent hashing also uses a modulo operation, but the divisor is a fixed value, 2^32, rather than the number of servers. The result therefore does not depend on how many servers are currently online.

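“An IPv4 address consists of 4 groups of 8‑bit binary numbers, 32 bits in total, so a space of 2^32 positions is large enough to give every IPv4 address its own position. Whether two hashed values actually land on different positions still depends on the hash function used.”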

The 2^32 values are imagined as points on a circular hash ring. Each server’s IP address is hashed and placed on the ring, and each key is also hashed onto the same ring.

Mapping Servers to the Hash Ring

The servers node-0, node-1, and node-2 are each hashed (e.g., hash(server_ip) % 2^32) and placed on the ring at the resulting position.
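As a rough sketch of this step, the snippet below places three servers on the ring. The IP addresses are hypothetical, and MD5 is an assumed stand-in for the unspecified `hash` function.

```python
import hashlib

RING_SIZE = 2**32  # the fixed hash space described above


def ring_position(value: str) -> int:
    """Map a string to a position on the ring: hash(value) % 2^32."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


# Hypothetical server addresses, used only for illustration.
servers = {
    "node-0": "10.0.0.1",
    "node-1": "10.0.0.2",
    "node-2": "10.0.0.3",
}

# The ring is a list of (position, server name) pairs sorted by position.
ring = sorted((ring_position(ip), name) for name, ip in servers.items())

for position, name in ring:
    print(f"{name} sits at position {position}")
```

Keeping the pairs sorted by position is what later allows a clockwise search to be done with a binary search.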

Mapping Keys to the Hash Ring

Each key is hashed with hash(key) % 2^32. To find the server for a key, move clockwise on the ring from the key’s position until the first server is encountered; that server stores the key.

“Starting from the key’s position on the ring, the first server encountered clockwise is the server that will cache the object.”

key-1 -> node-1

key-3 -> node-2

key-4 -> node-2

key-5 -> node-2

key-2 -> node-0
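The clockwise search can be sketched with a sorted list and a binary search. This is an illustrative implementation under assumptions, not the code of any particular library: the `HashRing` class, its method names, and the MD5-based hash are all hypothetical. Servers are hashed by name here for brevity, and the assignments it prints depend on the hash function, so they will not necessarily match the mapping listed above.

```python
import bisect
import hashlib

RING_SIZE = 2**32


def ring_position(value: str) -> int:
    """hash(value) % 2^32, with MD5 as an assumed hash function."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % RING_SIZE


class HashRing:
    """Minimal hash ring without virtual nodes (illustrative only)."""

    def __init__(self, nodes):
        # Sorted (position, node) pairs; positions are kept separately for bisect.
        self._ring = sorted((ring_position(node), node) for node in nodes)
        self._positions = [pos for pos, _ in self._ring]

    def node_at(self, position: int) -> str:
        """Return the first node at or clockwise after the given position."""
        index = bisect.bisect_left(self._positions, position)
        if index == len(self._ring):
            index = 0  # past the last node: wrap around to the start of the ring
        return self._ring[index][1]

    def lookup(self, key: str) -> str:
        """Return the node responsible for the key."""
        return self.node_at(ring_position(key))


ring = HashRing(["node-0", "node-1", "node-2"])
for key in ["key-1", "key-2", "key-3", "key-4", "key-5"]:
    print(key, "->", ring.lookup(key))
```

Each lookup costs O(log n) in the number of positions on the ring, because `bisect_left` performs a binary search.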

Advantages of Consistent Hashing

When adding a new server (e.g., node-4), only the keys that fall between the new server and its predecessor on the ring need to be remapped, affecting a small portion of data. Similarly, if a server fails, only the keys that were mapped to the failed server are reassigned to the next clockwise server, limiting the impact.
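Both claims can be checked directly. The sketch below assumes an MD5-based hash, servers hashed by name, and synthetic keys, none of which come from the article. It verifies that every key that moves after adding node-4 moves to node-4, and that removing node-1 only affects keys that were stored on node-1.

```python
import bisect
import hashlib

RING_SIZE = 2**32


def ring_position(value: str) -> int:
    """hash(value) % 2^32, with MD5 as an assumed hash function."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % RING_SIZE


def build_ring(nodes):
    """Sorted list of (position, node) pairs."""
    return sorted((ring_position(node), node) for node in nodes)


def lookup(ring, key):
    """First node at or clockwise after the key's position."""
    # (position, "") sorts before any (position, name) entry with the same position.
    index = bisect.bisect_left(ring, (ring_position(key), "")) % len(ring)
    return ring[index][1]


keys = [f"key-{i}" for i in range(10_000)]
before = build_ring(["node-0", "node-1", "node-2"])

# Case 1: a new server joins the ring.
after_add = build_ring(["node-0", "node-1", "node-2", "node-4"])
moved = [k for k in keys if lookup(before, k) != lookup(after_add, k)]
moved_only_to_new = all(lookup(after_add, k) == "node-4" for k in moved)

# Case 2: an existing server fails.
after_fail = build_ring(["node-0", "node-2"])
reassigned = [k for k in keys if lookup(before, k) != lookup(after_fail, k)]
only_failed_keys = all(lookup(before, k) == "node-1" for k in reassigned)

print(f"added node-4: {len(moved)} of {len(keys)} keys moved")
print(f"removed node-1: {len(reassigned)} of {len(keys)} keys reassigned")
print("moved keys all went to node-4:", moved_only_to_new)
print("reassigned keys all came from node-1:", only_failed_keys)
```

How many keys move depends on where the hash happens to place the servers, which is why the counts are printed rather than assumed.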

Data Skew Problem

With few servers, uneven distribution on the ring can cause data skew, where most keys concentrate on a single node, leading to resource imbalance.

Virtual Nodes

To mitigate skew, each physical server is represented by multiple virtual nodes on the ring. For example, node-1#1, node-1#2, node-1#3 are hashed separately, spreading the load more evenly.

The mapping then becomes key -> virtual node -> real node.
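A minimal sketch of this two-step mapping follows. The `#<i>` suffix follows the naming used above, while the MD5-based hash, the synthetic keys, and the figure of 200 virtual nodes per server are assumptions chosen for illustration.

```python
import bisect
import hashlib
from collections import Counter

RING_SIZE = 2**32


def ring_position(value: str) -> int:
    """hash(value) % 2^32, with MD5 as an assumed hash function."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % RING_SIZE


def build_ring(nodes, vnodes_per_node):
    """Place vnodes_per_node virtual nodes per real node on the ring.

    Each entry is (position of the virtual node, real node), so a lookup
    resolves key -> virtual node -> real node in a single step.
    """
    ring = []
    for node in nodes:
        for i in range(1, vnodes_per_node + 1):
            ring.append((ring_position(f"{node}#{i}"), node))
    ring.sort()
    return ring


def lookup(ring, key):
    """Real node that owns the first virtual node clockwise from the key."""
    index = bisect.bisect_left(ring, (ring_position(key), "")) % len(ring)
    return ring[index][1]


def key_counts(ring, keys):
    """Number of keys stored on each real node."""
    return Counter(lookup(ring, key) for key in keys)


nodes = ["node-0", "node-1", "node-2"]
keys = [f"key-{i}" for i in range(30_000)]

few = key_counts(build_ring(nodes, 1), keys)
many = key_counts(build_ring(nodes, 200), keys)

print("1 virtual node per server:   ", dict(few))
print("200 virtual nodes per server:", dict(many))
```

More virtual nodes generally give a more even split, at the price of a larger ring to store and search.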

Application Scenarios

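Consistent hashing is a common choice for load balancing and data partitioning in distributed systems. It is widely used for client-side sharding in front of cache servers such as memcached (for example, the ketama scheme) and Redis. Note that Redis Cluster itself partitions keys into 16384 fixed hash slots rather than using a hash ring. Consistent hashing also appears in RPC frameworks (e.g., Dubbo), distributed databases, LVS load balancers, and other systems.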

Summary

Consistent hashing provides a scalable way to distribute keys across nodes with minimal data movement during topology changes. The trade-offs are a lookup cost that grows with the number of real and virtual nodes on the ring, and the risk of a single point of failure if the component that performs the routing is not itself highly available.