Backend Development 27 min read

Consistent Hashing Algorithm: Theory, Go Implementation, and Load-Balanced Extension

The article explains consistent hashing—using a circular 2^32 hash ring with virtual nodes to evenly distribute keys across dynamic cache servers—provides a complete Go implementation including host registration, key lookup, and a bounded‑load extension that tracks server load, demonstrates a proxy‑cache setup, and discusses practical testing and production‑grade enhancements.

Tencent Cloud Developer

Mar 14, 2022

Consistent Hashing Algorithm: Theory, Go Implementation, and Load-Balanced Extension

This article introduces the consistent hashing algorithm, a solution for distributing keys across a dynamic set of cache servers while minimizing remapping when servers are added or removed.

It starts by describing the limitations of simple modulo hashing (hash(key) % N) for cache distribution, especially during scaling events that cause massive cache invalidation.

The core concept of a hash ring is explained: the 2^32 space is visualized as a circle, and both servers (or their virtual nodes) and keys are mapped onto this ring. A key is stored on the first server encountered when moving clockwise from the key's position.

To improve distribution uniformity, virtual nodes are introduced. Each physical server is represented by multiple virtual nodes (e.g., hostReplicaFormat = "%s%d"). The article provides Go structures for hosts and the consistent hash implementation:

type Host struct {
    Name      string
    LoadBound int64
}

type Consistent struct {
    replicaNum      int
    totalLoad       int64
    hashFunc        func(key string) uint64
    hostMap         map[string]*Host
    replicaHostMap  map[uint64]string
    sortedHostsHashSet []uint64
    sync.RWMutex
}

Key functions include:

func (c *Consistent) RegisterHost(hostName string) error { ... }
func (c *Consistent) UnregisterHost(hostName string) error { ... }
func (c *Consistent) GetKey(key string) (string, error) { ... }
func (c *Consistent) searchKey(key uint64) int { ... }

The registration process adds a host and its virtual nodes to the hash ring, sorting the ring after insertion. Unregistration removes the host and its virtual nodes.

A practical demonstration uses HTTP servers as cache nodes and a proxy server that performs key lookups via the consistent hash. Sample Go code for the cache server and proxy server is provided, showing how hosts are registered with the proxy and how keys are fetched:

func (p *Proxy) GetKey(key string) (string, error) {
    host, err := p.consistent.GetKey(key)
    if err != nil { return "", err }
    resp, err := http.Get(fmt.Sprintf("http://%s?key=%s", host, key))
    ...
}

To address load imbalance, the article extends the basic algorithm with a bounded-load variant based on Google’s 2017 paper. It adds load tracking per host and a function GetKeyLeast that searches clockwise for the first server whose load is below a calculated threshold (average load * (1 + loadBoundFactor)).

func (c *Consistent) GetKeyLeast(key string) (string, error) { ... }
func (c *Consistent) checkLoadCapacity(host string) (bool, error) { ... }
func (c *Consistent) Inc(hostName string) { atomic.AddInt64(&c.hostMap[hostName].LoadBound, 1); atomic.AddInt64(&c.totalLoad, 1) }
func (c *Consistent) Done(host string) { atomic.AddInt64(&c.hostMap[host].LoadBound, -1); atomic.AddInt64(&c.totalLoad, -1) }

The proxy server is updated to use GetKeyLeast, increment the host's load before handling a request, and decrement it after a simulated processing time.

Extensive testing steps are described: starting the proxy, launching three cache servers, registering them, and issuing key requests via curl. The results demonstrate even distribution of keys and proper handling of server addition/removal.

Finally, the article notes that while the provided code works for demonstration, production use would require enhancements such as robust service registration, persistent cache storage, health checks, and more sophisticated load management.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Distributed Systems algorithm Golang load balancing Caching consistent hashing virtual nodes

Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.