
Comparison of Four Consistent Hashing Algorithms: Ketama, Rendezvous, Jump Consistent Hash, and Maglev

The article compares four consistent‑hashing algorithms—Ketama’s ring with virtual nodes, Rendezvous’s highest‑random‑weight method, Google’s Jump Consistent Hash, and Maglev’s lookup‑table approach—evaluating their balance, monotonicity, stability, scalability, and time complexity, and concludes that Ketama and Jump offer the best overall trade‑off.

Tencent Cloud Developer

Consistent hashing, originally proposed by Karger et al. in 1997, is a special hashing technique designed to solve the problem of distributed caching. It is widely used in modern distributed systems because it provides balance, monotonicity, and stability when nodes are added or removed.

Key properties of consistent hashing

Balance : Keys are evenly distributed across all backend nodes after hashing.

Monotonicity : When a new node joins, existing keys either stay on their original node or move to the new node; they never jump to another existing node.

Stability : During scaling operations, the amount of data that needs to be migrated is minimized.

Problem background

Mapping N cache servers with a simple hash(data) % N works only when the number of nodes is static. Any change in N causes massive remapping, which defeats the purpose of a stable distributed cache.
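The churn is easy to demonstrate with a small sketch (the `node_for` helper and key names are illustrative, not from the original article): almost every key changes nodes when N grows by one.

```python
# Measure how many keys are remapped when a cluster using naive modulo
# placement grows from 8 to 9 nodes. Nearly all keys move.
import hashlib

def node_for(key: str, n: int) -> int:
    """Place a key with the naive hash(data) % N scheme."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")
    return h % n

keys = [f"key-{i}" for i in range(10_000)]
moved = sum(1 for k in keys if node_for(k, 8) != node_for(k, 9))
print(f"{moved / len(keys):.1%} of keys remapped when going from 8 to 9 nodes")
```

A key keeps its node only when its hash gives the same remainder mod 8 and mod 9, which happens for roughly 1/9 of keys; the other ~8/9 must migrate.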

Four common consistent‑hash algorithms

1. Classic Consistent Hash (Ketama)

The classic method maps both nodes and keys onto a 32‑bit ring and walks clockwise to find the first node with a hash greater than or equal to the key’s hash. Virtual nodes improve balance and allow weighting of physical nodes.

# Example server definitions: first column is the address, second is memory (weight)
#------ Server --------Mem-#
#255.255.255.255:65535 66666#
10.0.1.1:11211 600
10.0.1.2:11211 300
10.0.1.3:11211 200
10.0.1.4:11211 350
10.0.1.5:11211 1000
10.0.1.6:11211 800
10.0.1.7:11211 950
10.0.1.8:11211 100

typedef struct {
    unsigned int point;  // point on circle
    char ip[22];
} mcs;

typedef struct {
    char addr[22];
    unsigned long memory;
} serverinfo;

typedef struct {
    int numpoints;
    void* modtime;
    void* array; //array of mcs structs
} continuum;

typedef continuum* ketama_continuum;

/** \brief Generates the continuum of servers (each server gets a number of points on the circle proportional to its memory). */
static int ketama_create_continuum(key_t key, char* filename)
{
    if (shm_ids == NULL) {
        init_shm_id_tracker();
    }
    if (shm_data == NULL) {
        init_shm_data_tracker();
    }
    int shmid;
    int* data;  /* Pointer to shmem location */
    unsigned int numservers = 0;
    unsigned long memory;
    serverinfo* slist;

    slist = read_server_definitions(filename, &numservers, &memory);
    if (numservers < 1) {
        set_error("No valid server definitions in file %s", filename);
        return 0;
    } else if (slist == 0) {
        return 0;
    }
#ifdef DEBUG
    syslog(LOG_INFO, "Server definitions read: %u servers, total memory: %lu.\n", numservers, memory);
#endif
    mcs continuum[numservers * 160];
    unsigned int i, k, cont = 0;
    for (i = 0; i < numservers; i++) {
        float pct = (float)slist[i].memory / (float)memory;
        unsigned int ks = floorf(pct * 40.0 * (float)numservers);
#ifdef DEBUG
        int hpct = floorf(pct * 100.0);
        syslog(LOG_INFO, "Server no. %d: %s (mem: %lu = %u%% or %d of %d)\n", i, slist[i].addr, slist[i].memory, hpct, ks, numservers * 40);
#endif
        for (k = 0; k < ks; k++) {
            char ss[30];
            unsigned char digest[16];
            sprintf(ss, "%s-%d", slist[i].addr, k);
            ketama_md5_digest(ss, digest);
            for (int h = 0; h < 4; h++) {
                continuum[cont].point = (digest[3+h*4] << 24) |
                                      (digest[2+h*4] << 16) |
                                      (digest[1+h*4] << 8) |
                                      digest[h*4];
                memcpy(continuum[cont].ip, slist[i].addr, 22);
                cont++;
            }
        }
    }
    free(slist);
    qsort((void*) &continuum, cont, sizeof(mcs), (compfn)ketama_compare);
    return 1;
}

Ketama’s implementation steps are:

Read server definitions (address and memory) from a configuration file; memory determines node weight.

For each server, compute the number of virtual nodes (default 160) based on its weight, generate MD5 hashes for strings like addr‑i , and derive four 32‑bit points from each 16‑byte digest.

Store all points in a continuum array and sort them for binary search.
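The lookup side of the continuum can be sketched as follows (a simplified Python illustration with hypothetical helper names, not the libketama API; libketama derives four 32-bit points per MD5 digest, while this sketch uses one point per virtual-node string):

```python
# Ketama-style ring lookup: virtual-node points are sorted so a key can be
# mapped with a binary search for the first point >= hash(key), wrapping
# around to the start of the ring.
import bisect
import hashlib

def hash_point(s: str) -> int:
    """Derive a 32-bit ring position from the first 4 MD5 bytes."""
    return int.from_bytes(hashlib.md5(s.encode()).digest()[:4], "big")

def build_continuum(servers: dict[str, int], points_per_server: int = 160):
    """servers maps addr -> memory weight; returns sorted (point, addr) pairs."""
    continuum = []
    total = sum(servers.values())
    for addr, weight in servers.items():
        n_points = points_per_server * len(servers) * weight // total
        for i in range(n_points):
            continuum.append((hash_point(f"{addr}-{i}"), addr))
    continuum.sort()
    return continuum

def lookup(continuum, key: str) -> str:
    """Walk clockwise: first point >= key hash, wrapping to index 0."""
    points = [p for p, _ in continuum]
    i = bisect.bisect_left(points, hash_point(key))
    return continuum[i % len(continuum)][1]

ring = build_continuum({"10.0.1.1:11211": 600, "10.0.1.2:11211": 300})
print(lookup(ring, "user:42"))  # one of the two servers
```

Because only the points belonging to a joining or leaving server change, keys hashing to other arcs of the ring keep their assignments.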

2. Rendezvous (Highest‑Random‑Weight) Hash

Each key is hashed together with every node to produce a weight w_ij = h(key_i, node_j). The node with the maximum weight wins. The algorithm is simple and keeps almost no state, but each lookup costs O(n) in the number of nodes.

def determine_responsible_node(nodes, key):
    """Determines which node, of a set of nodes of various weights, is responsible for the provided key."""
    highest_score, champion = -1, None
    for node in nodes:
        score = node.compute_weighted_score(key)
        if score > highest_score:
            champion, highest_score = node, score
    return champion

Rendezvous hashing satisfies monotonicity: when a node is removed, only the keys for which that node held the highest weight are reassigned, and when a node is added, only the keys for which the new node scores highest move to it. All other key-to-node assignments are unchanged.
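The snippet above leaves `compute_weighted_score` unspecified. One common realization (a sketch, not from the article) is the weighted rendezvous formula score = -w / ln(h), where h is a hash of (node, key) normalized into (0, 1):

```python
# A possible compute_weighted_score for the rendezvous snippet: hash the
# (node, key) pair to a float in (0, 1), then apply -w / ln(h) so that
# nodes with larger weights win proportionally more keys.
import hashlib
import math

class Node:
    def __init__(self, name: str, weight: float = 1.0):
        self.name, self.weight = name, weight

    def compute_weighted_score(self, key: str) -> float:
        digest = hashlib.md5(f"{self.name}:{key}".encode()).digest()
        h = (int.from_bytes(digest[:8], "big") + 1) / float(2**64 + 1)  # in (0, 1)
        return -self.weight / math.log(h)  # ln(h) < 0, so scores are positive

nodes = [Node("10.0.1.1"), Node("10.0.1.2", weight=2.0)]
winner = max(nodes, key=lambda n: n.compute_weighted_score("user:42"))
print(winner.name)
```

With equal weights this degenerates to plain highest-random-weight hashing; the logarithm only matters when node capacities differ.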

3. Jump Consistent Hash

Proposed by Google in 2014, this algorithm uses only a few lines of code, runs in O(log n) time, and requires minimal memory. It maps a 64‑bit key to a bucket number.

int32_t JumpConsistentHash(uint64_t key, int32_t num_buckets) {
    int64_t b = -1, j = 0;
    while (j < num_buckets) {
        b = j;
        key = key * 2862933555777941757ULL + 1;
        j = (b + 1) * (double(1LL << 31) / double((key >> 33) + 1));
    }
    return b;
}

The paper shows that when the number of buckets grows from n to n+1, a key moves to the new bucket with probability 1/(n+1), which is optimal. Conceptually, the algorithm is derived from the following random-jump formulation; the C implementation above optimizes it by replacing the random generator with a fast 64-bit linear congruential step:

int ch(int key, int num_buckets) {
    random.seed(key);
    int b = -1; // bucket before previous jump
    int j = 0; // bucket before current jump
    while (j < num_buckets) {
        b = j;
        double r = random.next(); // 0 < r < 1.0
        j = floor((b + 1) / r);
    }
    return b;
}

Jump Consistent Hash offers excellent balance and monotonicity, but because buckets are identified only by consecutive indices 0..n-1, nodes can be added or removed only at the end of the bucket list; removing an arbitrary node requires an extra indirection layer (for example, a mapping from bucket index to live node).
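The 1/(n+1) migration bound can be checked empirically with a Python transcription of the C implementation above (the key range and bucket counts are chosen arbitrarily for illustration):

```python
# Count how many keys change bucket when the bucket count grows from 10 to
# 11. By the paper's analysis, the fraction should be about 1/11.
def jump_consistent_hash(key: int, num_buckets: int) -> int:
    b, j = -1, 0
    while j < num_buckets:
        b = j
        key = (key * 2862933555777941757 + 1) % (1 << 64)  # 64-bit LCG step
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b

keys = range(100_000)
moved = sum(1 for k in keys if jump_consistent_hash(k, 10) != jump_consistent_hash(k, 11))
print(f"{moved / 100_000:.3f} of keys moved (expected ~ 1/11 = 0.091)")
```

Monotonicity also holds by construction: a key either keeps its old bucket or jumps to the newly added bucket 10, never to another existing bucket.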

4. Maglev Hash

Introduced by Google in 2016, Maglev builds a fixed‑size lookup table (size M, typically a prime) that maps each slot to a node. The table is generated by assigning each node a permutation list derived from two independent hash functions (offset and skip). The algorithm fills the table by iterating over nodes and placing them at their preferred slots, skipping occupied slots.

The generation process runs in O(M log M) time (worst‑case O(M²)). The resulting table provides very good balance (each node gets roughly M/N slots) and fast O(1) lookup, but when nodes are removed the table may need substantial reshuffling, which slightly harms monotonicity compared with the other three algorithms.
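The table-population step can be sketched as follows (a simplified illustration; the table size and the hash choices are assumptions, and the paper uses two independent hash functions to derive each node's offset and skip):

```python
# Maglev table population: each node's preference list is the permutation
# (offset + j * skip) mod M. Nodes take turns claiming their next preferred
# empty slot until all M slots are filled.
import hashlib

M = 13  # table size; a prime so every skip value yields a full permutation

def _h(s: str, seed: str) -> int:
    return int.from_bytes(hashlib.md5(f"{seed}:{s}".encode()).digest()[:8], "big")

def populate(nodes: list[str]) -> list[str]:
    offsets = [_h(n, "offset") % M for n in nodes]
    skips = [_h(n, "skip") % (M - 1) + 1 for n in nodes]  # skip in [1, M-1]
    next_idx = [0] * len(nodes)
    table = [None] * M
    filled = 0
    while filled < M:
        for i in range(len(nodes)):
            # advance node i past slots already taken by earlier claims
            while True:
                slot = (offsets[i] + next_idx[i] * skips[i]) % M
                next_idx[i] += 1
                if table[slot] is None:
                    table[slot] = nodes[i]
                    filled += 1
                    break
            if filled == M:
                break
    return table

print(populate(["A", "B", "C"]))
```

Because nodes claim slots in round-robin order, each node ends up with roughly M/N slots, which is what gives Maglev its balance; in production M is much larger (the paper suggests M far exceeding N, e.g. 65537).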

Summary and Comparison

The following table summarizes the four algorithms in terms of scalability, balance, monotonicity, stability, and time complexity:

| Algorithm | Scale-up | Scale-down | Balance | Monotonicity | Stability | Time Complexity |
| --- | --- | --- | --- | --- | --- | --- |
| Ketama | Good | Good | Fair | Good | Fair | O(log(v·n)) lookup (v virtual nodes per server) |
| Rendezvous | Good | Good | Fair | Good | Fair | O(n) lookup |
| Jump Consistent Hash | Good | Needs extra handling | Good | Good | Good | O(ln n) lookup |
| Maglev | Fair | Fair | Good | Fair | Fair | O(M log M) build (worst O(M²)), O(1) lookup |

Overall, Ketama and Jump Consistent Hash provide the best trade‑off between balance, monotonicity, and low migration cost. Maglev excels in lookup speed but may require a larger table to mitigate the impact of node failures.

References

libketama source code: https://github.com/RJ/ketama/tree/master/libketama

Rendezvous hashing (Wikipedia): https://en.wikipedia.org/wiki/Rendezvous_hashing

Lamping and Veach, "A Fast, Minimal Memory, Consistent Hash Algorithm" (Jump Consistent Hash): https://arxiv.org/ftp/arxiv/papers/1406/1406.2294.pdf

"Maglev: A Fast and Reliable Software Network Load Balancer": https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/44824.pdf

Thaler and Ravishankar, "A Name-Based Mapping Scheme for Rendezvous": https://www.eecs.umich.edu/techreports/cse/96/CSE-TR-316-96.pdf
