Databases 7 min read

Comparison of Consistent Hash (Ring Hash) and JumpStringHash for DBLE Sharding

This article explains the principles, characteristics, and performance trade‑offs of the consistent‑hash (ring‑hash) and jumpstringhash sharding algorithms used by DBLE, presents test results on variance, latency and data balance, and concludes why DBLE prefers jumpstringhash.

Aikesheng Open Source Community
Aikesheng Open Source Community
Aikesheng Open Source Community
Comparison of Consistent Hash (Ring Hash) and JumpStringHash for DBLE Sharding

Background

MyCat provides three sharding modes for string‑type fields: modulo hash, jumpstringhash, and consistent hash (ring hash). DBLE inherits MyCat’s modulo hash but recommends the higher‑performance, more balanced jumpstringhash algorithm.

Introduction

The following sections briefly introduce the principles, features, and pros/cons of consistent hash (ring hash) and jumpstringhash.

Consistent Hash (Ring Hash)

The algorithm works by generating a fixed set of identifiers (e.g., SHARD‑1‑NODE‑1) for X × N shards, computing their hash values, sorting them in a map, and then assigning an input string to the shard whose hash is just smaller than the input’s hash.

JumpStringHash

The algorithm distributes data uniformly by:

Using a formula (shown in the image) to generate pseudo‑random values.

For each input string, computing its hash and repeatedly generating pseudo‑random numbers; if a number is less than 1/x for the current node, the node is selected, and the node with the highest index becomes the final placement.

In practice, a reverse‑checking method from the highest node is used: when a generated random value exceeds x, that node is chosen.

Data Comparison

Tests were conducted on 350,595 real data rows to compare variance, time consumption, and maximum deviation of data distribution across different shard counts.

Test Results – Variance

1. Variance decreases as shard count increases for both methods. 2. When the number of shards is large enough, the two methods show little difference. 3. With relatively few nodes, jumpstringhash achieves the best balance. 4. Increasing the number of rings in consistent hash improves balance, but the improvement diminishes as shard count grows.

Test Results – Time Consumption

1. Consistent hash is an order of magnitude slower than jumpstringhash. 2. Jumpstringhash’s time cost grows slightly with shard count. 3. Consistent hash’s time cost also grows slightly as the number of rings increases.

Test Results – Maximum Deviation

1. Maximum deviation for both methods decreases as shard count increases, and differences vanish at high shard counts. 2. With few shards, jumpstringhash provides the best balance, while consistent hash performs worst. 3. Adding more rings to consistent hash improves balance, but the benefit diminishes with more shards.

Conclusion

Overall, consistent hash shows poor performance across all tests, leading DBLE to adopt jumpstringhash as its default sharding algorithm. Users migrating from MyCat who still use consistent hash can implement a custom sharding algorithm.

For custom sharding algorithm details, see: GitHub – DBLE route function spec

ShardingPerformance Testingdatabaseshash algorithmsConsistent HashjumpStringHash
Aikesheng Open Source Community
Written by

Aikesheng Open Source Community

The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.