Distributed ID Concepts, Implementation Schemes, and Open‑Source Solutions

This article explains the need for globally unique identifiers in distributed systems, compares common ID generation schemes such as UUID, auto‑increment, Redis counters, and Snowflake, provides a Java implementation of the Snowflake algorithm, and reviews open‑source components like Meituan Leaf and Baidu UidGenerator.

Architecture Digest

Jan 30, 2021

1. Distributed ID Concept

Just as an ID number uniquely identifies a person, a complex distributed system needs a globally unique identifier for each of its many records and messages. An auto‑increment primary key is enough for a single monolithic database, but once the data is sharded, each shard's counter can produce the same value. The IDs must then come from a scheme that is unique across all shards and that also delivers high concurrency, high availability, and high performance.

2. Distributed ID Implementation Schemes

The table below compares several common schemes:

| Scheme | Description | Advantages | Disadvantages |
| --- | --- | --- | --- |
| UUID | Universally Unique Identifier; provides uniqueness without a central coordinator. | 1) Generated locally, so no central node is put under load and key generation is fast; 2) Globally unique; 3) Easy to merge data across servers. | 1) Takes 16 bytes in binary form (36 characters as text), a high space cost; 2) Not sequential, which causes random I/O and lowers index efficiency. |
| Database auto‑increment | MySQL auto‑increment primary key. | 1) Small INT/BIGINT footprint; 2) Sequential I/O; 3) Numeric lookups are faster than string lookups. | 1) Limited concurrency, bounded by database performance; 2) Sharding requires a redesign; 3) Sequential values may expose data volume. |
| Redis auto‑increment | Atomic counter in Redis. | In‑memory, so concurrency is excellent. | 1) Counter state can be lost; 2) Sequential values may expose data volume. |
| Snowflake algorithm | Classic Snowflake algorithm for distributed IDs. | 1) No external dependencies; 2) High performance. | Sensitive to clock rollback. |
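To make the UUID storage cost concrete, the following minimal sketch uses the JDK's `java.util.UUID`. A UUID is a 128‑bit value, so it takes 16 bytes in raw form and 36 characters in its canonical text form. The class name `UuidSizeDemo` and the helper `toBytes` are illustrative names, not part of any library.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class UuidSizeDemo {
    /** Packs the 128-bit UUID into its raw 16-byte form (two 64-bit halves). */
    static byte[] toBytes(UUID id) {
        return ByteBuffer.allocate(16)
                .putLong(id.getMostSignificantBits())
                .putLong(id.getLeastSignificantBits())
                .array();
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID(); // random (version 4) UUID, no coordinator needed
        System.out.println(id);      // canonical text form with hyphens
        System.out.println(id.toString().length() + " characters as text, "
                + toBytes(id).length + " bytes in binary form");
    }
}
```

Because the value is random, consecutive UUIDs land in unrelated positions of an ordered index, which is the source of the random I/O noted in the table.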

Currently two popular distributed ID solutions dominate:

Segment Mode – relies on a database but differs from simple auto‑increment; a segment (e.g., 100 IDs) is allocated at once, greatly improving performance.

Snowflake Algorithm – produces a 64‑bit ID composed, from the highest bit to the lowest, of a sign bit, a timestamp, a data‑center ID, a machine ID, and a sequence number.

The sign bit is always 0, so the ID is a positive number. The 41‑bit timestamp is in milliseconds and is stored as an offset from a chosen start time. The 10‑bit worker ID is usually split into 5 bits for the data center (region) and 5 bits for the machine. The 12‑bit sequence number increments for each ID generated within the same millisecond.

Snowflake capacity: the 41‑bit timestamp covers 2^41 / (365·24·60·60·1000) ≈ 69 years from the start timestamp; the 10‑bit worker ID allows 2^10 = 1024 nodes; the 12‑bit sequence allows 2^12 = 4096 IDs per millisecond on each node.
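The capacity figures can be checked with a few lines of arithmetic. This is a small sketch that assumes the classic 41/10/12 bit split; the class name `SnowflakeCapacity` and its method names are illustrative.

```java
class SnowflakeCapacity {
    static final int TIMESTAMP_BITS = 41;
    static final int WORKER_BITS = 10;   // 5 data-center bits + 5 machine bits
    static final int SEQUENCE_BITS = 12;

    static final long MS_PER_YEAR = 365L * 24 * 60 * 60 * 1000;

    /** Whole years that fit into the timestamp field. */
    static long years() { return (1L << TIMESTAMP_BITS) / MS_PER_YEAR; }

    /** Number of distinct worker IDs. */
    static long workers() { return 1L << WORKER_BITS; }

    /** IDs one worker can issue within a single millisecond. */
    static long idsPerMs() { return 1L << SEQUENCE_BITS; }

    public static void main(String[] args) {
        System.out.println("years:      " + years());     // 69
        System.out.println("workers:    " + workers());   // 1024
        System.out.println("ids per ms: " + idsPerMs());  // 4096
        // Together with the sign bit the fields fill exactly one 64-bit long.
        System.out.println("total bits: " + (1 + TIMESTAMP_BITS + WORKER_BITS + SEQUENCE_BITS));
    }
}
```

The integer division truncates the result of roughly 69.7 years to 69.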

The algorithm can be implemented as a simple Java utility, allowing each business service to obtain IDs directly, as long as every instance is given a unique combination of data‑center ID and machine ID.

public class SnowFlake {
    /** start timestamp */
    private static final long START_STMP = 1480166465631L;
    /** bits allocated to each part */
    private static final long SEQUENCE_BIT = 12; // sequence bits
    private static final long MACHINE_BIT = 5;   // machine bits
    private static final long DATACENTER_BIT = 5; // data‑center bits

    /** max values */
    private static final long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
    private static final long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private static final long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);

    /** left shift values */
    private static final long MACHINE_LEFT = SEQUENCE_BIT;
    private static final long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private static final long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    private final long datacenterId; // data‑center
    private final long machineId;    // machine
    private long sequence = 0L;
    private long lastStmp = -1L;

    public SnowFlake(long datacenterId, long machineId) {
        if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    /** generate next ID */
    public synchronized long nextId() {
        long currStmp = getNewstmp();
        if (currStmp < lastStmp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }
        if (currStmp == lastStmp) {
            // same millisecond, increment sequence
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // sequence overflow, wait for next millisecond
                currStmp = getNextMill();
            }
        } else {
            // different millisecond, reset sequence
            sequence = 0L;
        }
        lastStmp = currStmp;
        return (currStmp - START_STMP) << TIMESTMP_LEFT // timestamp
                | datacenterId << DATACENTER_LEFT   // data‑center
                | machineId << MACHINE_LEFT         // machine
                | sequence;                         // sequence
    }

    private long getNextMill() {
        long mill = getNewstmp();
        while (mill <= lastStmp) {
            mill = getNewstmp();
        }
        return mill;
    }

    private long getNewstmp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake snowFlake = new SnowFlake(2, 3);
        for (int i = 0; i < (1 << 12); i++) {
            System.out.println(snowFlake.nextId());
        }
    }
}
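When debugging, it is often useful to go the other way and split an ID back into its fields. The sketch below assumes the same constants as the `SnowFlake` class above (start timestamp 1480166465631 and a 12/5/5 bit split); `SnowflakeIdParser` is a hypothetical helper, not part of the original utility.

```java
class SnowflakeIdParser {
    // Values copied from the SnowFlake class; they must match the generator.
    static final long START_STMP = 1480166465631L;
    static final int SEQUENCE_BIT = 12;
    static final int MACHINE_BIT = 5;
    static final int DATACENTER_BIT = 5;

    static final int MACHINE_LEFT = SEQUENCE_BIT;
    static final int DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    static final int TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    /** Bit mask with the lowest {@code bits} bits set. */
    static long mask(int bits) { return (1L << bits) - 1; }

    /** Wall-clock time in milliseconds at which the ID was generated. */
    static long timestamp(long id) { return (id >>> TIMESTMP_LEFT) + START_STMP; }

    static long datacenterId(long id) { return (id >>> DATACENTER_LEFT) & mask(DATACENTER_BIT); }

    static long machineId(long id) { return (id >>> MACHINE_LEFT) & mask(MACHINE_BIT); }

    static long sequence(long id) { return id & mask(SEQUENCE_BIT); }

    public static void main(String[] args) {
        // Hand-built ID: 1000 ms after the start timestamp, data center 2, machine 3, sequence 7.
        long id = (1000L << TIMESTMP_LEFT) | (2L << DATACENTER_LEFT) | (3L << MACHINE_LEFT) | 7L;
        System.out.println("timestamp  = " + timestamp(id));
        System.out.println("datacenter = " + datacenterId(id));
        System.out.println("machine    = " + machineId(id));
        System.out.println("sequence   = " + sequence(id));
    }
}
```

Because the timestamp occupies the highest bits, IDs from one generator sort by creation time, which is what makes them index‑friendly compared with random UUIDs.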

3. Open‑Source Distributed ID Components

3.1 How to Choose an Open‑Source Component

First, confirm that the component's features meet your requirements, paying particular attention to compatibility and extensibility.

Second, consider your current technical capabilities and whether your team’s stack can integrate the component smoothly.

Third, evaluate the community: update frequency, maintenance status, availability of support, and industry adoption.

3.2 Meituan Leaf

Leaf is a distributed ID service from Meituan’s R&D platform, named after a quote attributed to Leibniz: “There are no two identical leaves in the world.” It is used across Meituan’s finance, food delivery, and travel divisions, and the project is open source on GitHub. Its main characteristics are:

Globally unique, with IDs that trend upward over time.

Highly available; it can tolerate MySQL being unavailable for a period of time.

High concurrency, with QPS above 50,000 and 99th‑percentile latency under 1 ms on a 4C8G (4‑core, 8 GB) VM.

Simple integration via RPC or HTTP.
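The segment mode described in section 2, which Leaf supports, can be sketched in a few lines. This is a minimal illustration of the idea and not Leaf's actual implementation: the class `SegmentIdGenerator` is hypothetical, and the `allocator` function stands in for a database statement that atomically adds `step` to a stored `max_id` and returns the new value. In the demo an `AtomicLong` plays the role of that database row.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

class SegmentIdGenerator {
    private final int step;                     // IDs reserved per round trip
    private final LongUnaryOperator allocator;  // step -> new max_id (hypothetical DB call)
    private long next = 1;                      // next ID to hand out
    private long max = 0;                       // last ID of the current segment

    SegmentIdGenerator(int step, LongUnaryOperator allocator) {
        this.step = step;
        this.allocator = allocator;
    }

    synchronized long nextId() {
        if (next > max) {
            // Segment exhausted (or not loaded yet): reserve the next block of IDs.
            long newMax = allocator.applyAsLong(step);
            next = newMax - step + 1;
            max = newMax;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong maxIdRow = new AtomicLong(0); // stand-in for the max_id column
        SegmentIdGenerator nodeA = new SegmentIdGenerator(100, s -> maxIdRow.addAndGet(s));
        SegmentIdGenerator nodeB = new SegmentIdGenerator(100, s -> maxIdRow.addAndGet(s));
        System.out.println(nodeA.nextId()); // first ID of node A's segment
        System.out.println(nodeB.nextId()); // first ID of node B's segment
        System.out.println(nodeA.nextId()); // served from memory, no allocator call
    }
}
```

Each node touches the shared store once per `step` IDs, which is where the performance gain over plain auto‑increment comes from. The trade‑offs are that IDs left unused in a segment are lost when a node restarts, and that IDs from different nodes interleave rather than increase strictly.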

3.3 Baidu UidGenerator

UidGenerator is Baidu’s open‑source, high‑performance ID generator based on the Snowflake algorithm. It supports customizable worker‑ID bits and initialization strategies, borrows future timestamps so that the per‑interval sequence limit does not become a bottleneck, uses a RingBuffer with cache‑line padding to eliminate false sharing, and can reach 6 million QPS on a single machine. The source code is available on GitHub.
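To illustrate what customizable bit widths mean in practice, here is a minimal sketch of a configurable layout. It is not UidGenerator's code: the class `BitLayout`, its fields, and the 28/22/13 split used in the demo are illustrative assumptions. The only fixed rule in the sketch is that the three fields plus the sign bit must fill a 64‑bit long.

```java
class BitLayout {
    final int timeBits, workerBits, seqBits;
    final long maxTime, maxWorker, maxSeq;

    BitLayout(int timeBits, int workerBits, int seqBits) {
        if (timeBits <= 0 || workerBits <= 0 || seqBits <= 0
                || timeBits + workerBits + seqBits != 63) {
            throw new IllegalArgumentException("bit widths must be positive and sum to 63");
        }
        this.timeBits = timeBits;
        this.workerBits = workerBits;
        this.seqBits = seqBits;
        this.maxTime = (1L << timeBits) - 1;
        this.maxWorker = (1L << workerBits) - 1;
        this.maxSeq = (1L << seqBits) - 1;
    }

    /** Packs the three fields into one positive long, highest field first. */
    long compose(long time, long worker, long seq) {
        if (time < 0 || time > maxTime || worker < 0 || worker > maxWorker
                || seq < 0 || seq > maxSeq) {
            throw new IllegalArgumentException("field out of range for this layout");
        }
        return (time << (workerBits + seqBits)) | (worker << seqBits) | seq;
    }

    public static void main(String[] args) {
        BitLayout classic = new BitLayout(41, 10, 12); // classic Snowflake split
        BitLayout custom = new BitLayout(28, 22, 13);  // illustrative alternative
        System.out.println("classic max worker id: " + classic.maxWorker);
        System.out.println("custom  max worker id: " + custom.maxWorker);
        System.out.println("custom  max sequence:  " + custom.maxSeq);
    }
}
```

Widening the worker field leaves fewer bits for the timestamp or the sequence, so a layout like this trades ID lifetime or per‑interval throughput for a larger number of nodes.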

3.4 Open‑Source Component Comparison

UidGenerator is Java‑based, was last updated about two years before this article was written (January 2021), receives little maintenance, and supports only the Snowflake scheme.

Leaf is also Java‑based, was last maintained in 2020, and supports both segment mode and Snowflake.

Overall, judging by features and maintenance status rather than by hands‑on benchmarks, Meituan Leaf is the preferable choice.

Do you know other common distributed ID solutions?
