
Understanding Distributed ID Generation and Snowflake Algorithm in Java

This article explains the concept of distributed unique identifiers, compares segment‑mode and Snowflake implementations, provides a full Java Snowflake code example, and reviews open‑source solutions such as Meituan Leaf and Baidu UidGenerator, offering guidance on selecting and using these ID generation tools in backend systems.

Selected Java Interview Questions

May 12, 2021


1. Distributed ID Concept

In a distributed system, an identifier (ID) must be globally unique, much like a personal ID number in the real world. When data is sharded across multiple databases, an auto‑increment primary key is no longer sufficient, because each database increments independently and would produce the same values. A distributed ID scheme is therefore required, and it must support high concurrency, high availability, and high performance.

2. Distributed ID Implementation Options

Common solutions are compared in a table (image omitted). Two popular approaches are the segment mode and the Snowflake algorithm.

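Segment mode still relies on a database, but instead of asking it for one auto‑increment value per ID, the service reserves a whole range at a time. With a step of 100, for example, successive fetches advance the stored maximum to 100, 200, 300, and the service hands out the IDs in each range from memory. The database is contacted only once per range, which greatly improves performance.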
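The range allocation described above can be sketched as follows. This is a minimal illustration, not Leaf's implementation: `SegmentStore`, `InMemorySegmentStore`, and `SegmentIdGenerator` are hypothetical names, and the in-memory store stands in for a database row that a real service would update inside a transaction.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hands out IDs from a locally cached range and reserves a new range when it runs out. */
class SegmentIdGenerator {
    private final SegmentStore store;
    private final int step;
    private long next = 1; // next ID to hand out
    private long max = 0;  // last ID of the current range (inclusive)

    SegmentIdGenerator(SegmentStore store, int step) {
        if (step <= 0) {
            throw new IllegalArgumentException("step must be positive");
        }
        this.store = store;
        this.step = step;
    }

    synchronized long nextId() {
        if (next > max) {
            // Current range exhausted: reserve the next one with a single store call.
            max = store.advanceMaxId(step);
            next = max - step + 1;
        }
        return next++;
    }

    public static void main(String[] args) {
        InMemorySegmentStore store = new InMemorySegmentStore();
        SegmentIdGenerator generator = new SegmentIdGenerator(store, 100);
        for (int i = 0; i < 250; i++) {
            generator.nextId();
        }
        // 250 IDs with a step of 100 need three ranges.
        System.out.println("store round trips: " + store.roundTrips.get()); // prints "store round trips: 3"
    }
}

/** Hypothetical abstraction over the row that stores the highest reserved ID. */
interface SegmentStore {
    /** Atomically advances the stored maximum by step and returns the new maximum. */
    long advanceMaxId(int step);
}

/** In-memory stand-in used only for this sketch; a real store would be a database. */
class InMemorySegmentStore implements SegmentStore {
    private final AtomicLong maxId = new AtomicLong(0);
    final AtomicLong roundTrips = new AtomicLong(0); // counts simulated database calls

    @Override
    public long advanceMaxId(int step) {
        roundTrips.incrementAndGet();
        return maxId.addAndGet(step);
    }
}
```

A larger step means fewer database calls but a larger gap in the ID space if a service restarts with part of its range unused.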

The Snowflake algorithm composes a 64‑bit ID from a sign bit, a timestamp, a data‑center ID, a machine ID, and a sequence number, as illustrated in the diagram (image omitted). The sign bit is always 0, so IDs are positive. The timestamp occupies 41 bits and stores the milliseconds elapsed since a chosen start time. The 10‑bit worker ID is typically split into 5 bits for the data center (region) and 5 bits for the machine. The 12‑bit sequence number increments for each ID generated within the same millisecond.
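To make the bit layout concrete, the sketch below packs the four fields into a `long` and extracts them again. It assumes the common 41/5/5/12 split; the class name `SnowflakeLayout` and its method names are illustrative only.

```java
/** Packs and unpacks the fields of a Snowflake-style ID (41/5/5/12 layout assumed). */
class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int MACHINE_BITS = 5;
    static final int DATACENTER_BITS = 5;

    static final int MACHINE_SHIFT = SEQUENCE_BITS;                        // 12
    static final int DATACENTER_SHIFT = MACHINE_SHIFT + MACHINE_BITS;      // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_SHIFT + DATACENTER_BITS; // 22

    /** Combines the fields; callers must keep each value within its bit width. */
    static long compose(long elapsedMillis, long datacenterId, long machineId, long sequence) {
        return (elapsedMillis << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_SHIFT)
                | (machineId << MACHINE_SHIFT)
                | sequence;
    }

    static long elapsedMillis(long id) {
        return id >>> TIMESTAMP_SHIFT;
    }

    static long datacenterId(long id) {
        return (id >>> DATACENTER_SHIFT) & mask(DATACENTER_BITS);
    }

    static long machineId(long id) {
        return (id >>> MACHINE_SHIFT) & mask(MACHINE_BITS);
    }

    static long sequence(long id) {
        return id & mask(SEQUENCE_BITS);
    }

    /** Returns a value with the lowest `bits` bits set. */
    private static long mask(int bits) {
        return (1L << bits) - 1;
    }

    public static void main(String[] args) {
        long id = compose(1000L, 2, 3, 7);
        System.out.println(id); // prints 4194578439
        System.out.println(elapsedMillis(id) + " " + datacenterId(id)
                + " " + machineId(id) + " " + sequence(id)); // prints "1000 2 3 7"
    }
}
```

Because the timestamp sits in the highest bits, IDs from one worker sort by creation time.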

Snowflake capacity: time range 2^41 ms ≈ 69 years; worker range 2^10 = 1024; sequence range 2^12 = 4096 (up to 4096 IDs per millisecond).
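The capacity figures follow directly from the bit widths. The sketch below reproduces the arithmetic; the class name is illustrative, and the year length of 365.25 days is an assumption made for the conversion.

```java
/** Reproduces the Snowflake capacity arithmetic from the field widths. */
class SnowflakeCapacity {
    static final long TIMESTAMP_VALUES = 1L << 41; // distinct millisecond values
    static final long WORKER_VALUES = 1L << 10;    // data-center ID plus machine ID
    static final long SEQUENCE_VALUES = 1L << 12;  // IDs per millisecond per worker

    /** Years covered by the timestamp field, assuming 365.25 days per year. */
    static double timestampYears() {
        double millisPerYear = 365.25 * 24 * 60 * 60 * 1000;
        return TIMESTAMP_VALUES / millisPerYear;
    }

    /** Theoretical ceiling for a single worker: 4096 IDs in each of 1000 ms. */
    static long maxIdsPerSecondPerWorker() {
        return SEQUENCE_VALUES * 1000;
    }

    public static void main(String[] args) {
        System.out.printf("timestamp range: %.1f years%n", timestampYears());
        System.out.println("workers: " + WORKER_VALUES);                               // prints "workers: 1024"
        System.out.println("ids per ms per worker: " + SEQUENCE_VALUES);               // prints "ids per ms per worker: 4096"
        System.out.println("ids per second per worker: " + maxIdsPerSecondPerWorker()); // prints "ids per second per worker: 4096000"
    }
}
```

The 69‑year window is measured from the configured start timestamp rather than from 1970, which is why implementations subtract a recent start time before shifting.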

The algorithm can be implemented in Java as a utility class. Each business service creates an instance with its own data‑center ID and machine ID, then calls it to obtain IDs.

public class SnowFlake {
    /**
     * Start timestamp
     */
    private final static long START_STMP = 1480166465631L;
    /**
     * Bits allocated for each part
     */
    private final static long SEQUENCE_BIT = 12; // sequence bits
    private final static long MACHINE_BIT = 5;   // machine bits
    private final static long DATACENTER_BIT = 5; // data‑center bits
    /**
     * Max values for each part
     */
    private final static long MAX_DATACENTER_NUM = -1L ^ (-1L << DATACENTER_BIT);
    private final static long MAX_MACHINE_NUM = -1L ^ (-1L << MACHINE_BIT);
    private final static long MAX_SEQUENCE = -1L ^ (-1L << SEQUENCE_BIT);
    /**
     * Left shift for each part
     */
    private final static long MACHINE_LEFT = SEQUENCE_BIT;
    private final static long DATACENTER_LEFT = SEQUENCE_BIT + MACHINE_BIT;
    private final static long TIMESTMP_LEFT = DATACENTER_LEFT + DATACENTER_BIT;

    private long datacenterId; // data‑center
    private long machineId;    // machine identifier
    private long sequence = 0L; // sequence within the same ms
    private long lastStmp = -1L; // last timestamp

    public SnowFlake(long datacenterId, long machineId) {
        if (datacenterId > MAX_DATACENTER_NUM || datacenterId < 0) {
            throw new IllegalArgumentException("datacenterId can't be greater than MAX_DATACENTER_NUM or less than 0");
        }
        if (machineId > MAX_MACHINE_NUM || machineId < 0) {
            throw new IllegalArgumentException("machineId can't be greater than MAX_MACHINE_NUM or less than 0");
        }
        this.datacenterId = datacenterId;
        this.machineId = machineId;
    }

    /**
     * Generate next ID
     */
    public synchronized long nextId() {
        long currStmp = getNewstmp();
        if (currStmp < lastStmp) {
            throw new RuntimeException("Clock moved backwards. Refusing to generate id");
        }
        if (currStmp == lastStmp) {
            // same millisecond, increment sequence
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0L) {
                // sequence overflow, wait for next millisecond
                currStmp = getNextMill();
            }
        } else {
            // different millisecond, reset sequence
            sequence = 0L;
        }
        lastStmp = currStmp;
        return (currStmp - START_STMP) << TIMESTMP_LEFT
                | datacenterId << DATACENTER_LEFT
                | machineId << MACHINE_LEFT
                | sequence;
    }

    private long getNextMill() {
        long mill = getNewstmp();
        while (mill <= lastStmp) {
            mill = getNewstmp();
        }
        return mill;
    }

    private long getNewstmp() {
        return System.currentTimeMillis();
    }

    public static void main(String[] args) {
        SnowFlake snowFlake = new SnowFlake(2, 3);
        for (int i = 0; i < (1 << 12); i++) {
            System.out.println(snowFlake.nextId());
        }
    }
}

3. Open‑Source Distributed ID Components

3.1 How to Choose an Open‑Source Component

Select a component based on whether its features meet your requirements, focusing on compatibility and extensibility.

Consider your team's technical stack and ability to integrate the component smoothly.

Evaluate the community: update frequency, maintenance status, support availability, and industry adoption.

3.2 Meituan Leaf

Leaf, released by Meituan's core R&D platform, provides a distributed ID service with high reliability, low latency, and global uniqueness. It is widely used across Meituan's finance, delivery, and travel services. The source code is available on GitHub. Its main characteristics are:

Globally unique IDs that trend upward over time.

High availability; can tolerate temporary MySQL outages.

High concurrency and low latency (QPS > 50k, 99th percentile < 1 ms on a virtual machine with 4 cores and 8 GB of memory).

Simple integration via RPC or HTTP.
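As an illustration of HTTP integration, the sketch below requests one segment‑mode ID from a running Leaf server with the JDK's built‑in `HttpClient` (Java 11 or later). The path `/api/segment/get/{key}` follows the example in the Leaf README; the base URL, the business key, the class name `LeafHttpClient`, and the assumption that a successful response body is the ID as plain text should all be checked against your own deployment.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/** Minimal HTTP client sketch for a Leaf server in segment mode. */
class LeafHttpClient {
    private final String baseUrl; // e.g. "http://localhost:8080" (assumed deployment address)
    private final HttpClient http = HttpClient.newHttpClient();

    LeafHttpClient(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    /** Builds the request URI for a business key (path taken from the Leaf README example). */
    URI segmentUri(String bizKey) {
        String encodedKey = URLEncoder.encode(bizKey, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/segment/get/" + encodedKey);
    }

    /** Fetches one ID; assumes a 200 response whose body is the ID as plain text. */
    long nextSegmentId(String bizKey) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(segmentUri(bizKey)).GET().build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Leaf returned HTTP " + response.statusCode());
        }
        return Long.parseLong(response.body().trim());
    }
}
```

A caller would create `new LeafHttpClient("http://localhost:8080")` and call `nextSegmentId("order")`, where `order` is a hypothetical business key that must already be configured on the server.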

3.3 Baidu UidGenerator

UidGenerator is Baidu's open‑source high‑performance ID generator based on the Snowflake algorithm. It supports customizable worker‑ID bits and initialization strategies, making it suitable for containerized environments where instances may restart or drift.

It uses a future‑time borrowing technique and a RingBuffer to cache generated IDs, achieving up to 6 million QPS on a single machine.
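The caching idea can be illustrated with a small fixed‑size ring that is filled ahead of demand, so that taking an ID is usually just an array read. This is a simplified sketch rather than UidGenerator's actual code: `CachedIdRing` is a hypothetical class, the refill runs inline instead of on a background thread, and future‑time borrowing is not modeled.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Fixed-size ring that caches IDs produced ahead of time by an underlying generator. */
class CachedIdRing {
    private final long[] slots;
    private final LongSupplier source; // underlying generator, e.g. a Snowflake instance
    private int head = 0;              // index of the next ID to hand out
    private int size = 0;              // number of IDs currently cached

    CachedIdRing(int capacity, LongSupplier source) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.slots = new long[capacity];
        this.source = source;
        refill();
    }

    /** Fills every free slot, writing at the tail and wrapping around the array. */
    private void refill() {
        while (size < slots.length) {
            int tail = (head + size) % slots.length;
            slots[tail] = source.getAsLong();
            size++;
        }
    }

    /** Returns the oldest cached ID, refilling first if the ring is empty. */
    synchronized long take() {
        if (size == 0) {
            refill();
        }
        long id = slots[head];
        head = (head + 1) % slots.length;
        size--;
        return id;
    }

    synchronized int cached() {
        return size;
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong(); // stand-in for a real ID generator
        CachedIdRing ring = new CachedIdRing(4, counter::incrementAndGet);
        System.out.println(ring.take());   // prints 1
        System.out.println(ring.cached()); // prints 3
    }
}
```

Pre‑generating IDs moves the cost of generation off the request path, at the price of IDs whose embedded timestamps may no longer match the moment they are handed out.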

3.4 Comparison of Open‑Source Components

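Baidu UidGenerator is Java‑based and supports only the Snowflake algorithm. As of this article's publication (May 2021), it had not been updated for about two years and was minimally maintained.

Meituan Leaf is also Java‑based, was last maintained in 2020, and supports both segment mode and Snowflake.

Given its more recent maintenance and its support for both modes, Meituan Leaf is the more robust choice of the two.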
