
Optimized Snowflake ID Generation with Seata: Solving MyBatis-Plus Duplicate Key Issues

This article analyzes why MyBatis-Plus generates duplicate primary keys in clustered Docker/K8S environments due to worker‑id and datacenter‑id collisions, introduces Seata's improved Snowflake algorithm that decouples from the OS clock, explains its implementation with Java code, and shows how to integrate it as a global ID generator to improve database performance and avoid page splits.

IT Services Circle

Dec 9, 2024

Last week a colleague hit duplicate primary‑key errors in a MyBatis‑Plus project running on a K8S cluster. The cause is that MyBatis‑Plus derives the Snowflake workerId and datacenterId from the JVM name and the MAC address, and both can collide in Docker environments.

MyBatis‑Plus initializes these IDs via the getMaxWorkerId() and getDatacenterId() methods of com.baomidou.mybatisplus.core.toolkit.Sequence:

protected long getMaxWorkerId(long datacenterId, long maxWorkerId) {
    StringBuilder mpid = new StringBuilder();
    mpid.append(datacenterId);
    String name = ManagementFactory.getRuntimeMXBean().getName();
    if (StringUtils.isNotBlank(name)) {
        mpid.append(name.split("@")[0]);
    }
    // keep the low 16 bits of the hash, then reduce into [0, maxWorkerId]
    return (long)(mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1L);
}

protected long getDatacenterId(long maxDatacenterId) {
    // ... omitted ...
    byte[] mac = network.getHardwareAddress();
    if (null != mac) {
        // combine the last two MAC bytes into 16 bits, then drop the low 6 bits
        id = ((0x000000FFL & (long) mac[mac.length - 2]) | (0x0000FF00L & (((long) mac[mac.length - 1]) << 8))) >> 6;
        id %= maxDatacenterId + 1L;
    }
    return id;
}

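Both values are weak sources of uniqueness in containers. The workerId hashes the datacenterId together with the part of the JVM name before "@" (typically the process ID, which is often identical across containers started from the same image), and the datacenterId uses only the last two bytes of the MAC address, reduced modulo maxDatacenterId + 1. Deploying the same image in multiple containers can therefore produce identical ID pairs, leading to primary‑key conflicts.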
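The collision can be reproduced without MyBatis‑Plus. The sketch below copies the arithmetic of the getMaxWorkerId() excerpt into a standalone method and takes the JVM name as a parameter so that two containers can be simulated. The class name, the datacenterId of 3, the maximum of 31, and the JVM names "1@pod-a" and "1@pod-b" are assumptions made for illustration; the "pid@hostname" shape of the JVM name is typical but implementation-specific.

```java
final class WorkerIdCollisionDemo {

    // Same arithmetic as the getMaxWorkerId() excerpt, with the JVM name
    // passed in instead of read from ManagementFactory.
    static long workerId(long datacenterId, String jvmName, long maxWorkerId) {
        StringBuilder mpid = new StringBuilder();
        mpid.append(datacenterId);
        if (jvmName != null && !jvmName.trim().isEmpty()) {
            // Only the part before '@' (typically the process ID) is used.
            mpid.append(jvmName.split("@")[0]);
        }
        return (mpid.toString().hashCode() & 0xffff) % (maxWorkerId + 1);
    }

    public static void main(String[] args) {
        // Hypothetical JVM names: same PID, different hostnames.
        long a = workerId(3L, "1@pod-a", 31L);
        long b = workerId(3L, "1@pod-b", 31L);
        System.out.println("pod-a workerId = " + a);
        System.out.println("pod-b workerId = " + b);
        System.out.println("collision = " + (a == b)); // collision = true
    }
}
```

Because the hostname after "@" is discarded, two containers with the same datacenterId and the same PID always end up with the same workerId in this sketch.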

Instead of fixing the MyBatis‑Plus configuration, the article recommends the optimized Snowflake algorithm provided by Seata, which removes the strong binding to the OS clock and can be used in the project as a drop‑in replacement.

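An ID generator for this purpose should produce IDs that are globally unique and monotonically increasing, and it should be fast; increasing keys also reduce MySQL InnoDB page splits. The classic Snowflake algorithm reads the system clock on every call and therefore suffers from clock‑rollback issues. Seata’s version instead stores a combined timestampAndSequence in an AtomicLong: the timestamp is read once at initialization, and from then on every increment of the sequence drives the value forward, carrying into the timestamp bits when the 12‑bit sequence overflows.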

/**
 * timestamp and sequence mix in one Long
 * highest 11 bit: not used
 * middle  41 bit: timestamp
 * lowest  12 bit: sequence
 */
private AtomicLong timestampAndSequence;
private final int sequenceBits = 12;

private void initTimestampAndSequence() {
    long timestamp = getNewestTimestamp();
    long timestampWithSequence = timestamp << sequenceBits;
    this.timestampAndSequence = new AtomicLong(timestampWithSequence);
}
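The carry from the sequence into the timestamp can be checked with plain arithmetic. The following is a minimal sketch, not Seata's class: the class name, the helper methods, and the starting timestamp of 1000 are assumptions chosen for illustration. It packs a timestamp above a 12‑bit sequence and increments the counter 4096 times.

```java
import java.util.concurrent.atomic.AtomicLong;

final class TimestampSequenceDemo {
    static final int SEQUENCE_BITS = 12;
    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    // Place the timestamp above the 12 sequence bits; the sequence starts at 0.
    static long pack(long timestamp) {
        return timestamp << SEQUENCE_BITS;
    }

    static long timestampOf(long packed) {
        return packed >>> SEQUENCE_BITS;
    }

    static long sequenceOf(long packed) {
        return packed & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        long start = 1_000L; // hypothetical starting timestamp
        AtomicLong counter = new AtomicLong(pack(start));
        // 4096 increments exhaust the 12-bit sequence exactly once.
        for (int i = 0; i < (1 << SEQUENCE_BITS); i++) {
            counter.incrementAndGet();
        }
        long packed = counter.get();
        System.out.println(timestampOf(packed)); // 1001
        System.out.println(sequenceOf(packed));  // 0
    }
}
```

The timestamp portion advances because of the overflow alone; the system clock is never consulted after initialization.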

The worker (node) ID is taken from the lowest 10 bits of the MAC address (the low 2 bits of the fifth byte and all 8 bits of the sixth), so it always falls in the range 0‑1023 and at most 1024 distinct nodes can be told apart.

private long generateWorkerIdBaseOnMac() throws Exception {
    Enumeration<NetworkInterface> all = NetworkInterface.getNetworkInterfaces();
    while (all.hasMoreElements()) {
        NetworkInterface networkInterface = all.nextElement();
        if (networkInterface.isLoopback() || networkInterface.isVirtual()) continue;
        byte[] mac = networkInterface.getHardwareAddress();
        // getHardwareAddress() returns null when no address is available; skip such interfaces
        if (mac == null) continue;
        // low 2 bits of the 5th byte + all 8 bits of the 6th byte = 10-bit worker ID
        return ((mac[4] & 0B11) << 8) | (mac[5] & 0xFF);
    }
    throw new RuntimeException("no available mac found");
}
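To see what the bit extraction does to concrete addresses, the same expression can be applied to hard-coded byte arrays. This is a sketch under stated assumptions: the class name and the three MAC addresses are made up for illustration, and only the return expression is taken from the method above.

```java
final class MacWorkerIdDemo {

    // Same expression as in generateWorkerIdBaseOnMac(), applied to a given MAC.
    static long workerIdFromMac(byte[] mac) {
        if (mac == null || mac.length < 6) {
            throw new IllegalArgumentException("expected a 6-byte MAC address");
        }
        return ((mac[4] & 0B11) << 8) | (mac[5] & 0xFF);
    }

    public static void main(String[] args) {
        // Hypothetical addresses; macA and macB differ only in a bit that is ignored.
        byte[] macA = {0x02, 0x42, (byte) 0xAC, 0x11, 0x00, 0x02};
        byte[] macB = {0x02, 0x42, (byte) 0xAC, 0x11, 0x04, 0x02};
        byte[] macC = {0x02, 0x42, (byte) 0xAC, 0x11, 0x03, (byte) 0xFF};

        System.out.println(workerIdFromMac(macA)); // 2
        System.out.println(workerIdFromMac(macB)); // 2
        System.out.println(workerIdFromMac(macC)); // 1023
    }
}
```

Two different MAC addresses can map to the same worker ID because only 10 of the 48 bits are used, so uniqueness still depends on how addresses are assigned in the deployment.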

The nextId() method increments the combined value, keeps only its lower 53 bits (41 timestamp bits plus 12 sequence bits), and ORs the result with the pre‑shifted workerId to produce the final 64‑bit ID.

public long nextId() {
    // obtain incremented timestamp and sequence
    long next = timestampAndSequence.incrementAndGet();
    // keep low 53 bits
    long timestampWithSequence = next & timestampAndSequenceMask;
    // combine with workerId
    return workerId | timestampWithSequence;
}
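Putting the pieces together, a self-contained generator along these lines might look as follows. This is a simplified sketch, not Seata's actual class: the class name SketchIdWorker, the constructor parameters, and the example values are assumptions, and any safeguards a production implementation may include are left out.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SketchIdWorker {
    private static final int WORKER_ID_BITS = 10;
    private static final int TIMESTAMP_BITS = 41;
    private static final int SEQUENCE_BITS = 12;
    // Low 53 bits set: 41 timestamp bits + 12 sequence bits.
    private static final long TIMESTAMP_AND_SEQUENCE_MASK =
            ~(-1L << (TIMESTAMP_BITS + SEQUENCE_BITS));

    private final long workerId; // already shifted above the low 53 bits
    private final AtomicLong timestampAndSequence;

    // startTimestamp is a hypothetical parameter standing in for the
    // one-time clock read at initialization.
    SketchIdWorker(long workerId, long startTimestamp) {
        if (workerId < 0 || workerId >= (1L << WORKER_ID_BITS)) {
            throw new IllegalArgumentException("workerId must be in [0, 1023]");
        }
        this.workerId = workerId << (TIMESTAMP_BITS + SEQUENCE_BITS);
        this.timestampAndSequence = new AtomicLong(startTimestamp << SEQUENCE_BITS);
    }

    long nextId() {
        long next = timestampAndSequence.incrementAndGet();
        return workerId | (next & TIMESTAMP_AND_SEQUENCE_MASK);
    }

    public static void main(String[] args) {
        SketchIdWorker worker = new SketchIdWorker(5L, 1_000L);
        long first = worker.nextId();
        long second = worker.nextId();
        System.out.println(second - first); // 1
        System.out.println(first >>> 53);   // 5
    }
}
```

Consecutive IDs from one worker differ by exactly 1, and the worker ID can be recovered by shifting the ID right by 53 bits.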

The Seata variant does not guarantee global monotonicity, because different nodes may produce IDs out of chronological order. It does ensure that each node’s IDs are strictly increasing, which limits B+‑tree page splits after an initial stabilization period.

The article explains B+‑tree page splitting in MySQL and why sequential primary keys (e.g., auto_increment) minimize splits. The Seata algorithm’s per‑node monotonic sequences eventually reach a stable state where further IDs only grow at the tail of each sub‑sequence, reducing split frequency.
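The "tail of each sub‑sequence" behaviour can be illustrated with a sorted set standing in for the primary‑key order. This is only a rough model, not a B+ tree and not part of Seata: the class name, the two worker IDs, and the starting counters are assumptions. Two workers insert alternately, and the sketch counts how many inserts land after every existing key versus how many land at the end of their own worker's key range.

```java
import java.util.TreeSet;

final class PerNodeOrderingDemo {
    static final int WORKER_SHIFT = 53;

    static long id(long workerId, long timestampAndSequence) {
        return (workerId << WORKER_SHIFT) | timestampAndSequence;
    }

    // Returns {inserts at the global tail, inserts at the tail of the worker's own range}.
    static int[] simulate(int rounds) {
        TreeSet<Long> index = new TreeSet<>(); // stand-in for sorted primary keys
        // Hypothetical starting counters for worker 1 and worker 2.
        long[] counters = {1_000L << 12, 2_000L << 12};
        int globalTail = 0;
        int ownRangeTail = 0;
        for (int i = 0; i < rounds; i++) {
            for (int w = 0; w < 2; w++) {
                long key = id(w + 1, ++counters[w]);
                if (index.isEmpty() || key > index.last()) {
                    globalTail++;
                }
                // The next larger key, if any, must belong to a different worker.
                Long next = index.higher(key);
                if (next == null || (next >>> WORKER_SHIFT) != (key >>> WORKER_SHIFT)) {
                    ownRangeTail++;
                }
                index.add(key);
            }
        }
        return new int[] {globalTail, ownRangeTail};
    }

    public static void main(String[] args) {
        int[] result = simulate(5);
        System.out.println("global tail inserts: " + result[0] + " of 10");    // 6 of 10
        System.out.println("own-range tail inserts: " + result[1] + " of 10"); // 10 of 10
    }
}
```

Because the worker ID occupies the high bits, each worker's keys form one contiguous range, and every new key extends the end of that range even when it is not the largest key overall.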

Finally, the author integrates the Seata algorithm into the DailyMart project via a custom Spring Boot starter. The utility method IdUtils.nextId() is used throughout, and MyBatis‑Plus’s default identifier generator is replaced with a custom implementation:

public class CustomIdGenerator implements IdentifierGenerator {
    @Override
    public Number nextId(Object entity) {
        return IdUtils.nextId();
    }
}

@Bean
public IdentifierGenerator identifierGenerator() {
    return new CustomIdGenerator();
}

The article cautions that this ID scheme is best suited for tables with long‑term data; frequent deletions may trigger page merges that interfere with the algorithm’s convergence.
