Understanding ULID: A Universally Unique Lexicographically Sortable Identifier and Its Python Usage
This article explains what ULID (Universally Unique Lexicographically Sortable Identifier) is, compares it with UUID, details its features, binary layout, and encoding, and provides practical Python examples for generating and manipulating ULIDs in distributed systems.
ULID (Universally Unique Lexicographically Sortable Identifier) is an alternative to UUID that combines timestamp-based ordering with high-entropy randomness in a 128-bit value whose string form is URL-safe and lexicographically sortable.
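Why not choose UUID? UUID has five versions: versions 1 and 2 depend on a unique, stable MAC address, which is impractical in many environments; versions 3 and 5 are hash-based, need a unique seed, and produce randomly distributed IDs that can fragment indexes and other data structures; version 4 is purely random and carries no ordering information. No random scheme can rule out collisions entirely, UUID-4 included. ULID combines a millisecond-precision timestamp with 80 bits of randomness (about 1.21e+24 possible values per millisecond), which makes collisions extremely unlikely while also producing a shorter, more readable, sortable string.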
ULID Features
128‑bit compatibility with UUID
1.21e+24 unique IDs per millisecond
Lexicographic (alphabetical) sorting
Encoded as 26 characters (vs. 36 for UUID)
Uses Crockford's Base32 for efficiency and readability
Case‑insensitive and URL‑safe (no special characters)
Optional monotonic ordering within the same millisecond (when the generator detects and handles same-millisecond calls)
ULID Specification
The identifier consists of a 48-bit timestamp (milliseconds since the Unix epoch) followed by an 80-bit random component. The timestamp provides ordering and does not overflow until the year 10889; the random part makes collisions between IDs created in the same millisecond extremely unlikely.
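The sizes quoted above can be checked with a little arithmetic. The following is only an illustrative sketch: the constant names are my own, and the year estimate assumes an average Gregorian year of 365.2425 days.

```python
# Sanity-check the sizes quoted for the two ULID components.
TIMESTAMP_BITS = 48
RANDOMNESS_BITS = 80

# Number of distinct random values available within a single millisecond.
random_values = 2 ** RANDOMNESS_BITS

# Largest value the 48-bit field can hold, in milliseconds since the Unix epoch.
max_timestamp_ms = 2 ** TIMESTAMP_BITS - 1

# Approximate calendar year of that timestamp (average Gregorian year assumed).
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60
max_year = int(1970 + (max_timestamp_ms / 1000) / SECONDS_PER_YEAR)

print(f"{random_values:.2e}")  # random values per millisecond
print(max_year)                # last year representable by the timestamp
```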
Composition
Timestamp
48‑bit integer representing Unix time in milliseconds
Randomness
80‑bit random number, preferably generated with cryptographic quality
Sorting
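Characters are ordered left-to-right (lexicographically) using the default ASCII character set. By default, sort order among IDs created within the same millisecond is not guaranteed; it is only guaranteed when the generator runs in a monotonic mode.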
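One common way to get a stable order inside a single millisecond is to reuse the previous random component and add one to it. The sketch below illustrates that idea with the standard library only. The class name `MonotonicULIDSource` and its method are hypothetical, not part of any library, and the sketch assumes the clock readings passed to it never go backwards.

```python
import os
import time


class MonotonicULIDSource:
    """Hypothetical helper producing 128-bit ULID values as integers.

    Values are strictly increasing as long as the clock never goes backwards.
    """

    def __init__(self):
        self._last_ms = -1
        self._last_random = 0

    def next_value(self, now_ms=None):
        if now_ms is None:
            now_ms = time.time_ns() // 1_000_000
        if now_ms == self._last_ms:
            # Same millisecond as the previous call: increment the random part.
            self._last_random += 1
            if self._last_random >= 1 << 80:
                raise OverflowError("random component exhausted for this millisecond")
        else:
            # New millisecond: draw 80 fresh random bits.
            self._last_ms = now_ms
            self._last_random = int.from_bytes(os.urandom(10), "big")
        # Timestamp occupies the high 48 bits, randomness the low 80 bits.
        return (now_ms << 80) | self._last_random


source = MonotonicULIDSource()
first = source.next_value(1_000)
second = source.next_value(1_000)  # same millisecond, still sorts after `first`
```

Incrementing trades a little unpredictability for ordering, so this approach is a design choice rather than a requirement.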
Encoding
Uses Crockford's Base32 alphabet (0123456789ABCDEFGHJKMNPQRSTVWXYZ), which omits I, L, O, and U to avoid confusion.
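To make the encoding concrete, here is a minimal standard-library sketch that converts a 128-bit integer to and from the 26-character form. The function names `encode_ulid` and `decode_ulid` are my own and are not the API of any ULID library.

```python
CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(value: int) -> str:
    """Encode a 128-bit integer as 26 Crockford Base32 characters."""
    if not 0 <= value < 1 << 128:
        raise ValueError("value must fit in 128 bits")
    chars = []
    for _ in range(26):
        # Take the lowest 5 bits, then shift them away.
        chars.append(CROCKFORD_ALPHABET[value & 0x1F])
        value >>= 5
    # Characters were collected least-significant first.
    return "".join(reversed(chars))


def decode_ulid(text: str) -> int:
    """Decode a 26-character ULID string (case-insensitive) to an integer."""
    if len(text) != 26:
        raise ValueError("a ULID string has exactly 26 characters")
    value = 0
    for ch in text.upper():
        # str.index raises ValueError for characters outside the alphabet.
        value = (value << 5) | CROCKFORD_ALPHABET.index(ch)
    if value >= 1 << 128:
        raise ValueError("decoded value exceeds 128 bits")
    return value
```

Because 26 characters carry 130 bits, the first character of a valid ULID can never be greater than 7.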
Binary Layout
The 128 bits are encoded as sixteen octets in network byte order (big‑endian).
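As a rough illustration of this layout, the sketch below assembles the sixteen octets by hand using only the standard library: six big-endian timestamp bytes followed by ten random bytes. The function name `new_ulid_bytes` is hypothetical. Because the result is 128 bits wide, it can also be wrapped in a `uuid.UUID`.

```python
import os
import time
import uuid


def new_ulid_bytes(timestamp_ms=None) -> bytes:
    """Build 16 ULID octets: 6 timestamp bytes (big-endian) + 10 random bytes."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    # to_bytes raises OverflowError if the timestamp does not fit in 48 bits.
    return timestamp_ms.to_bytes(6, "big") + os.urandom(10)


raw = new_ulid_bytes(1_500_000_000_000)

# Recover the timestamp from the first six octets.
recovered_ms = int.from_bytes(raw[:6], "big")

# The same 16 bytes are a valid payload for a UUID object.
as_uuid = uuid.UUID(bytes=raw)
```

Big-endian order is what makes a plain byte-wise comparison agree with timestamp order.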
Application Scenarios
Replace auto‑increment primary keys in databases, removing the need for DB‑side key generation.
In distributed environments, substitute UUIDs with globally unique, millisecond‑ordered IDs.
Use the embedded timestamp for time‑based sharding or partitioning.
Sort records by ULID instead of a separate created_at column when millisecond precision is sufficient.
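For the sharding and sorting scenarios above, the timestamp can be read straight from the first ten characters of the string form. The sketch below is one possible approach using only the standard library. The function names and the `YYYY_MM` partition naming are assumptions made for illustration, and `datetime` only supports years up to 9999.

```python
from datetime import datetime, timezone

CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid_text: str) -> int:
    """Return the millisecond timestamp stored in the first 10 characters."""
    value = 0
    for ch in ulid_text[:10].upper():
        value = (value << 5) | CROCKFORD_ALPHABET.index(ch)
    return value


def monthly_partition(ulid_text: str) -> str:
    """Map a ULID to a hypothetical monthly partition name such as '2017_07'."""
    created = datetime.fromtimestamp(ulid_timestamp_ms(ulid_text) / 1000, tz=timezone.utc)
    return created.strftime("%Y_%m")


# Sorting the strings also sorts the records by creation time.
ids = ["0000000002" + "0" * 16, "0000000001" + "Z" * 16]
ordered = sorted(ids)
```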
Python Usage
Install the library:
pip install ulid-py
Create a new ULID:
import ulid
ulid.new()  # → ULID('01BJQE4QTHMFP0S5J153XCFSP9')
Generate a ULID from an existing UUID:
import ulid, uuid
value = uuid.uuid4()
ulid.from_uuid(value)  # → ULID('09GF8A5ZRN9P1RYDVXV52VBAHS')
Create a ULID from a specific timestamp:
import datetime, ulid
ulid.from_timestamp(datetime.datetime(1999, 1, 1))  # → ULID('00TM9HX0008S220A3PWSFVNFEH')
Create a ULID from custom randomness bytes:
import os, ulid
randomness = os.urandom(10)
ulid.from_randomness(randomness)  # → ULID('01BJQHX2XEDK0VN0GMYWT9JN8S')
ULID objects provide methods for reading each component:
u = ulid.new()
u.timestamp() # returns the 48‑bit timestamp component
u.randomness() # returns the 80‑bit randomness component
Source code is available at github.com/ahawker/ulid.