Fundamentals 17 min read

Why LevelDB Uses SkipLists: Memtable Mechanics and a Complete SkipListMap Implementation

This article explains the role and data structure of Memtable in LevelDB, introduces the probabilistic SkipList design, analyzes its time and space complexities, and provides a full Java implementation of a SkipListMap with code examples and a brief look at LevelDB's own SkipList implementation.

Trip Tech Team
Trip Tech Team
Trip Tech Team
Why LevelDB Uses SkipLists: Memtable Mechanics and a Complete SkipListMap Implementation

1: Role and Data Structure of Memtable in LevelDB

LevelDB writes first to a WAL and then to an in‑memory Memtable; when the Memtable reaches its default size (4 MiB) it becomes immutable and is compacted into an SSTable on disk. The Memtable therefore serves as a fast, ordered, in‑memory table that buffers writes before they are persisted, also accelerating minor compactions.

To support fast queries and ordered iteration, the Memtable must have low lookup complexity and maintain order. While balanced trees (e.g., red‑black trees, B‑trees) satisfy these requirements, LevelDB’s authors chose a SkipList instead.

2: Principle of SkipList

SkipList was proposed by William Pugh in 1989 as a probabilistic alternative to balanced trees. The original abstract states:

Skip lists are a data structure that can be used in place of balanced trees. Skip lists use probabilistic balancing rather than strictly enforced balancing and as a result the algorithms for insertion and deletion in skip lists are much simpler and significantly faster than equivalent algorithms for balanced trees.

In other words, a SkipList replaces balanced trees with a structure that achieves similar logarithmic performance using random‑level pointers, making insertion and deletion simpler and faster.

The basic idea starts from an ordered linked list (O(n) search). By adding a level‑1 forward pointer that skips one node, the expected number of hops halves. Adding higher‑level pointers that skip increasingly many nodes yields a hierarchy of at most log₂ n levels, allowing searches in O(log n) time with only O(log n) horizontal steps.

2.1 Simple Time‑Complexity Analysis

If each index level skips K nodes, the height of the tower is logₖ n and each level traverses at most K nodes, giving a worst‑case search cost of K·logₖ n, which is O(log n).

2.2 Space‑Complexity Analysis

With n elements and a skip factor K, level i contains n/Kⁱ nodes. Summing the geometric series yields a total node count of n·(1 / (K‑1))·(1 ‑ 1/K^h) ≈ n·K/(K‑1) for large n, i.e., linear space overhead.

3: Implementing a SkipListMap

1. Define Node and Attributes

<code>/**
 * Node
 * @param <K>
 * @param <V>
 */
private static final class Node<K, V> implements Map.Entry<K, V> {
    final K key;
    V value;
    /** current node level */
    int level;
    /** forward pointers for each level */
    Node<K, V>[] forwards;
    @SuppressWarnings("unchecked")
    Node(K key, V value, int level) {
        this.key = key;
        this.value = value;
        this.level = level;
        this.forwards = new Node[MAX_LEVEL];
    }
    // other methods omitted
}
</code>

2. randomLevel() Method

<code>/**
 * Obtain a random level for a new node.
 * @return the level of the new node
 */
private int randomLevel() {
    int newLevel = 1;
    while (newLevel < MAX_LEVEL && ThreadLocalRandom.current().nextInt() % BRANCH == 0) {
        newLevel++;
    }
    if (newLevel > this.currentMaxLevel) {
        this.currentMaxLevel = newLevel;
    }
    return newLevel;
}
</code>

3. Search Method

<code>@Override
public V get(Object key) {
    Objects.requireNonNull(key);
    Node<K, V> c = dummyHead;
    for (int i = currentMaxLevel - 1; i >= 0; i--) {
        while (c.forwards[i] != null) {
            @SuppressWarnings("unchecked")
            int cmp = c.forwards[i].key.compareTo((K) key);
            if (cmp < 0) {
                c = c.forwards[i];
            } else if (cmp == 0) {
                return c.forwards[i].value;
            } else {
                break;
            }
        }
    }
    return null;
}
</code>

4. Insert Method

<code>@Override
public V put(K key, V value) {
    Objects.requireNonNull(key);
    int newNodeLevel = randomLevel();
    Node<K, V> c = dummyHead;
    Node<K, V>[] backwards = new Node[newNodeLevel];
    for (int i = currentMaxLevel - 1; i >= 0; i--) {
        while (c.forwards[i] != null) {
            int cmp = c.forwards[i].key.compareTo(key);
            if (cmp < 0) {
                c = c.forwards[i];
            } else if (cmp == 0) {
                V old = c.forwards[i].value;
                c.forwards[i].value = value;
                return old;
            } else {
                break;
            }
        }
        if (i <= newNodeLevel - 1) {
            backwards[i] = c;
        }
    }
    Node<K, V> newNode = new Node<>(key, value, newNodeLevel);
    for (int i = 0; i < newNodeLevel; i++) {
        newNode.forwards[i] = backwards[i].forwards[i];
        backwards[i].forwards[i] = newNode;
    }
    size++;
    return null;
}
</code>

5. Ordered Traversal / Range Query

Because a SkipList is a multi‑level linked list, iterating from the first node yields an ordered traversal. The implementation provides an iterator that walks the list level‑0 forward pointers, and LevelDB’s version adds a Seek method for fast positioning to the first key ≥ a target.

<code>public Iterator<Map.Entry<K, V>> iterator() {
    return new SkipListMapIterator<>(0, this.dummyHead);
}

private static final class SkipListMapIterator<K, V> implements Iterator<Map.Entry<K, V>> {
    private final int level;
    private Node<K, V> c;
    public SkipListMapIterator(int level, Node<K, V> dummyHead) {
        this.level = level;
        this.c = dummyHead;
    }
    @Override
    public boolean hasNext() {
        return this.c.forwards[level] != null;
    }
    @Override
    public Entry<K, V> next() {
        Node<K, V> next = this.c.forwards[level];
        this.c = next;
        return next;
    }
}
</code>

4: SkipList Implementation in LevelDB

LevelDB’s SkipList lives in a single header file (skiplist.h) and mirrors the Java version shown above. The iterator includes methods such as Valid(), key(), Next(), Prev(), Seek(target), SeekToFirst(), and SeekToLast(). The core Seek implementation finds the first node whose key is greater than or equal to the target and positions the iterator there.

<code>// Iteration over the contents of a skip list
class Iterator {
public:
    explicit Iterator(const SkipList* list);
    bool Valid() const;
    const Key& key() const; // REQUIRES: Valid()
    void Next();            // REQUIRES: Valid()
    void Prev();            // REQUIRES: Valid()
    void Seek(const Key& target);
    void SeekToFirst();
    void SeekToLast();
private:
    const SkipList* list_;
    Node* node_;
};

template <typename Key, class Comparator>
inline void SkipList<Key, Comparator>::Iterator::Seek(const Key& target) {
    node_ = list_->FindGreaterOrEqual(target, nullptr);
}

template <typename Key, class Comparator>
typename SkipList<Key, Comparator>::Node*
SkipList<Key, Comparator>::FindGreaterOrEqual(const Key& key, Node** prev) const {
    Node* x = head_;
    int level = GetMaxHeight() - 1;
    while (true) {
        Node* next = x->Next(level);
        if (KeyIsAfterNode(key, next)) {
            x = next;
        } else {
            if (prev != nullptr) prev[level] = x;
            if (level == 0) return next;
            else level--;
        }
    }
}
</code>

These methods enable fast random access and range scans, which are heavily used in LevelDB’s read path.

Memtable diagram
Memtable diagram
SkipList structure
SkipList structure
JavaalgorithmData StructureLevelDBSkipListMemtable
Trip Tech Team
Written by

Trip Tech Team

Trip.com technology team, focused on internationalization and localization solutions.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.