
How JDK 1.8 Optimizes HashMap Rehashing: Inside the Resize Algorithm

This article explains the JDK 1.8 HashMap resize optimization, showing how the new algorithm splits each bucket in place instead of re‑hashing every entry, and walks through the full resize() source for single nodes, linked lists, and red‑black trees.

Sanyou's Java Diary

We discuss a common interview question: how JDK 1.8 optimizes the HashMap rehash (resize) algorithm.

HashMap is backed by an array; once the number of entries exceeds the resize threshold, that array must grow. In JDK 1.7 the array size was doubled and every key was re‑hashed and re‑inserted into the new array. JDK 1.8 keeps the doubling but redistributes the entries without recomputing an index from scratch for each key.
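Before looking at the source, here is a minimal sketch (my own code, not JDK's) of the property the optimization relies on: with a power‑of‑two capacity, the index is hash & (capacity - 1), so doubling the table adds exactly one bit to the mask, and an entry either stays at its old index or moves by exactly oldCap.

```java
// Sketch of the power-of-two index identity behind the JDK 1.8 resize.
// Class and method names are mine, not java.util.HashMap's.
public class ResizeSketch {
    // Index computation for a power-of-two table capacity.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16, newCap = oldCap << 1; // capacity doubles: 16 -> 32

        // hash 5 = 0b0_0101: the oldCap bit is 0, so the index is unchanged.
        System.out.println(indexFor(5, oldCap) + " -> " + indexFor(5, newCap));
        // prints "5 -> 5"

        // hash 21 = 0b1_0101: the oldCap bit is 1, so it moves by oldCap.
        System.out.println(indexFor(21, oldCap) + " -> " + indexFor(21, newCap));
        // prints "5 -> 21" (5 + 16)
    }
}
```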

The core of the optimization is the resize() method, which calculates the new capacity, creates a new array, and then redistributes the existing nodes:

<code>final Node&lt;K,V&gt;[] resize() {
    Node&lt;K,V&gt;[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // the capacity is doubled here: newCap = oldCap << 1
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    } else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY) ? (int)ft : Integer.MAX_VALUE;
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node&lt;K,V&gt;[] newTab = (Node&lt;K,V&gt;[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node&lt;K,V&gt; e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode&lt;K,V&gt;)e).split(this, newTab, j, oldCap);
                else {
                    Node&lt;K,V&gt; loHead = null, loTail = null;
                    Node&lt;K,V&gt; hiHead = null, hiTail = null;
                    Node&lt;K,V&gt; next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null) loHead = e; else loTail.next = e;
                            loTail = e;
                        } else {
                            if (hiTail == null) hiHead = e; else hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}
</code>

The algorithm works as follows:

Single node: The node is stored directly at its new slot, newTab[e.hash & (newCap - 1)]; the cached hash is reused, not recomputed.

Linked list: The list is split into two chains with a single bit test. Nodes with (e.hash & oldCap) == 0 stay at the original index j; the rest move to index j + oldCap.

Because each chain is built by appending at the tail (via loTail/hiTail), the nodes keep their original relative order, whereas the JDK 1.7 head‑insertion rehash reversed it.
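The lo/hi split above can be sketched on a plain singly linked list; the class and helper names below are mine, not JDK source:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the lo/hi bucket split that resize() performs on a chained bucket.
public class BucketSplit {
    static final class Node {
        final int hash; Node next;
        Node(int hash) { this.hash = hash; }
    }

    // Returns {loHead, hiHead}: the two chains after splitting 'head'.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        for (Node e = head, next; e != null; e = next) {
            next = e.next;
            e.next = null;
            if ((e.hash & oldCap) == 0) {           // stays at index j
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {                                // moves to index j + oldCap
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
        }
        return new Node[] { loHead, hiHead };
    }

    // Collects the hashes in a chain, preserving order.
    static List<Integer> hashes(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.next) out.add(e.hash);
        return out;
    }

    public static void main(String[] args) {
        // Hashes 3, 19, 35 all land in bucket 3 of a 16-slot table (h & 15 == 3).
        Node head = new Node(3);
        head.next = new Node(19);
        head.next.next = new Node(35);
        Node[] parts = split(head, 16);
        System.out.println(hashes(parts[0])); // prints "[3, 35]"  -> stays at index 3
        System.out.println(hashes(parts[1])); // prints "[19]"     -> moves to index 3 + 16
    }
}
```

Note that relative order is preserved within each chain: 3 and 35 both have the oldCap bit clear and stay together at the old index.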

The same principle applies to red‑black trees: TreeNode.split divides the tree into two halves attached at index j and j + oldCap, and a half that ends up with UNTREEIFY_THRESHOLD (6) nodes or fewer is converted back into a plain linked list.
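The outcome of a tree split can be sketched as a simple count; this is my own simplified model of the decision TreeNode.split makes, not JDK source:

```java
// Simplified sketch of the shape decision in HashMap.TreeNode.split:
// count the nodes going to each half; a half with at most
// UNTREEIFY_THRESHOLD nodes is turned back into a plain linked list.
public class TreeSplitSketch {
    static final int UNTREEIFY_THRESHOLD = 6; // value used by java.util.HashMap

    // Given the hashes stored in one tree bin, report how the two halves
    // are represented after a resize from oldCap to 2 * oldCap.
    static String[] splitShapes(int[] hashes, int oldCap) {
        int lc = 0, hc = 0;
        for (int h : hashes) {
            if ((h & oldCap) == 0) lc++; else hc++;
        }
        return new String[] {
            lc <= UNTREEIFY_THRESHOLD ? "list" : "tree",  // half kept at index j
            hc <= UNTREEIFY_THRESHOLD ? "list" : "tree"   // half moved to j + oldCap
        };
    }

    public static void main(String[] args) {
        // A 10-node tree bin: 7 hashes with the oldCap bit clear, 3 with it set.
        int[] hashes = {0, 1, 2, 3, 4, 5, 6, 16, 17, 18};
        String[] shapes = splitShapes(hashes, 16);
        System.out.println(shapes[0] + " at j, " + shapes[1] + " at j + oldCap");
        // prints "tree at j, list at j + oldCap"
    }
}
```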

By deciding each entry's destination with one bit test instead of recomputing an index for every key, and by moving whole sub‑chains to exactly two possible slots, JDK 1.8 makes HashMap expansion noticeably cheaper than the JDK 1.7 full rehash.
