Fundamentals 17 min read

Understanding the Underlying Data Structure and Mechanics of Java HashMap

This article explains Java HashMap's internal structure, including the array‑linked list and red‑black tree design, load factor, collision resolution methods, resizing process, key selection, thread‑safety concerns, and the hash calculation algorithm, providing detailed insights for developers.

Java Captain
Java Captain
Java Captain
Understanding the Underlying Data Structure and Mechanics of Java HashMap

HashMap's underlying data structure differs between JDK 1.7 (array + linked list) and JDK 1.8 (array + linked list + red‑black tree), where long chains are converted to trees when the chain length exceeds eight and the table size is greater than 64, improving search complexity from O(n) to O(log n).

Key characteristics of HashMap include unordered storage, allowance of a single null key and multiple null values, unique keys, and a threshold‑based treeification rule (chain length > 8 and table length > 64). The default load factor is 0.75, balancing space and time efficiency.

Hash collisions are resolved using chaining (linked lists), while JDK 1.8 may transform a chain into a red‑black tree under the aforementioned conditions; other collision‑resolution techniques such as open addressing, double hashing, and overflow areas are also described.

The load factor of 0.75 is chosen because it provides a good trade‑off: a smaller factor wastes memory, while a larger factor increases the likelihood of long chains and thus reduces lookup performance.

HashMap computes the hash index by invoking hashCode() , performing an unsigned right‑shift of 16 bits, XOR‑ing the result with the original hash, and finally applying a bitwise AND with (length‑1) . Alternative hash algorithms like the mid‑square method, modulo, and pseudo‑random generators are mentioned.

When two objects have identical hashCode() values, a collision occurs; HashMap first checks equals() to decide whether to replace the existing entry or to append the new entry to the chain (or tree if the threshold is met).

The put operation follows these steps: compute hash, locate bucket, initialize table if empty, handle direct insertion, replace value if key exists, manage chain growth, convert to tree when appropriate, and finally link new nodes in a tree if already treeified.

Resizing doubles the table size once the number of entries exceeds threshold = capacity × loadFactor . The source code for the resize() method is shown below, illustrating how old entries are redistributed between low‑order and high‑order buckets based on the new capacity's additional bit.

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        } else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1;
    } else if (oldThr > 0)
        newCap = oldThr;
    else {
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else {
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null) loHead = e; else loTail.next = e;
                            loTail = e;
                        } else {
                            if (hiTail == null) hiHead = e; else hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) { loTail.next = null; newTab[j] = loHead; }
                    if (hiTail != null) { hiTail.next = null; newTab[j + oldCap] = hiHead; }
                }
            }
        }
    }
    return newTab;
}

Typical keys are immutable objects such as Integer or String , because their hash codes are stable and their equals() implementations are reliable.

The treeify threshold of eight nodes is based on empirical analysis: tree nodes consume roughly twice the memory of list nodes, and the probability of a bucket reaching length eight under a good hash distribution follows a Poisson distribution with an extremely low chance, making eight an optimal trade‑off between memory use and lookup speed.

HashMap is not thread‑safe; concurrent resizing can create circular linked lists, simultaneous put operations may overwrite entries, and interleaved put / get during rehashing can return null.

Finally, the hash function XORs the high 16 bits with the low 16 bits to ensure that both parts of the original hash contribute to the final index, reducing collisions when the table size is a power of two.

JavaJDKData StructureHashMapcollision resolutionred-black treeload factor
Java Captain
Written by

Java Captain

Focused on Java technologies: SSM, the Spring ecosystem, microservices, MySQL, MyCat, clustering, distributed systems, middleware, Linux, networking, multithreading; occasionally covers DevOps tools like Jenkins, Nexus, Docker, ELK; shares practical tech insights and is dedicated to full‑stack Java development.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.