
Why Is fastutil Up to 10× Faster Than Java’s Standard Collections?

fastutil can substantially outperform Java’s standard collections on primitive data. It eliminates boxing, cuts memory usage and GC pressure, offers primitive‑specific maps, sets, and lists, provides 64‑bit indexed big arrays for massive data, and ships faster I/O utilities, all while remaining compatible with the java.util collection interfaces.

Java Backend Technology

Many developers assume that collections such as HashMap leave no room for optimization because they ship with the standard library. fastutil, a library created by Sebastiano Vigna, a professor at the University of Milan, shows otherwise: for primitive‑heavy workloads the performance gap can reach roughly tenfold.

Why the standard library is slower

Java generics do not support primitive types. When an int, long or double is stored in a HashMap, the compiler inserts an autoboxing conversion that wraps the value in an Integer, Long or Double object. This introduces two problems:

Extra memory overhead: a raw int occupies 4 bytes, but an Integer object, including its header, consumes about 16 bytes on a typical 64‑bit JVM. Storing 100 million integers therefore needs roughly 1.6 GB for the wrapper objects alone with the standard library, while fastutil needs only about 400 MB.

GC pressure: the large number of short‑lived wrapper objects triggers frequent garbage collection, which can cause pauses in latency‑sensitive systems.
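The arithmetic behind those figures can be sketched as follows. This is only a back-of-the-envelope estimate: the 16-byte Integer size and 4-byte reference size are assumptions for a 64-bit HotSpot JVM with compressed references, and the class and method names are hypothetical.

```java
// Rough memory estimate for storing `count` int values.
// Object sizes below are assumptions (64-bit HotSpot, compressed references),
// not guarantees made by the Java specification.
final class BoxingFootprint {
    static final long INT_BYTES = Integer.BYTES;    // 4 bytes per primitive int
    static final long ASSUMED_INTEGER_BYTES = 16;   // assumed: 12-byte header + 4-byte value
    static final long ASSUMED_REFERENCE_BYTES = 4;  // assumed: one compressed reference

    /** Bytes needed when values live directly in an int[]. */
    static long primitiveBytes(long count) {
        return count * INT_BYTES;
    }

    /** Bytes taken by the Integer wrapper objects alone. */
    static long boxedObjectBytes(long count) {
        return count * ASSUMED_INTEGER_BYTES;
    }

    /** Wrapper objects plus the references that point at them. */
    static long boxedTotalBytes(long count) {
        return count * (ASSUMED_INTEGER_BYTES + ASSUMED_REFERENCE_BYTES);
    }

    public static void main(String[] args) {
        long n = 100_000_000L;
        System.out.println("primitive ints : " + primitiveBytes(n) / 1_000_000 + " MB");
        System.out.println("Integer objects: " + boxedObjectBytes(n) / 1_000_000 + " MB");
        System.out.println("with references: " + boxedTotalBytes(n) / 1_000_000 + " MB");
    }
}
```

Under these assumptions the references that a collection holds to each wrapper add further overhead on top of the objects themselves, and a HashMap also allocates an entry object per mapping.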

fastutil’s solution

The core idea is to implement a separate collection class for each primitive type, completely avoiding boxing. Examples include Int2IntOpenHashMap for int→int maps, LongOpenHashSet for long sets, and DoubleArrayList for double lists. Class names follow the pattern type + data structure (Key2Value + data structure for maps), which makes the right class and its documentation easy to locate.

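import it.unimi.dsi.fastutil.ints.Int2IntMap;
import it.unimi.dsi.fastutil.ints.Int2IntOpenHashMap;
import java.util.HashMap;
import java.util.Map;

// Standard library: keys and values are Integer objects
Map<Integer, Integer> standard = new HashMap<>();
standard.put(1, 100); // autoboxes both arguments

// fastutil: keys and values are stored as raw ints
Int2IntMap fast = new Int2IntOpenHashMap();
fast.put(1, 100); // calls put(int, int); no boxing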
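To make the "no boxing" idea concrete, here is a minimal sketch of a hash map that keeps keys and values in plain int arrays and resolves collisions by linear probing. It is an illustration only, not fastutil's implementation: the class name IntIntProbeMap, the hash mixing step, and the separate `used` array are assumptions chosen for brevity, and removal is omitted.

```java
// Minimal open-addressing int -> int map backed by primitive arrays.
// Illustrative sketch only; fastutil's real classes are more elaborate.
final class IntIntProbeMap {
    private int[] keys = new int[16];        // table length is always a power of two
    private int[] values = new int[16];
    private boolean[] used = new boolean[16];
    private int size;

    /** Scrambles the key so that nearby keys spread across the table. */
    private static int mix(int key) {
        int h = key * 0x9E3779B9;
        return h ^ (h >>> 16);
    }

    /** Returns the slot holding the key, or the free slot where it belongs. */
    private int slot(int key) {
        int mask = keys.length - 1;
        int i = mix(key) & mask;
        while (used[i] && keys[i] != key) {
            i = (i + 1) & mask;              // linear probing
        }
        return i;
    }

    void put(int key, int value) {
        if ((size + 1) * 4 > keys.length * 3) {
            grow();                          // keep the load factor at or below 0.75
        }
        int i = slot(key);
        if (!used[i]) {
            used[i] = true;
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    int getOrDefault(int key, int defaultValue) {
        int i = slot(key);
        return used[i] ? values[i] : defaultValue;
    }

    int size() {
        return size;
    }

    /** Doubles the table and reinserts every entry. */
    private void grow() {
        int[] oldKeys = keys;
        int[] oldValues = values;
        boolean[] oldUsed = used;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        used = new boolean[oldKeys.length * 2];
        size = 0;
        for (int i = 0; i < oldKeys.length; i++) {
            if (oldUsed[i]) {
                put(oldKeys[i], oldValues[i]);
            }
        }
    }

    public static void main(String[] args) {
        IntIntProbeMap map = new IntIntProbeMap();
        map.put(1, 100);
        System.out.println(map.getOrDefault(1, -1));
    }
}
```

Every key and value lives inside an int[] slot, so a put or lookup allocates no wrapper objects and touches contiguous memory, which is the property the fastutil classes above are built around.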

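fastutil’s type‑specific interfaces extend the corresponding java.util interfaces (Int2IntMap, for example, extends Map<Integer, Integer>), so a fastutil collection can be passed to existing code that expects a Map. Calls made through the generic Map methods still box, however, so the performance benefit comes from using the primitive‑typed methods such as put(int, int) and get(int).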

Big data structures

Java arrays are indexed by int, so a single array holds at most Integer.MAX_VALUE (≈2.1 billion) elements, which is insufficient for many large‑scale scenarios. Since version 6.0 fastutil provides the “Big” series of data structures, which use 64‑bit indices so that capacity is limited only by available memory. A big array is an “array of arrays”, and the static methods in BigArrays give it a 64‑bit access API that feels like working with a normal array.

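import it.unimi.dsi.fastutil.BigArrays;
import it.unimi.dsi.fastutil.longs.LongBigArrays;

// Address more than 2^31 elements; 3 billion longs need about 24 GB of heap
long size = 3_000_000_000L;
long[][] bigArray = LongBigArrays.newBigArray(size);
BigArrays.set(bigArray, 2_500_000_000L, 42L);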

Frameworks such as Elasticsearch and Lucene, which handle massive datasets, employ similar techniques under the hood.
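The “array of arrays” idea can be sketched in a few lines: the high bits of a 64-bit index select a segment and the low bits select a position inside it. The class below is a hypothetical illustration, not fastutil's code, and its 16-element segment size is an assumption chosen so the demo runs in very little memory; real implementations use far larger segments.

```java
// Sketch of a long-indexed array built from fixed-size segments.
// Names and the tiny segment size are assumptions made for illustration.
final class PagedLongArray {
    private static final int SEGMENT_SHIFT = 4;                   // 2^4 = 16 elements per segment
    private static final int SEGMENT_SIZE = 1 << SEGMENT_SHIFT;
    private static final int SEGMENT_MASK = SEGMENT_SIZE - 1;

    private final long[][] segments;
    private final long length;

    PagedLongArray(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        this.length = length;
        // Round up so that a partially filled last segment is included.
        int segmentCount = (int) ((length + SEGMENT_MASK) >>> SEGMENT_SHIFT);
        segments = new long[segmentCount][];
        for (int s = 0; s < segmentCount; s++) {
            long remaining = length - ((long) s << SEGMENT_SHIFT);
            segments[s] = new long[(int) Math.min(SEGMENT_SIZE, remaining)];
        }
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + length);
        }
    }

    long get(long index) {
        checkIndex(index);
        // High bits pick the segment, low bits pick the element within it.
        return segments[(int) (index >>> SEGMENT_SHIFT)][(int) (index & SEGMENT_MASK)];
    }

    void set(long index, long value) {
        checkIndex(index);
        segments[(int) (index >>> SEGMENT_SHIFT)][(int) (index & SEGMENT_MASK)] = value;
    }

    long length() {
        return length;
    }

    public static void main(String[] args) {
        PagedLongArray array = new PagedLongArray(100);
        array.set(99, 42L);
        System.out.println(array.get(99));
    }
}
```

Because the segment size is a power of two, splitting an index costs one shift and one mask, so element access stays close to the cost of a plain array lookup.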

I/O utilities

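fastutil also ships with binary and text I/O helpers in the it.unimi.dsi.fastutil.io package, such as FastBufferedInputStream, FastBufferedOutputStream, BinIO and TextIO. They are designed to be faster than the standard BufferedReader and DataInputStream classes, which makes them suitable for data‑intensive workloads.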

How to add fastutil

For Maven users, add the dependency:

<dependency>
    <groupId>it.unimi.dsi</groupId>
    <artifactId>fastutil</artifactId>
    <version>8.5.14</version>
</dependency>

Note that the full JAR is large because it contains implementations for every primitive‑type combination, generated from templates with a C preprocessor. To reduce size, you can either depend on fastutil-core, which includes only the int, long, and double types, or run the provided find-deps.sh script (which relies on the jdeps tool shipped with JDK 8 and later) to create a minimal JAR containing only the classes your code actually uses.

Project repository:

https://github.com/vigna/fastutil