Benchmarking Fork/Join Framework vs Parallel Streams vs ExecutorService in Java

This article benchmarks Java's Fork/Join framework, Parallel Streams, and ExecutorService across IO‑bound and CPU‑bound workloads on an 8‑core machine, analyzing how thread count and pool configuration affect performance and offering practical recommendations for concurrent Java applications.

Top Architect

The article compares three Java concurrency approaches—ExecutorService, the Fork/Join framework, and Java 8 Parallel Streams—by running two representative tasks (an IO‑bound 6 GB text indexing job and a CPU‑bound large‑integer primality test) under different thread counts (4, 8, 16, 32) on an 8‑core EC2 instance.
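To make the comparison concrete, here is a minimal sketch of one small job written against each of the three APIs. It is not the author's benchmark code: the class name `ThreeWays`, the helper `isPrime`, the `THRESHOLD` value, and the toy input range are all illustrative assumptions, and the workload is far smaller than the article's.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;

class ThreeWays {
    // Hypothetical unit of work standing in for the article's tasks.
    static boolean isPrime(long n) {
        return BigInteger.valueOf(n).isProbablePrime(30);
    }

    // 1) ExecutorService: split the input into one slice per thread by hand.
    static long withExecutorService(List<Long> nums, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (nums.size() + threads - 1) / threads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int start = 0; start < nums.size(); start += chunk) {
                List<Long> slice = nums.subList(start, Math.min(start + chunk, nums.size()));
                Callable<Long> job = () -> slice.stream().filter(ThreeWays::isPrime).count();
                parts.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> f : parts) total += f.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // 2) Fork/Join: split recursively until a range is small enough to scan directly.
    static class CountTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000; // assumed cutoff, not tuned
        private final List<Long> nums;
        private final int lo;
        private final int hi;

        CountTask(List<Long> nums, int lo, int hi) {
            this.nums = nums;
            this.lo = lo;
            this.hi = hi;
        }

        @Override
        protected Long compute() {
            if (hi - lo <= THRESHOLD) {
                long count = 0;
                for (int i = lo; i < hi; i++) {
                    if (isPrime(nums.get(i))) count++;
                }
                return count;
            }
            int mid = (lo + hi) >>> 1;
            CountTask left = new CountTask(nums, lo, mid);
            CountTask right = new CountTask(nums, mid, hi);
            left.fork();                          // run the left half asynchronously
            return right.compute() + left.join(); // compute the right half on this thread
        }
    }

    static long withForkJoin(List<Long> nums, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new CountTask(nums, 0, nums.size()));
        } finally {
            pool.shutdown();
        }
    }

    // 3) Parallel stream: the library splits the work and uses the common Fork/Join pool.
    static long withParallelStream(List<Long> nums) {
        return nums.parallelStream().filter(ThreeWays::isPrime).count();
    }

    public static void main(String[] args) {
        List<Long> nums = new ArrayList<>();
        for (long n = 1; n <= 10_000; n++) nums.add(n);
        System.out.println("ExecutorService: " + withExecutorService(nums, 8));
        System.out.println("Fork/Join:       " + withForkJoin(nums, 8));
        System.out.println("Parallel stream: " + withParallelStream(nums));
    }
}
```

The three methods return the same count; what differs is how much of the splitting and scheduling the caller has to write, which is part of what the benchmark is weighing against raw speed.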

For the IO‑bound indexing task, Parallel Streams achieved the best time (≈24.33 s), beating Fork/Join by about one second and ExecutorService by a larger margin. Performance peaked at 16 threads and degraded at 32, where the overhead of the extra threads outweighed the added concurrency. Running with the default common Fork/Join pool size was slower than a custom size. By default the common pool's parallelism is one less than the number of available processors, because the calling thread also takes part in the work, so roughly one thread per core is busy. A different size can be set with the JVM flag -Djava.util.concurrent.ForkJoinPool.common.parallelism=16.
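Besides the JVM flag, a commonly used alternative is to start the parallel stream from inside a dedicated ForkJoinPool, as in the sketch below. The class name `PoolSizing` and the method `sumInPool` are illustrative assumptions, and the toy summation stands in for real work. This trick relies on how the current stream implementation behaves, not on a documented guarantee, so it is worth verifying on the target JDK.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class PoolSizing {
    // Runs a parallel stream from a task submitted to a dedicated pool.
    // In current JDKs the stream's subtasks then execute in that pool
    // instead of the common pool (implementation behavior, not a spec guarantee).
    static long sumInPool(int parallelism, long upTo) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Long> job = () -> LongStream.rangeClosed(1, upTo).parallel().sum();
            return pool.submit(job).join();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Reports the size that parallel streams use by default on this machine.
        System.out.println("available processors:    " + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println("sum with 16 workers:     " + sumInPool(16, 1_000_000L));
    }
}
```

The flag changes the pool for every parallel stream in the JVM, while a dedicated pool limits the change to one call site, which may matter when other code in the same process also relies on the common pool.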

In the CPU‑bound prime‑checking test, all implementations converged to a similar best time of roughly 28 seconds, with Parallel Streams still slightly ahead. The single‑thread version was about 4.2 times slower, and increasing the thread count beyond the number of physical cores yielded diminishing returns.
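A thread-count sweep of the kind described can be sketched roughly as follows. This is not the author's harness: the class `ThreadSweep`, the candidate range starting at 2^61 − 1, and the certainty value 50 are assumptions chosen for illustration. It also skips warm-up and repeated runs, so the timings it prints are only indicative.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ThreadSweep {
    // Builds n consecutive odd candidates starting at 2^61 - 1 (an assumed test range).
    static List<BigInteger> candidates(int n) {
        List<BigInteger> out = new ArrayList<>(n);
        BigInteger value = BigInteger.ONE.shiftLeft(61).subtract(BigInteger.ONE);
        BigInteger two = BigInteger.valueOf(2);
        for (int i = 0; i < n; i++) {
            out.add(value);
            value = value.add(two);
        }
        return out;
    }

    // Counts probable primes using a fixed pool with the given number of threads.
    static long countPrimes(List<BigInteger> nums, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Boolean>> jobs = new ArrayList<>();
            for (BigInteger n : nums) {
                jobs.add(() -> n.isProbablePrime(50));
            }
            long count = 0;
            for (Future<Boolean> f : pool.invokeAll(jobs)) {
                if (f.get()) count++;
            }
            return count;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<BigInteger> nums = candidates(20_000);
        int[] sweep = {1, 4, 8, 16, 32}; // single thread plus the article's thread counts
        for (int threads : sweep) {
            long start = System.nanoTime();
            long primes = countPrimes(nums, threads);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.println(threads + " threads: " + primes + " primes in " + millis + " ms");
        }
    }
}
```

For numbers that can be compared across runs, a dedicated harness such as JMH is a safer choice than wall-clock timing around a single call.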

Key observations include: (1) too few threads leave cores idle, while too many add contention and scheduling overhead; (2) Parallel Streams were the fastest option in these tests but can suffer when the shared common Fork/Join pool is already saturated by other work; (3) for IO‑heavy workloads, tuning the pool size is crucial; (4) single‑threaded execution was about 4.2 times slower in the CPU‑bound test, confirming the benefit of parallelism on multi‑core hardware.

The author advises reading the source code, testing on the target environment, and considering hardware characteristics and overall thread usage before deciding which concurrency mechanism to adopt.

Original test data and source code are available on GitHub, with links provided for readers to reproduce or extend the benchmarks.

Java, performance, Concurrency, Benchmark, ExecutorService, parallel streams, ForkJoin