Performance Comparison of Java Fork/Join, ExecutorService, and Parallel Streams under Different Configurations

This article presents a comprehensive performance evaluation of Java's ExecutorService, Fork/Join framework, and Parallel Streams across CPU‑intensive and I/O‑intensive tasks, analyzing thread‑count effects, default pool limitations, and JVM tuning to guide optimal concurrency choices.

Qunar Tech Salon

Parallel Streams in Java 8 are a double‑edged lightsaber: appealing syntactic sugar that carries risks along with its benefits. This article measures where Parallel Streams gain performance, and where they fall short, against Java's older concurrency mechanisms.

Before Java 5, higher‑level concurrency utilities came largely from third‑party libraries. Java 5 introduced the java.util.concurrent package with ExecutorService, Java 7 added the Fork/Join framework (its ForkJoinPool is itself an ExecutorService implementation), and Java 8 made Fork/Join easier to use through Parallel Streams.
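To make the three mechanisms concrete, here is a minimal sketch that runs one workload through each of them. The workload (a sum of squares), the class name ThreeWays, and the splitting threshold are illustrative assumptions, not the benchmark code used in the tests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

class ThreeWays {
    // Hypothetical stand-in workload: sum of i*i over [from, to).
    static long sumSquares(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) sum += i * i;
        return sum;
    }

    // Java 5 style: split the range by hand and submit the chunks to a fixed pool.
    static long withExecutor(long n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long chunk = (n + threads - 1) / threads;
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final long from = Math.min(n, t * chunk);
                final long to = Math.min(n, from + chunk);
                Callable<Long> task = () -> sumSquares(from, to);
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) total += part.get();
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    // Java 7 style: a RecursiveTask splits itself until chunks are small enough.
    static class SumTask extends RecursiveTask<Long> {
        private static final long THRESHOLD = 10_000; // assumed cut-off, not tuned
        private final long from;
        private final long to;

        SumTask(long from, long to) { this.from = from; this.to = to; }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) return sumSquares(from, to);
            long mid = from + (to - from) / 2;
            SumTask left = new SumTask(from, mid);
            SumTask right = new SumTask(mid, to);
            left.fork();                          // run the left half asynchronously
            return right.compute() + left.join(); // compute the right half here, then combine
        }
    }

    static long withForkJoin(long n, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new SumTask(0, n));
        } finally {
            pool.shutdown();
        }
    }

    // Java 8 style: the stream splits the range and runs on the common Fork/Join pool.
    static long withParallelStream(long n) {
        return LongStream.range(0, n).parallel().map(i -> i * i).sum();
    }

    public static void main(String[] args) {
        long n = 1_000_000;
        System.out.println(withExecutor(n, 8));
        System.out.println(withForkJoin(n, 8));
        System.out.println(withParallelStream(n));
    }
}
```

The ExecutorService and Fork/Join versions take an explicit thread count, while the Parallel Stream version has no such parameter: it uses whatever the shared common pool provides.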

The authors designed two benchmark tasks, one CPU‑bound and one I/O‑bound, and ran each under four configurations with thread counts of 4, 8, 16, and 32 on a machine exposing 8 virtual cores. A single‑thread run of each task served as the baseline.

In the I/O‑bound benchmark, indexing a large file (≈6 GB, 5.8 million lines), the single‑thread version took 176,267 ms (just under 3 minutes). Too few threads left CPU capacity idle, while too many added overhead: performance degraded at 32 threads.

In this test Parallel Streams delivered the best result, completing the 6 GB indexing in 24.33 seconds. That was about one second faster than using Fork/Join directly, and also ahead of ExecutorService.

However, for I/O‑heavy workloads the default Parallel Stream pool (the shared common Fork/Join pool, sized from the number of available processors) performed worse, running up to 7 % slower than a 16‑thread pool. Increasing the pool size mitigated the problem.

Thread‑pool size can be adjusted with the JVM argument -Djava.util.concurrent.ForkJoinPool.common.parallelism=16, which resizes the shared common pool, or by running the stream inside a custom Fork/Join pool so that it does not use the common pool.
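The custom‑pool option can be sketched as follows. The class name CustomPoolStream and the even‑counting workload are hypothetical. Note also that a parallel stream started from inside a ForkJoinPool task running in that pool is long‑standing observed behavior rather than a guarantee documented by the Stream API, so treat this as an assumption to verify on your own JVM.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.IntStream;

class CustomPoolStream {
    // Hypothetical workload: count the even numbers in [0, n) with a parallel stream.
    static long countEvens(int n) {
        return IntStream.range(0, n).parallel().filter(i -> i % 2 == 0).count();
    }

    // Starts the same stream from a task submitted to a dedicated pool,
    // instead of calling it directly and landing on the common pool.
    static long countEvensIn(int n, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Long> task = () -> countEvens(n);
            return pool.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("parallel stream failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Reflects the -Djava.util.concurrent.ForkJoinPool.common.parallelism setting, if given.
        System.out.println("common pool parallelism: " + ForkJoinPool.getCommonPoolParallelism());
        System.out.println(countEvensIn(1_000_000, 16));
    }
}
```

The JVM argument changes the pool for every parallel stream in the process, whereas a dedicated pool isolates one workload, which matters when other code shares the common pool.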

For the indexing test, the single‑thread implementation was 7.25 times slower than the fastest parallel result, close to the theoretical 8x limit for hardware with 8 virtual cores.
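As a worked check, the speedup can be recomputed from the two indexing timings quoted in this article. The symbols are notation introduced here, not the authors': T_1 is the single‑thread time, T_p the fastest parallel time, S the speedup, and E the efficiency relative to an assumed ideal of 8x.

```latex
S = \frac{T_1}{T_p} = \frac{176{,}267\ \text{ms}}{24{,}330\ \text{ms}} \approx 7.24,
\qquad
E = \frac{S}{8} \approx 0.91
```

The small gap to the reported 7.25 is within what rounding the parallel time to 24.33 seconds would explain.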

A second, CPU‑bound benchmark tested the primality of a 19‑digit number (1,530,692,068,127,007,263). The single‑thread run took 118,127 ms, while the parallel implementations converged at around 28 seconds, roughly a 4x speedup, with little difference between 8 and 16 threads for this non‑I/O work.
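A CPU‑bound task of this kind can be sketched with parallel trial division. This is an assumed approach for illustration: the article does not say which primality algorithm the benchmark used, and the class name ParallelPrime is hypothetical.

```java
import java.util.stream.LongStream;

class ParallelPrime {
    // Trial division by odd candidates up to sqrt(n), spread across a parallel stream.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        if (n < 4) return true;        // 2 and 3
        if (n % 2 == 0) return false;

        // Integer square root; correct the floating-point estimate without overflowing.
        long limit = (long) Math.sqrt((double) n);
        while (limit > n / limit) limit--;
        while (limit + 1 <= n / (limit + 1)) limit++;

        // k = 1, 2, ... maps to the odd divisors 3, 5, ... up to limit.
        // noneMatch short-circuits as soon as any divisor is found.
        return LongStream.rangeClosed(1, (limit - 1) / 2)
                .parallel()
                .map(k -> 2 * k + 1)
                .noneMatch(d -> n % d == 0);
    }

    public static void main(String[] args) {
        System.out.println(isPrime(1_000_000_007L));
    }
}
```

A range‑based stream is used because it splits evenly across worker threads. For a 19‑digit input the candidate range runs to over a billion values, which is what makes the task CPU‑heavy.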

Key observations: (1) choosing an appropriate thread count is crucial; (2) Parallel Streams generally outperformed the other approaches but suffered with the default pool size on I/O‑bound work; (3) resizing the common pool or using a dedicated Fork/Join pool can improve results; (4) the single‑thread baseline was far slower than the parallel implementations in both tests.

The authors conclude that developers should empirically test concurrency strategies in their own environments, considering hardware characteristics, total thread count, shared Fork/Join pools, and other running code before committing to a specific approach.

All tests were conducted on an EC2 c3.2xlarge instance with 8 virtual CPUs (4 physical cores with hyper‑threading) and 15 GB RAM. Each implementation was run ten times and the average of runs 2‑9 was used, discarding the first and last run, for a total of 260 runs.
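The averaging scheme described above can be sketched as a small harness. The class and method names are hypothetical, and the authors' actual measurement code is not shown in this article.

```java
class TimingHarness {
    // Times `runs` consecutive executions of the task, in milliseconds.
    static long[] timeRuns(Runnable task, int runs) {
        long[] millis = new long[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            millis[r] = (System.nanoTime() - start) / 1_000_000;
        }
        return millis;
    }

    // Averages runs 2 through 9 of exactly ten timings, dropping the first and last run.
    static double averageOfRuns2To9(long[] millis) {
        if (millis.length != 10) {
            throw new IllegalArgumentException("expected exactly 10 runs");
        }
        long sum = 0;
        for (int i = 1; i <= 8; i++) sum += millis[i]; // zero-based indices 1..8
        return sum / 8.0;
    }

    public static void main(String[] args) {
        // Hypothetical workload; replace with the code under test.
        Runnable work = () -> {
            long x = 0;
            for (int i = 0; i < 50_000_000; i++) x += i;
            if (x == 42) System.out.println(x); // keeps the loop from being optimized away
        };
        System.out.println(averageOfRuns2To9(timeRuns(work, 10)) + " ms");
    }
}
```

Dropping the first run excludes most JIT warm‑up cost from the average. For finer‑grained measurements a dedicated benchmarking tool is usually a better fit than hand‑rolled timing.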
