Why Java 8 Parallel Stream May Not Speed Up Your Tasks and How to Fix It
This article explains how a Java 8 parallel stream runs on the common ForkJoinPool, why that may not improve performance for a large proxy-IP validation job, and how running the stream in a custom ForkJoinPool substantially reduced execution time.
Introduction
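Java 8 introduced the Stream API, which allows data to be processed declaratively. A parallel stream splits its work across the threads of a ForkJoinPool, which can shorten processing time on multi-core machines. The two examples below print the name of the thread that handles each element.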
Example 1 (List)
<code>Arrays.asList(1, 2, 3, 4, 5, 6)
    .parallelStream()
    .forEach((value) -> {
        String name = Thread.currentThread().getName();
        System.out.println("Example1 Thread:" + name + " value:" + value);
    });</code>
Example 2 (Array)
<code>Stream.of(1, 2, 3, 4, 5, 6)
    .parallel()
    .forEach((value) -> {
        String name = Thread.currentThread().getName();
        System.out.println("Example2 Thread:" + name + " value:" + value);
    });</code>
Problem
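The author used a parallel stream to validate a large list of proxy IPs, with a 2-second timeout per request. If each of 1000 IPs hit the timeout, a single thread would need 1000 × 2 s = 2000 s, just over half an hour. The parallel stream was expected to speed this up, but tasks piled up and the 5-minute job never finished.
The thread names printed by the examples show that a parallel stream runs on the common ForkJoinPool (worker names such as ForkJoinPool.commonPool-worker-1, plus the calling thread). Every parallel stream in the JVM shares that one pool, whose default parallelism is one less than the number of available processors, so slow, blocking checks compete with all other parallel work for a handful of threads.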
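To make the bottleneck concrete, the sketch below reads the parallelism of the common pool and estimates an upper bound on the run time. It is only a rough model: the class name `CommonPoolEstimate` and the helper `worstCaseSeconds` are hypothetical, and it assumes every check blocks for the full 2-second timeout and that no other work is using the pool. Proxies that answer quickly would finish sooner.

```java
import java.util.concurrent.ForkJoinPool;

public class CommonPoolEstimate {

    // Hypothetical helper: worst-case wall-clock seconds if every check blocks
    // for the full timeout and each of `workers` threads runs one check at a time.
    static long worstCaseSeconds(int tasks, int workers, int timeoutSeconds) {
        long rounds = (tasks + workers - 1) / workers; // ceiling division
        return rounds * timeoutSeconds;
    }

    public static void main(String[] args) {
        // Parallelism of the shared pool used by every parallel stream in the JVM
        int commonParallelism = ForkJoinPool.getCommonPoolParallelism();
        // The thread that invokes the terminal operation also processes elements
        int workers = commonParallelism + 1;

        System.out.println("common pool parallelism: " + commonParallelism);
        System.out.println("single thread, 1000 IPs: "
                + worstCaseSeconds(1000, 1, 2) + " s");
        System.out.println(workers + " threads, 1000 IPs: "
                + worstCaseSeconds(1000, workers, 2) + " s");
    }
}
```

On a machine with few cores the second estimate stays large, and it grows further if other parallel streams are using the common pool at the same time.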
Solution
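Create a dedicated ForkJoinPool and submit the parallel stream to it as a task. The stream's subtasks then run on that pool's workers, which isolates the workload from the common pool and lets you choose a parallelism that suits slow, blocking checks.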
<code>import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

// Custom thread pool with 8 worker threads
ForkJoinPool forkJoinPool = new ForkJoinPool(8);
// Assume records is a list of proxy IPs fetched from the database
List<ProxyList> records = new ArrayList<>();
// Find invalid proxy IPs using the custom pool
List<String> needDeleteList;
try {
    needDeleteList = forkJoinPool.submit(() ->
        records.parallelStream()
            .map(ProxyList::getIpPort)
            .filter(IProxyListTask::isFailed)
            .collect(Collectors.toList())
    ).join();
} finally {
    // Release the pool's threads once the job is done
    forkJoinPool.shutdown();
}
// Delete invalid proxies (implementation omitted)</code>
After switching to the custom pool, the previously 5-minute job completes in under 2 minutes.
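The same pattern can be tried in isolation with the self-contained sketch below. It is an illustration under stated assumptions, not the author's code: `CustomPoolDemo`, `findFailed`, and the stand-in `isFailed` (which sleeps for 50 ms and treats ports divisible by 3 as failed) are hypothetical. Note also that running the stream on the submitting pool relies on how the ForkJoin framework schedules subtasks, which the Stream API documentation does not promise.

```java
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CustomPoolDemo {

    // Hypothetical stand-in for a proxy check: blocks briefly, then reports
    // an address as failed when its port number is divisible by 3.
    static boolean isFailed(String ipPort) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        int port = Integer.parseInt(ipPort.substring(ipPort.indexOf(':') + 1));
        return port % 3 == 0;
    }

    // Runs the parallel stream inside a dedicated pool and always shuts it down.
    static List<String> findFailed(List<String> ipPorts, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // The stream is started from inside a pool task, so its subtasks
            // are executed by this pool's workers.
            return pool.submit(() ->
                    ipPorts.parallelStream()
                            .filter(CustomPoolDemo::isFailed)
                            .collect(Collectors.toList())
            ).join();
        } finally {
            pool.shutdown(); // release the worker threads
        }
    }

    public static void main(String[] args) {
        // 40 made-up addresses with ports 8000..8039
        List<String> ipPorts = IntStream.range(0, 40)
                .mapToObj(i -> "10.0.0.1:" + (8000 + i))
                .collect(Collectors.toList());

        long start = System.nanoTime();
        List<String> failed = findFailed(ipPorts, 8);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println(failed.size() + " failed, checked in " + elapsedMs + " ms");
    }
}
```

Changing the second argument of `findFailed` is a quick way to see how the pool size affects the elapsed time for blocking work.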
Conclusion
Java 8 parallel streams simplify multithreaded processing of large data sets, but by default they all share the common ForkJoinPool. For slow, blocking tasks such as network checks, you may need to run the stream in a custom pool to see a noticeable performance gain.
Java Architecture Diary