Why Does My ThreadPool Freeze? Uncovering the Hidden Deadlock Pitfall
This article explains a subtle thread‑pool deadlock caused by parent‑child task interactions, demonstrates the issue with a reproducible demo, analyzes why the latch logic fails, and provides a practical solution of isolating thread pools to avoid false‑deadlock behavior in microservices.
Hello everyone, I'm SanYou. I recently hit a tricky thread‑pool pitfall and want to share the details.
Demo
First, here's the initial code:
The program creates a thread pool and submits five tasks in a loop, using StopWatch to measure execution time and a CountDownLatch to wait for all tasks to finish.
Running this version logs five "execution completed" messages and shows a total runtime of about 4 seconds.
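The original source isn't shown here, so the following is a minimal self-contained sketch of that first version. The pool size (three threads), the two sequential ~1 s simulated database fetches per task, and all names are assumptions; the article's StopWatch (presumably Spring's) is replaced with plain `System.currentTimeMillis()` so the snippet compiles on its own:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadPoolDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(3); // assumed: 3 core threads
        CountDownLatch latch = new CountDownLatch(5);           // one count per task
        long start = System.currentTimeMillis();
        for (int i = 1; i <= 5; i++) {
            int taskNo = i;
            pool.execute(() -> {
                try {
                    // two sequential simulated database fetches, ~1 s each
                    fetchData(taskNo, 1);
                    fetchData(taskNo, 2);
                    System.out.printf("Thread %s, ---[Task %d] execution completed---%n",
                            Thread.currentThread().getName(), taskNo);
                } finally {
                    latch.countDown();
                }
            });
        }
        latch.await(); // wait for all five tasks
        System.out.println("Total time: " + (System.currentTimeMillis() - start) + " ms");
        pool.shutdown();
    }

    static void fetchData(int taskNo, int dataNo) {
        try {
            Thread.sleep(1000); // simulated database latency
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        System.out.printf("Thread %s, [Task %d] processing data=%d%n",
                Thread.currentThread().getName(), taskNo, dataNo);
    }
}
```

With 5 tasks of ~2 s each on 3 threads, two "waves" of execution yield the roughly 4 second total the article reports.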
Because the data fetched from the database has no dependencies, it can be processed in parallel. The code is therefore rewritten to submit those database‑fetch tasks to the thread pool as well.
After the change, the log shows that the sub‑tasks run concurrently, and the measured time drops dramatically to 9.9 ms.
However, this result is misleading.
The CountDownLatch in the parent task reaches zero as soon as the sub‑tasks are submitted, so await() returns immediately and the stopwatch stops, even though the sub‑tasks are still processing data.
Consequently the reported 9.9 ms only reflects task submission time, not actual processing time.
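The flawed rewrite can be sketched as follows (names and pool size are assumptions, matching the sketch above). The parent counts the latch down right after *submitting* the fetches, so `await()` unblocks while the sub-tasks are still sitting in the queue:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MisleadingTimingDemo {
    // Returns the elapsed time that the (flawed) latch logic reports.
    static long measure() throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        CountDownLatch latch = new CountDownLatch(5);
        long start = System.currentTimeMillis();
        for (int i = 1; i <= 5; i++) {
            int taskNo = i;
            pool.execute(() -> {                   // parent task
                try {
                    for (int d = 1; d <= 2; d++) {
                        int dataNo = d;
                        pool.execute(() -> {       // each fetch is now its own sub-task
                            try {
                                Thread.sleep(1000); // simulated database latency
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            }
                            System.out.printf("Thread %s, [Task %d] processing data=%d%n",
                                    Thread.currentThread().getName(), taskNo, dataNo);
                        });
                    }
                } finally {
                    latch.countDown(); // fires as soon as the sub-tasks are SUBMITTED
                }
            });
        }
        latch.await(); // returns almost immediately
        long elapsed = System.currentTimeMillis() - start;
        pool.shutdown();
        pool.awaitTermination(30, TimeUnit.SECONDS); // let the sub-tasks drain before exit
        return elapsed;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Measured: " + measure() + " ms -- submission time only");
    }
}
```

The measured value is a few milliseconds of submission overhead, while the real ~4 seconds of fetching happen after the stopwatch has already stopped.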
To obtain a correct measurement, the parent task must wait until all sub‑tasks have truly finished.
The fix is to add a second latch for the sub-tasks, count it down in each sub-task's finally block, and have the parent task block on countDownLatchSub.await() until both fetches are truly done.
With this change, the expected log order becomes:
<code>Current thread pool-1-thread-3, ---[Task 2] starting execution---<br/>Current thread pool-1-thread-1, [Task 2] processing data=1<br/>Current thread pool-1-thread-2, [Task 2] processing data=2<br/>Current thread pool-1-thread-3, ---[Task 2] execution completed---<br/></code>With a single shared pool, however, the parent tasks block on countDownLatchSub.await() while occupying every core thread, leaving no free thread for the queued sub-tasks. This creates a circular wait: parents wait for children, children wait for a free thread, resulting in a deadlock-like "fake freeze".
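The shared-pool freeze can be reproduced with the sketch below (names and sizes are assumptions). Three parent tasks occupy all three core threads, their six sub-tasks sit in the queue, and nobody makes progress. To keep the demo terminating, the parent uses a timed await instead of blocking forever, so it reports `finished: false` rather than hanging:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FakeDeadlockDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(3); // 3 core threads, shared
        CountDownLatch parentLatch = new CountDownLatch(3);
        for (int i = 1; i <= 3; i++) {
            int taskNo = i;
            pool.execute(() -> {                      // parent: occupies one core thread
                try {
                    CountDownLatch subLatch = new CountDownLatch(2);
                    for (int d = 1; d <= 2; d++) {
                        int dataNo = d;
                        pool.execute(() -> {          // queued: all 3 threads are busy here
                            try {
                                System.out.printf("Thread %s, [Task %d] processing data=%d%n",
                                        Thread.currentThread().getName(), taskNo, dataNo);
                            } finally {
                                subLatch.countDown();
                            }
                        });
                    }
                    // Parent blocks for children that can never get a thread.
                    // (A plain subLatch.await() here would hang forever.)
                    boolean finished = subLatch.await(3, TimeUnit.SECONDS);
                    System.out.printf("[Task %d] sub-tasks finished: %s%n", taskNo, finished);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    parentLatch.countDown();
                }
            });
        }
        parentLatch.await();
        pool.shutdownNow(); // cancel any sub-tasks still queued
    }
}
```

Every parent prints `finished: false`: the sub-tasks were never scheduled while the parents held the threads, which is exactly the fake freeze.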
Where the Pitfall Lies
The demo mirrors a real production issue where a microservice uses a single custom thread pool for multiple API calls that have parent‑child relationships. Because the parent API occupies all core threads, the child API tasks are queued and never get a thread, causing the whole request chain to stall.
API 1 receives a request and hands it to the custom thread pool, then waits for API 2.
API 2 calls API 3 and waits.
API 3 also submits work to the same thread pool, which ends up in the queue.
All core threads are already busy with API 1, so API 3 cannot run.
The whole request chain stalls in the same deadlock-like freeze.
How to Avoid It
The solution is simple: do not share a single thread pool between parent and child tasks. Allocate a separate thread pool for the child tasks (thread‑pool isolation).
Running the isolated pools version reduces the total execution time to about 2 seconds, and the logs correctly show that the parent task finishes only after all child tasks have completed.
<code>Current thread pool-1-thread-3, ---[Task 2] starting execution---<br/>Current thread pool-2-thread-1, [Task 2] processing data=1<br/>Current thread pool-2-thread-4, [Task 2] processing data=2<br/>Current thread pool-1-thread-3, ---[Task 2] execution completed---<br/></code>In summary, when tasks have a parent-child relationship, avoid running them in the same thread pool; otherwise the children may sit in the queue while the parents hold every thread, producing a false deadlock.
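The isolated version can be sketched as follows. The pool sizes are assumptions: three parent threads, and six child threads so both fetches of every running parent can proceed in parallel:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class IsolatedPoolsDemo {
    public static void main(String[] args) throws InterruptedException {
        ExecutorService parentPool = Executors.newFixedThreadPool(3); // parent tasks only
        ExecutorService childPool  = Executors.newFixedThreadPool(6); // sub-tasks only
        CountDownLatch latch = new CountDownLatch(5);
        long start = System.currentTimeMillis();
        for (int i = 1; i <= 5; i++) {
            int taskNo = i;
            parentPool.execute(() -> {
                try {
                    System.out.printf("Thread %s, ---[Task %d] starting execution---%n",
                            Thread.currentThread().getName(), taskNo);
                    CountDownLatch subLatch = new CountDownLatch(2);
                    for (int d = 1; d <= 2; d++) {
                        int dataNo = d;
                        childPool.execute(() -> {       // child work goes to its OWN pool
                            try {
                                Thread.sleep(1000);     // simulated ~1 s database fetch
                                System.out.printf("Thread %s, [Task %d] processing data=%d%n",
                                        Thread.currentThread().getName(), taskNo, dataNo);
                            } catch (InterruptedException e) {
                                Thread.currentThread().interrupt();
                            } finally {
                                subLatch.countDown();
                            }
                        });
                    }
                    subLatch.await();                   // safe now: children cannot starve
                    System.out.printf("Thread %s, ---[Task %d] execution completed---%n",
                            Thread.currentThread().getName(), taskNo);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    latch.countDown();
                }
            });
        }
        latch.await();
        System.out.println("Total time: " + (System.currentTimeMillis() - start) + " ms");
        parentPool.shutdown();
        childPool.shutdown();
    }
}
```

Three parents run their six fetches fully in parallel in the first second, the remaining two parents follow in the next, giving the ~2 second total the article reports.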
Sanyou's Java Diary
Passionate about technology, though not great at solving problems; eager to share, never tire of learning!