
Why Threads Alone Struggle to Achieve Million‑Scale Concurrency and How Coroutines Provide a Better Solution

This article examines why relying on threads alone cannot achieve million-level concurrency on a single machine. It analyzes thread resource consumption and context-switch overhead, then contrasts kernel scheduling with user-space coroutine scheduling, whose switches are predictable and cheap, making coroutines better suited to I/O-intensive, high-concurrency workloads.

Java Tech Enthusiast

The well‑known C10K problem—how a single machine can handle ten thousand concurrent connections—spurred the creation of I/O multiplexing mechanisms such as epoll and kqueue.

In the evolution of this technology, the limitations of the thread model become increasingly apparent, especially when trying to reach single‑machine million‑level concurrency.

Thread resource consumption consists mainly of memory usage and context‑switch costs. Many sources claim that a thread’s stack is measured in megabytes while a coroutine’s stack is only kilobytes, suggesting that threads would exhaust memory at high concurrency. However, this is a simplification.

The stack size reported for threads (e.g., 8 MB of virtual memory per thread on Linux) consumes virtual address space, not necessarily physical memory. On a 32-bit system, 512 such threads would already exhaust the 4 GB address space (512 × 8 MB = 4 GB). On 64-bit systems the practical limit is instead imposed by kernel parameters such as vm.max_map_count.

If we relax the memory limits and assume a thread’s stack only uses 1 KB of physical memory during execution, the actual memory footprint becomes comparable to that of a coroutine; the remaining virtual space is merely address reservation.
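As a rough illustration of the point that stack size is a reservation rather than a hard physical cost, the JDK even lets you request a smaller stack when constructing a thread. This is a sketch, not a recommendation: the stackSize argument is only a hint, which the JVM and OS may round up to a page multiple or ignore entirely.

```java
public class SmallStackThread {
    // Run a task on a thread whose requested stack size is far below the
    // typical default (often 1 MB on 64-bit HotSpot, tunable with -Xss).
    static int runOnSmallStack() throws InterruptedException {
        int[] result = new int[1];
        // Thread(ThreadGroup, Runnable, String name, long stackSize):
        // the 64 KB here is a hint, not a guarantee.
        Thread t = new Thread(null, () -> result[0] = 42, "small-stack", 64 * 1024);
        t.start();
        t.join();
        return result[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("result = " + runOnSmallStack());
    }
}
```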

Thus, user‑space stack memory is not the primary bottleneck. The other major issue is the cost of thread context switches. Thread switching relies on pre‑emptive scheduling in the kernel, requiring a transition to kernel mode, saving and restoring the full execution context, and often contending for locks. At a million concurrent threads, the CPU time wasted on these operations becomes a serious performance bottleneck.

Coroutines, by contrast, perform switching entirely in user space, avoiding kernel entry. A coroutine switch only saves a few registers, eliminating the need for full context restoration and kernel involvement.
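In the Java world this model is available as virtual threads (JDK 21+): when a virtual thread blocks, the runtime unmounts it from its carrier thread in user space instead of blocking an OS thread in the kernel. A minimal sketch, with a short sleep standing in for real I/O:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadDemo {
    // Launch n virtual threads; each parks briefly (a stand-in for I/O).
    // Parking is handled by the JVM scheduler in user space, so the
    // per-"thread" switch cost stays low even at large n.
    static int countDone(int n) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try { Thread.sleep(10); } catch (InterruptedException ignored) { }
                done.incrementAndGet();
            }));
        }
        for (Thread t : threads) t.join();
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countDone(10_000) + " virtual threads completed");
    }
}
```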

Traditional threads use pre‑emptive scheduling, where the operating system can interrupt a thread at any moment. This approach is necessary for general‑purpose OSes but incurs high context‑switch overhead.

Coroutines adopt cooperative scheduling: a coroutine voluntarily yields the CPU only at well-defined points, typically:

During I/O operations (e.g., network requests)

When explicitly invoking a yield function

When waiting for a lock or other synchronization primitive

The advantages of cooperative scheduling are:

Predictable switch points: switches occur at clearly defined locations in the code.

Avoidance of unnecessary switches: only when the coroutine truly needs to wait.

Reduced complexity of synchronization primitives: many cases can avoid complex locking mechanisms.

This does not mean cooperative scheduling is universally superior; its benefits are most evident in I/O-intensive applications, where it makes million-level concurrency on a single machine readily attainable.
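The I/O-bound pattern can be sketched as one virtual thread per connection (JDK 21+), with a short sleep standing in for a network round trip. This is an assumption-laden sketch, not a benchmark: scale n toward a million only on a machine with enough heap, since each virtual thread still keeps a small, growable stack on the heap.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.LongAdder;

public class IoBoundSketch {
    // One virtual thread per simulated "connection". Blocking in the task
    // does not pin an OS thread, so a small carrier pool serves all of them.
    static long serve(int n) throws InterruptedException {
        LongAdder handled = new LongAdder();
        try (ExecutorService pool = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                pool.submit(() -> {
                    try { Thread.sleep(5); } catch (InterruptedException ignored) { }
                    handled.increment();
                });
            }
        } // close() waits for all submitted tasks to finish
        return handled.sum();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(serve(100_000) + " requests handled");
    }
}
```

Replacing the virtual-thread executor with a fixed pool of platform threads in this sketch is a simple way to observe the difference this article describes.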

In summary, threads are not inherently weak; rather, in certain scenarios coroutines reshape the rules, offering a more efficient path to extreme concurrency.

Tags: concurrency, high concurrency, scheduling, coroutines, threads