Why a 2 TB PostgreSQL Instance Still Runs Out of Memory: Hidden work_mem Pitfalls
Even with 2 TB of RAM, PostgreSQL can hit OOM because work_mem is a per‑operation budget that multiplies across sorts, hashes, parallel workers and long‑lived memory contexts, so blindly raising it often hides deeper query‑design and statistics issues.
Despite a 2 TB memory pool, a PostgreSQL cluster can be killed by the OOM killer when the work_mem setting is misunderstood; the parameter does not cap total memory per query or per backend.
What work_mem actually controls
The official documentation describes work_mem as the base maximum amount of memory a single query operation (such as a sort or hash table) may use before writing to temporary disk files. In a complex query several such nodes can hold memory at the same time, so total consumption can be many times the work_mem value. Hash-based operations are allowed work_mem multiplied by hash_mem_multiplier (default 2.0 since PostgreSQL 15), which doubles their budget, and each parallel worker repeats the allocation, so a query with 4 workers plus the leader can consume roughly five times the single-process amount.
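The multiplication described above can be sketched as back-of-envelope arithmetic. This is a rough illustration, not PostgreSQL's own accounting: the function name and the node counts are hypothetical, and it covers only the budgeted sort and hash memory, not allocations made outside the work_mem budget.

```python
def budgeted_query_memory_mb(work_mem_mb, sort_nodes, hash_nodes,
                             parallel_workers=0, hash_mem_multiplier=2.0):
    """Rough estimate of the work_mem-budgeted memory one query may hold.

    Assumes every sort and hash node fills its budget at the same time,
    and that the leader plus each parallel worker repeats the allocation.
    """
    # Sort nodes get work_mem each; hash nodes get work_mem * hash_mem_multiplier.
    per_process = (sort_nodes * work_mem_mb
                   + hash_nodes * work_mem_mb * hash_mem_multiplier)
    # The leader process participates alongside the workers.
    processes = parallel_workers + 1
    return per_process * processes


# Hypothetical plan: 3 sorts, 2 hash joins, 4 parallel workers, work_mem = 64 MB.
estimate = budgeted_query_memory_mb(64, sort_nodes=3, hash_nodes=2,
                                    parallel_workers=4)
print(f"{estimate:.0f} MB for a single query")
```

Multiplying the result by the number of concurrent sessions running similar plans gives a sense of how quickly a modest work_mem adds up across a busy server.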
Common misconception
Many assume that setting work_mem to a low value (e.g., 2 MB) guarantees a query will never exceed that amount, but the parameter is merely a budget for an individual execution node, not a hard cap for the whole backend.
Real‑world incident
A production incident showed a 2 TB RAM server OOM‑killed while work_mem was only 2 MB. Using pg_log_backend_memory_contexts, the logged memory contexts revealed:
ExecutorState ≈ 235 MB
HashTableContext ≈ 340 MB
Total ≈ 557 MB
524,059 memory chunks allocated
This demonstrates that memory allocated in long‑lived contexts is not released until the query finishes, causing a snowball effect.
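A dump like the one above can be summarized with a small script. The sketch below assumes the server log lines follow the shape shown in the PostgreSQL documentation example for pg_log_backend_memory_contexts ("level: N; Name: X total in Y blocks; Z free (C chunks); U used"); the exact format may differ between versions, and the function name is hypothetical.

```python
import re
from collections import defaultdict

# Assumed line shape, based on the documentation example:
# "LOG:  level: 1; ExecutorState: 8192 total in 1 blocks; 1408 free (0 chunks); 6784 used"
CONTEXT_LINE = re.compile(
    r"level: (?P<level>\d+); (?P<name>.+): (?P<total>\d+) total in \d+ blocks; "
    r"\d+ free \(\d+ chunks\); (?P<used>\d+) used"
)


def largest_contexts(log_lines, top=5):
    """Sum 'total' bytes per context name and return the biggest ones."""
    totals = defaultdict(int)
    for line in log_lines:
        match = CONTEXT_LINE.search(line)
        if match:  # lines that are not context entries are skipped
            totals[match.group("name")] += int(match.group("total"))
    # Largest first, so runaway contexts appear at the top.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]
```

Feeding it the relevant slice of the server log, for example `largest_contexts(open(path))` with `path` pointing at the log file, lists the context names holding the most memory.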
Why memory balloons
PostgreSQL frees memory by whole memory contexts rather than per object. When allocations are attached to a long‑lived context such as ExecutorState, they persist for the entire execution, especially if the query mixes many sorts, hashes, parallel workers, and functions that keep intermediate results alive.
Practical guidance
Instead of globally increasing work_mem, consider a layered approach:
Control concurrency and parallelism.
Keep statistics up‑to‑date.
Rewrite SQL to avoid overly long execution lifetimes (e.g., avoid deeply nested functions, stacked CTEs, and very large joins).
Use role‑ or database‑specific settings rather than a single global value.
Set statement_timeout for high‑traffic paths.
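Monitor memory contexts with pg_log_backend_memory_contexts(pid), which writes another backend's contexts to the server log (superuser only by default), and the view pg_backend_memory_contexts, which shows the current session's contexts (readable by superusers or members of pg_read_all_stats by default).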
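The role-specific settings in the list above can be sketched as generated SQL. This is a minimal illustration: the role names and values are hypothetical, the helper functions are not part of any library, and the allow-list simply names the parameters discussed in this article. ALTER ROLE ... SET is standard PostgreSQL syntax, and the setting applies to new sessions of that role.

```python
# Parameters this sketch is willing to emit; all are real PostgreSQL settings.
ALLOWED_PARAMETERS = {
    "work_mem",
    "hash_mem_multiplier",
    "statement_timeout",
    "max_parallel_workers_per_gather",
}


def quote_ident(name):
    """Quote an identifier by doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a string literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def role_setting_sql(role, parameter, value):
    """Render an ALTER ROLE ... SET statement for one role and parameter."""
    if parameter not in ALLOWED_PARAMETERS:
        raise ValueError(f"unexpected parameter: {parameter}")
    return (f"ALTER ROLE {quote_ident(role)} "
            f"SET {parameter} = {quote_literal(value)};")


# Hypothetical roles: a reporting role gets more memory, the app role a timeout.
for statement in (
    role_setting_sql("reporting", "work_mem", "64MB"),
    role_setting_sql("app", "statement_timeout", "5s"),
    role_setting_sql("app", "max_parallel_workers_per_gather", "2"),
):
    print(statement)
```

Scoping the values this way keeps a generous work_mem away from high-concurrency application roles while still letting a low-concurrency reporting role benefit from it.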
When raising work_mem makes sense
If the following conditions hold, increasing work_mem can be beneficial:
Low and stable concurrency.
Analytical workload with expensive sorts/hashes.
Sufficient RAM headroom.
Controllable parallelism.
Stable execution plans.
The bottleneck is confirmed to be spilling to temporary files, not mis‑estimated row counts or flawed query structure.
Otherwise, raising the parameter is a gamble that can turn a single slow query into a cluster‑wide outage.
Bottom line
work_mem is not a performance switch; it is a resource lever. Used correctly, it saves work; used incorrectly, it can crash the database.
ITPUB
