
Mastering Spark’s Unified Memory Management: A Deep Dive into On‑Heap & Off‑Heap Tuning

This article explains Spark's unified memory manager, detailing on‑heap and off‑heap memory regions, dynamic memory sharing, task memory allocation, and practical tuning techniques to optimize performance and avoid common out‑of‑memory errors.


1. Spark Memory Model

1.1 Overview

Understanding Spark's memory management is essential for efficient resource allocation and tuning; it helps identify problematic memory regions without simply increasing memory size.

Versions prior to Spark 1.6 used static memory management, while Spark 1.6 and later adopt a Unified Memory Manager. This article focuses on the unified approach.

The Spark UI "Executors" tab shows memory allocation for an application submitted in standalone client mode with the following configuration:

Command‑line options used:

--executor-memory 2g --driver-memory 1g --total-executor-cores 4
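For context, a full submission with these options might look like the following sketch (the master URL and application jar are placeholders, not from the original run):

```shell
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode client \
  --executor-memory 2g \
  --driver-memory 1g \
  --total-executor-cores 4 \
  my-app.jar
```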

The unified memory manager comprises two main regions: On-heap Memory and Off-heap Memory.

1.2 On‑heap Memory

By default Spark uses only on‑heap memory, which is divided into four parts:

Execution Memory: stores temporary data for shuffle, join, sort, aggregation, etc.

Storage Memory: holds cached RDD data and unrolled data.

User Memory: keeps metadata such as RDD dependencies.

Reserved Memory: system-reserved space for internal Spark objects.

Key memory parameters:

systemMemory = Runtime.getRuntime.maxMemory (configured via spark.executor.memory or --executor-memory)

reservedMemory = 300MB in Spark 2.4.3 (modifiable in testing with spark.testing.reservedMemory)

usableMemory = systemMemory - reservedMemory

unifiedMemory = usableMemory * 0.6 (the default 60% share, set by spark.memory.fraction)

The minimum allowed system memory is reservedMemory * 1.5 = 450MB; executors configured with less will fail to start.
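Putting these formulas together, here is a minimal sketch in Python (for brevity; Spark computes this inside the JVM, and Runtime.getRuntime.maxMemory is usually slightly less than the configured --executor-memory value):

```python
RESERVED_MB = 300  # fixed reserved memory in Spark 2.4.3

def on_heap_regions(system_memory_mb, memory_fraction=0.6, storage_fraction=0.5):
    """Compute on-heap region sizes following the unified memory manager formulas."""
    if system_memory_mb < RESERVED_MB * 1.5:
        raise ValueError("system memory must be at least 450 MB")
    usable = system_memory_mb - RESERVED_MB
    unified = usable * memory_fraction      # Execution + Storage combined
    storage = unified * storage_fraction    # initial split: spark.memory.storageFraction
    return {
        "reserved": RESERVED_MB,
        "user": usable - unified,           # User Memory
        "storage": storage,                 # Storage Memory (initial)
        "execution": unified - storage,     # Execution Memory (initial)
    }

regions = on_heap_regions(2048)  # a 2 GB executor
```

With a 2 GB executor this yields roughly 699 MB of User Memory and about 524 MB each for Storage and Execution.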

1.3 Off‑heap Memory

Since Spark 1.6, off-heap memory can be enabled via spark.memory.offHeap.enabled and sized with spark.memory.offHeap.size. Off-heap memory is allocated outside the JVM using unsafe APIs, avoiding GC overhead but requiring Spark to manage allocation and release itself.

When enabled, both on‑heap and off‑heap regions coexist, and Execution and Storage memory are the sum of their respective on‑heap and off‑heap parts.
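As a sketch, enabling a 1 GB off-heap region might look like this in spark-defaults.conf (the size is illustrative, not a recommendation):

```
spark.memory.offHeap.enabled   true
spark.memory.offHeap.size      1g
```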

1.4 Dynamic Memory Adjustment

Before Spark 1.6, Execution and Storage memory were statically partitioned; insufficient Execution memory could not borrow from free Storage memory. With the unified manager, the two regions can share space dynamically.

Implementation details:

Initial allocation: the split between Storage and Execution is set via spark.memory.storageFraction (default 0.5).

If both regions are full, data is spilled to disk; Storage evicts cached blocks using an LRU policy.

When Execution needs space that Storage has borrowed, Storage blocks are evicted to disk and the borrowed memory is returned.

The reverse does not hold: occupied Execution memory is never evicted for Storage, because spilling in-flight shuffle state is too complex.

Borrowing only occurs between like-type memories (both on-heap or both off-heap).
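The asymmetry can be sketched with a toy model (plain Python, not Spark's actual implementation): an Execution request may consume free Storage memory and evict cached blocks, while a Storage request may only take Execution memory that happens to be free right now.

```python
def acquire_execution_memory(request, exec_free, storage_free, storage_evictable):
    """Toy model: Execution may borrow free Storage memory and evict cached blocks."""
    granted = min(request, exec_free)
    shortfall = request - granted
    borrowed = min(shortfall, storage_free)      # borrow unused Storage memory
    shortfall -= borrowed
    evicted = min(shortfall, storage_evictable)  # evict cached blocks (LRU in Spark)
    return granted + borrowed + evicted

def acquire_storage_memory(request, storage_free, exec_free):
    """Toy model: Storage may only use Execution memory that is currently free."""
    available = storage_free + exec_free         # occupied Execution memory is off-limits
    return request if request <= available else 0  # otherwise the block is not cached
```

For example, an Execution request of 1000 against 500 free Execution, 100 free Storage, and 200 evictable cached bytes is granted only 800, while a Storage request that exceeds free memory gets nothing rather than forcing Execution to spill.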

1.5 Task Memory Allocation

Tasks share Execution memory. Spark maintains a HashMap tracking each Task's memory usage. When a Task requests numBytes, Spark checks available Execution memory and updates the map accordingly.

Each running Task is guaranteed at least 1/(2N) of the total Execution memory, where N is the number of concurrently running Tasks, and may use at most 1/N; a request blocks until the 1/(2N) minimum can be granted. For example, with 10 GB of Execution memory and 5 Tasks, each Task can use between 1 GB and 2 GB.
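These bounds are easy to check numerically (a sketch, with memory in MB):

```python
def task_memory_bounds(total_execution_mb, num_tasks):
    """Per-task Execution memory range: guaranteed 1/(2N), capped at 1/N."""
    n = num_tasks
    return total_execution_mb / (2 * n), total_execution_mb / n

low, high = task_memory_bounds(10 * 1024, 5)  # 10 GB shared by 5 concurrent tasks
# low = 1024.0 MB (1 GB), high = 2048.0 MB (2 GB)
```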

2. Spark Memory Tuning

2.1 Determine Memory Consumption

Create an RDD, cache it, and inspect the "Storage" tab in the Web UI to see its memory usage. Use SizeEstimator.estimate to estimate the size of specific objects, such as broadcast variables.

2.2 Optimize Data Structures

Reduce memory overhead by avoiding pointer-heavy Java/Scala collections and using primitive arrays or specialized libraries such as fastutil. Prefer flat structures and numeric or enum keys instead of strings, and enable -XX:+UseCompressedOops for JVMs with heaps under 32 GB.
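The JVM point holds in spirit for Python too, which makes for a self-contained illustration: a list of boxed integers carries a per-element object overhead that a contiguous primitive array avoids (exact byte counts vary by interpreter and version).

```python
import sys
from array import array

n = 100_000
boxed = list(range(n))          # pointer array plus one int object per element
packed = array("q", range(n))   # contiguous 8-byte integers, no per-element objects

# The list's real footprint includes every boxed int, not just the pointer array.
boxed_total = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
packed_total = sys.getsizeof(packed)
```

The packed array's footprint is several times smaller, which is the same economics that makes primitive arrays and fastutil collections win on the JVM.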

2.3 Serialize RDD Storage

When RDDs are large, persist them with serialization (e.g., StorageLevels.MEMORY_ONLY_SER) using Kryo for better efficiency. If OOM persists, consider StorageLevels.MEMORY_AND_DISK based on data size.
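Kryo is enabled through configuration; a typical spark-defaults.conf sketch (the buffer size is illustrative, and registering your classes with Kryo is optional but recommended):

```
spark.serializer                 org.apache.spark.serializer.KryoSerializer
spark.kryoserializer.buffer.max  128m
```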

2.4 Adjust Parallelism

Set parallelism to roughly 2–3 times the total CPU cores. Tune spark.default.parallelism (effective during shuffle), use rdd.repartition to increase partitions, and configure spark.sql.shuffle.partitions (default 200) for SparkSQL.
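For example, on a cluster with 8 total cores, settings in this spirit might go into spark-defaults.conf (values illustrative, following the 2–3x rule):

```
spark.default.parallelism     24
spark.sql.shuffle.partitions  200
```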

2.5 Broadcast Variables

Convert large read‑only objects on the driver into broadcast variables so that Executors fetch them from the nearest BlockManager, reducing network traffic.

2.6 Use Map‑Side Pre‑Aggregation

Perform local aggregation on each node (e.g., using reduceByKey or aggregateByKey) instead of groupByKey to minimize the data transferred during shuffle.
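The effect can be sketched outside Spark in plain Python: pre-aggregating each partition locally shrinks the number of records that must cross the network, which is what reduceByKey's map-side combine does.

```python
from collections import defaultdict

def map_side_combine(partition):
    """Locally sum values per key before shuffling, like reduceByKey's combiner."""
    acc = defaultdict(int)
    for key, value in partition:
        acc[key] += value
    return list(acc.items())

partition = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]
combined = map_side_combine(partition)  # 5 records shrink to 2 before the shuffle
```

With groupByKey, all five records would be shuffled and only then grouped on the reducer side.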

2.7 GC Optimization

GC tuning involves many aspects and can be covered in a dedicated article.

3. Common Issues

Typical executor‑related errors include:

java.lang.OutOfMemoryError

ExecutorLostFailure

Executor exit code: 143

Heartbeat timeout

Shuffle file lost

Refer to the official Spark tuning guide and detailed memory management articles for further guidance.

Written by

Data Thinking Notes

Sharing insights on data architecture, governance, and middle platforms, exploring AI in data, and linking data with business scenarios.
