
Design and Implementation of the Android Scudo Hardened Allocator

The Android Scudo hardened allocator, introduced in Android R to replace jemalloc, uses a primary region‑based allocator, a secondary mmap‑backed allocator, thread‑specific caches, and a quarantine system with 64‑bit chunk headers and extensive safety checks, offering a balanced security‑performance trade‑off configurable via compile‑time, environment, and mallopt options.

OPPO Kernel Craftsman

Early Android versions used jemalloc as the default native memory allocator. Starting with Android R (Android 11), Scudo replaced jemalloc in the non‑svelte configuration; jemalloc remains the default in svelte (low‑memory) mode.

With the widespread adoption of 64‑bit devices and large RAM, virtual and physical memory constraints have eased, giving the system more room to trade some performance for security. Memory‑related vulnerabilities account for more than half of all security issues, so an allocator that mitigates common heap attacks can substantially reduce the overall number of security problems. This is the motivation for introducing Scudo.

Scudo’s design aims for a good trade‑off between security and performance. In raw performance it does not necessarily beat jemalloc: its simplified allocation strategy and additional safety checks incur some overhead.

Scudo Components

Scudo consists of four main components: Primary, Secondary, TSD (Thread‑Specific Data), and Quarantine.

Primary Allocator: It divides a reserved memory region into equal‑sized blocks for fast allocation of small objects. Two primary allocators exist, one for 32‑bit and one for 64‑bit architectures, selectable at compile time.

On 64‑bit Android R/S, the Primary Allocator reserves 33 regions of 256 MiB each (8.25 GiB of virtual address space in total). Each region is identified by a class id (0‑32) and serves a single size class, being carved into equal‑sized blocks of that size (e.g., class 1 → 32 B, class 2 → 48 B, …, class 32 → 64 KiB). Small allocations first look for a free block in the appropriate region; if none is available they fall back to a larger region or to the Secondary Allocator.
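The size of the reservation follows directly from the region count and region size. A minimal sketch of that arithmetic, using only the figures quoted in the text; the constant names and the `regionOffset` helper are hypothetical and are not Scudo identifiers:

```cpp
#include <cstdint>

// Figures from the text: 33 regions (class ids 0..32) of 256 MiB each.
constexpr std::uint64_t kMiB = 1024ULL * 1024ULL;
constexpr std::uint64_t kRegionSize = 256 * kMiB;
constexpr std::uint64_t kNumClasses = 33;

// Total virtual address space reserved by the Primary. This is a
// reservation, not resident memory: pages are only backed when used.
constexpr std::uint64_t kReservedBytes = kRegionSize * kNumClasses;

// Hypothetical helper: offset of a class's region from the start of the
// reservation, assuming the regions are laid out back to back.
constexpr std::uint64_t regionOffset(std::uint64_t classId) {
  return classId * kRegionSize;
}

// 33 x 256 MiB = 8448 MiB = 8.25 GiB.
static_assert(kReservedBytes == 8448 * kMiB, "unexpected reservation size");
```

Because only virtual address space is reserved, a reservation of this size is practical on 64‑bit devices but would not fit in a 32‑bit address space, which is one reason a separate 32‑bit Primary exists.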

The Primary Allocator also provides a per‑thread cache (SizeClassAllocatorLocalCache). Each thread’s cache holds a limited number of chunks (default 28). When the cache is exhausted, it refills from the region’s free list; if the free list is insufficient, the region expands.

Secondary Allocator: Slower than Primary, it obtains large blocks via mmap and surrounds them with guard pages.

On 64‑bit Android R/S, the Secondary Allocator handles allocations larger than 64 KiB, using a MapAllocatorCache that can hold up to 32 blocks (≤ 2 MiB each).
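The guard‑page idea can be illustrated with plain POSIX calls. The following is a sketch only: `mapWithGuards` and `unmapWithGuards` are hypothetical helpers, not Scudo functions, and the sketch omits the block header and the cache that the real Secondary maintains.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// Hypothetical helper: maps `size` usable bytes (rounded up to whole pages)
// with one inaccessible guard page on each side. Returns nullptr on failure.
// On success, *mappedSize receives the total mapping size including guards.
void *mapWithGuards(std::size_t size, std::size_t *mappedSize) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  const std::size_t rounded = (size + page - 1) / page * page;
  const std::size_t total = rounded + 2 * page;

  // Reserve everything as PROT_NONE so touching a guard page faults.
  void *base = mmap(nullptr, total, PROT_NONE,
                    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (base == MAP_FAILED) return nullptr;

  // Open up only the middle region for reading and writing.
  char *usable = static_cast<char *>(base) + page;
  if (mprotect(usable, rounded, PROT_READ | PROT_WRITE) != 0) {
    munmap(base, total);
    return nullptr;
  }
  *mappedSize = total;
  return usable;
}

// Hypothetical helper: releases a mapping created by mapWithGuards.
bool unmapWithGuards(void *usable, std::size_t mappedSize) {
  const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  return munmap(static_cast<char *>(usable) - page, mappedSize) == 0;
}
```

A linear overflow that runs off either end of such a block hits a PROT_NONE page and faults immediately instead of silently corrupting a neighbouring allocation.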

TSD: Defines how each thread’s local cache operates. Two models exist: exclusive (each thread has its own cache) and shared (threads share a fixed‑size pool). Android R/S uses the shared model with two TSD objects, each containing a SizeClassAllocatorLocalCache and a QuarantineCache.

Quarantine: Delays reuse of freed memory blocks to detect use‑after‑free (UAF). Blocks that meet size criteria are placed into a thread‑local QuarantineCache; if it overflows, they move to a global quarantine cache, and if that overflows they are recycled back to the Primary or Secondary allocator.
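The two‑level overflow behaviour can be modelled in a few lines. This is a single‑threaded sketch under stated assumptions: `QuarantineSketch`, `FreedBlock`, and the byte limits are hypothetical, and the real quarantine additionally handles locking and per‑block size criteria.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

struct FreedBlock {
  int id;            // stands in for a real address
  std::size_t size;  // bytes held by the block
};

// Hypothetical model: one thread-local cache feeding one global cache.
class QuarantineSketch {
 public:
  QuarantineSketch(std::size_t localLimit, std::size_t globalLimit)
      : localLimit_(localLimit), globalLimit_(globalLimit) {}

  // Quarantines a freed block. Returns the blocks evicted from the global
  // cache, which may now be handed back to the Primary or Secondary.
  std::vector<FreedBlock> put(FreedBlock b) {
    std::vector<FreedBlock> recycled;
    local_.push_back(b);
    localBytes_ += b.size;
    if (localBytes_ <= localLimit_) return recycled;

    // Local cache overflowed: move its contents to the global cache.
    for (const FreedBlock &x : local_) {
      global_.push_back(x);
      globalBytes_ += x.size;
    }
    local_.clear();
    localBytes_ = 0;

    // Global cache overflowed: recycle the oldest blocks first.
    while (globalBytes_ > globalLimit_ && !global_.empty()) {
      recycled.push_back(global_.front());
      globalBytes_ -= global_.front().size;
      global_.pop_front();
    }
    return recycled;
  }

 private:
  std::size_t localLimit_;
  std::size_t globalLimit_;
  std::size_t localBytes_ = 0;
  std::size_t globalBytes_ = 0;
  std::vector<FreedBlock> local_;
  std::deque<FreedBlock> global_;
};
```

The larger the limits, the longer a freed block stays out of circulation and the more likely a stale pointer is caught, at the cost of memory that cannot be reused in the meantime.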

Chunk Header

The chunk header is 64 bits (8 bytes) and contains the following fields:

ClassId – region id for Primary allocations; 0 indicates allocation from Secondary.

State – 0 = Available, 1 = Allocated, 2 = Quarantined.

OriginOrWasZeroed – indicates the allocation method (new/malloc) when State = Allocated.

SizeOrUnusedBytes – the allocation size for Primary allocations (nonzero ClassId); for Secondary allocations, the number of unused bytes.

Offset – offset of the chunk header within the block.

Checksum – used to detect header corruption.
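One way to picture the header is as a 64‑bit bitfield. The sketch below is illustrative: the field names follow the list above, but the individual bit widths (8 + 2 + 2 + 20 + 16 + 16) are an assumption modelled on the upstream LLVM sources and are not given in this article.

```cpp
#include <cstdint>
#include <cstring>

enum ChunkState : std::uint8_t { Available = 0, Allocated = 1, Quarantined = 2 };

// Assumed bit widths; only the 64-bit total comes from the text.
struct UnpackedHeader {
  std::uint64_t ClassId : 8;
  std::uint64_t State : 2;
  std::uint64_t OriginOrWasZeroed : 2;
  std::uint64_t SizeOrUnusedBytes : 20;
  std::uint64_t Offset : 16;
  std::uint64_t Checksum : 16;
};
static_assert(sizeof(UnpackedHeader) == sizeof(std::uint64_t),
              "header must fit in 64 bits");

// The header is stored as one 64-bit word, so it can be read and written
// with a single load or store.
inline std::uint64_t packHeader(const UnpackedHeader &h) {
  std::uint64_t packed;
  std::memcpy(&packed, &h, sizeof(packed));
  return packed;
}

inline UnpackedHeader unpackHeader(std::uint64_t packed) {
  UnpackedHeader h;
  std::memcpy(&h, &packed, sizeof(h));
  return h;
}
```

Keeping the whole header in a single word is what makes it practical to update it atomically and to detect concurrent modification, which relates to the "race on chunk header" error described later.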

When a block is freed, a series of checks is performed in order:

Alignment check (16‑byte alignment on 64‑bit).

Checksum verification.

State validation (detect double‑free).

Allocation‑type match (malloc/free vs. new/delete).

Size verification (if DeleteSizeMismatch is enabled).
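The ordering of these checks can be sketched as a single function. Everything here is hypothetical and simplified: the `Header` struct is not the real packed layout, `toyChecksum` is a stand‑in for the allocator's actual checksum, and a `deleteSize` of 0 is used to mean that no size was supplied.

```cpp
#include <cstdint>

enum class State : std::uint8_t { Available, Allocated, Quarantined };
enum class Origin : std::uint8_t { Malloc, New };
enum class FreeError {
  None, MisalignedPointer, CorruptedHeader, InvalidState,
  TypeMismatch, InvalidSizedDelete
};

struct Header {
  std::uint8_t classId;
  State state;
  Origin origin;
  std::uint32_t size;
  std::uint16_t checksum;
};

// Toy checksum over the pointer and header fields (not the real algorithm).
std::uint16_t toyChecksum(std::uintptr_t ptr, const Header &h) {
  const std::uint64_t p = ptr;
  std::uint32_t x = static_cast<std::uint32_t>(p ^ (p >> 32));
  x ^= h.classId;
  x ^= static_cast<std::uint32_t>(h.state) << 8;
  x ^= static_cast<std::uint32_t>(h.origin) << 10;
  x ^= h.size << 12;
  return static_cast<std::uint16_t>(x ^ (x >> 16));
}

// Runs the checks in the order listed above and reports the first failure.
FreeError checkFree(std::uintptr_t ptr, const Header &h, Origin dealloc,
                    std::uint32_t deleteSize, bool deleteSizeMismatch) {
  if (ptr % 16 != 0) return FreeError::MisalignedPointer;
  if (toyChecksum(ptr, h) != h.checksum) return FreeError::CorruptedHeader;
  if (h.state != State::Allocated) return FreeError::InvalidState;
  if (h.origin != dealloc) return FreeError::TypeMismatch;
  if (deleteSizeMismatch && deleteSize != 0 && deleteSize != h.size)
    return FreeError::InvalidSizedDelete;
  return FreeError::None;
}
```

The order matters: the header fields are only trusted after the checksum has been verified, so a corrupted header is reported as corruption rather than as a misleading state or type error.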

Typical error messages and their meanings:

corrupted chunk header – checksum mismatch or overwritten header.

race on chunk header – concurrent threads manipulate the same header.

invalid chunk state – operation performed on a block not in the expected state (e.g., double‑free).

misaligned pointer – pointer does not meet required alignment.

allocation type mismatch – mismatched malloc/free or new/delete.

invalid sized delete – size passed to delete does not match allocated size.

RSS limit exhausted – memory usage exceeds configured RSS limit.

Memory Allocation Flow

The allocation process proceeds as follows:

1. Compute NeededSize by aligning the requested size and adding the larger of alignment or chunk‑header size. If NeededSize exceeds the largest Primary size class, go to the Secondary Allocator (step 2); otherwise continue with Primary (step 4).

2. Secondary allocation: round NeededSize + LargeBlockHeaderSize up to page size → RoundedSize. If a block is available in MapAllocatorCache, use it (step 6); otherwise continue to step 3.

3. Map RoundedSize + 2 pages with PROT_NONE guard pages, then remap the middle region with read/write permissions, skip the LargeBlockHeader, and obtain the usable address.

4. Primary allocation: obtain the thread’s SizeClassAllocatorLocalCache and try to fetch a free block of the appropriate size class.

5. If the local cache is empty, refill it from the region’s free list (or move to a larger region, or fall back to Secondary if necessary).

6. After a free block is obtained, fill the chunk header and return the user pointer.
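The first step, computing NeededSize and choosing a backend, can be sketched as follows. The constants are assumptions for illustration: a 16‑byte minimum alignment (from the alignment check described earlier), a 16‑byte header slot (the 8‑byte header padded to that alignment), and 64 KiB as the largest Primary size class. The function names are hypothetical.

```cpp
#include <cstddef>

constexpr std::size_t kMinAlignment = 16;           // assumed, 64-bit
constexpr std::size_t kHeaderSize = 16;             // assumed: 8 B padded to 16
constexpr std::size_t kMaxPrimarySize = 64 * 1024;  // largest class per the text

constexpr std::size_t roundUp(std::size_t x, std::size_t a) {
  return (x + a - 1) / a * a;
}

// Aligned request size plus the larger of the alignment and the header size.
constexpr std::size_t neededSize(std::size_t size, std::size_t alignment) {
  return roundUp(size, kMinAlignment) +
         (alignment > kHeaderSize ? alignment : kHeaderSize);
}

enum class Backend { Primary, Secondary };

// Requests too large for any Primary size class go to the Secondary.
constexpr Backend route(std::size_t size, std::size_t alignment) {
  return neededSize(size, alignment) > kMaxPrimarySize ? Backend::Secondary
                                                       : Backend::Primary;
}

// A 24-byte malloc needs 32 bytes of payload plus 16 bytes for the header.
static_assert(neededSize(24, 16) == 48, "unexpected NeededSize");
```

Under these assumptions a 24‑byte request consumes a 48‑byte block, which shows how the header and alignment padding add a fixed per‑allocation cost that is proportionally larger for small objects.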

Memory Free Flow

The free process mirrors the allocation steps:

Retrieve the chunk header and perform checks (checksum, allocation‑type match, size mismatch, etc.).

If Quarantine is enabled and the block meets size criteria, place it into the thread‑local QuarantineCache; overflow moves blocks to the global quarantine cache, and further overflow recycles them back to Primary/Secondary.

If Quarantine is not used, free directly to Primary or Secondary based on the ClassId (0 → Secondary).

For Secondary blocks of at most 2 MiB, attempt to cache them in MapAllocatorCache; if the cache is full, unmap the cached blocks.

For Primary blocks, return them to the thread’s SizeClassAllocatorLocalCache. If the cache is full, half of its contents are transferred back to the region’s free list, possibly triggering madvise to release unused pages.
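The "drain half when full" behaviour of the local cache can be modelled as below. This is a hypothetical sketch: `LocalCacheSketch` is not the real SizeClassAllocatorLocalCache, blocks are represented by integer ids, and which half gets drained (here, the oldest) is an arbitrary choice made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of a per-thread cache for a single size class.
class LocalCacheSketch {
 public:
  LocalCacheSketch(std::size_t maxCached, std::vector<int> &regionFreeList)
      : maxCached_(maxCached), regionFreeList_(regionFreeList) {}

  // Free path: make room first if the cache is full, then keep the block.
  void deallocate(int block) {
    if (cached_.size() >= maxCached_) drainHalf();
    cached_.push_back(block);
  }

  std::size_t size() const { return cached_.size(); }

 private:
  // Hands half of the cached blocks back to the region's free list.
  // A real allocator may also release now-unused pages at this point.
  void drainHalf() {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(cached_.size() / 2);
    regionFreeList_.insert(regionFreeList_.end(), cached_.begin(),
                           cached_.begin() + n);
    cached_.erase(cached_.begin(), cached_.begin() + n);
  }

  std::size_t maxCached_;
  std::vector<int> &regionFreeList_;
  std::vector<int> cached_;
};
```

Draining only half, rather than everything, leaves the thread with blocks for its next allocations and amortises the cost of touching the shared free list over many frees.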

Scudo Configuration Options

Scudo is highly tunable. Configuration can be supplied via:

Compile‑time definition of SCUDO_DEFAULT_OPTIONS.

Statically, by defining the function extern "C" const char *__scudo_default_options() in the program; it returns the option string to be parsed.

Environment variable SCUDO_OPTIONS at runtime.

Standard mallopt API with Scudo‑specific keys.

Example static configuration:

extern "C" const char *__scudo_default_options() {
  // Disable the sized-delete check and the periodic release of memory to the OS.
  return "delete_size_mismatch=false:release_to_os_interval_ms=-1";
}

Example runtime configuration:

SCUDO_OPTIONS="delete_size_mismatch=false:release_to_os_interval_ms=-1" ./a.out
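Both examples use the same string format: key=value pairs separated by colons. To make the format concrete, here is a small illustrative parser. `parseOptions` is a hypothetical helper written for this article, not Scudo's own flag parser, and it accepts only the colon‑separated form shown in the examples.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper: splits "k1=v1:k2=v2" into a key/value map.
// Items without '=' or with an empty key are ignored.
std::map<std::string, std::string> parseOptions(const std::string &options) {
  std::map<std::string, std::string> result;
  std::size_t pos = 0;
  while (pos <= options.size()) {
    std::size_t end = options.find(':', pos);
    if (end == std::string::npos) end = options.size();
    const std::string item = options.substr(pos, end - pos);
    const std::size_t eq = item.find('=');
    if (eq != std::string::npos && eq > 0)
      result[item.substr(0, eq)] = item.substr(eq + 1);
    pos = end + 1;
  }
  return result;
}
```

For the string used above, this yields two entries: delete_size_mismatch mapped to "false" and release_to_os_interval_ms mapped to "-1".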

Typical mallopt keys and their meanings are documented in the official Scudo and LLVM documentation.

References:

https://llvm.org/docs/ScudoHardenedAllocator.html

https://source.android.com/devices/

https://zhuanlan.zhihu.com/p/235620563

https://juejin.cn/post/6914550038140026887
