
Large Folios in the Linux Kernel: Benefits, Implementations, and Future Directions

Large folios in the Linux kernel combine multiple pages to reduce TLB misses, page faults, and reclamation cost while enabling more efficient compression; they are supported by filesystems like XFS and bcachefs, and recent patches add multi‑size THP, swap‑in/out handling, TAO allocation, NUMA balancing, and debug tools, with OPPO’s production deployment showing performance gains and motivating broader adoption and fragmentation mitigation.

OPPO Kernel Craftsman

In the Linux kernel, a folio contains one or more pages; a folio that contains more than one page is called a large folio (or large page). Large folios bring several benefits. They reduce TLB misses (e.g., a 2 MiB folio can be PMD-mapped, and ARM64 can map a folio with contiguous PTEs). They reduce page faults (e.g., do_anonymous_page can map a whole large folio at once, avoiding faults on its remaining pages). They shorten the LRU lists and lower reclamation cost, because a large folio is reclaimed as a unit and needs less reverse-mapping work. Finally, they let zRAM/zsmalloc compress at a larger granularity, which lowers CPU utilization and improves the compression ratio.
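The TLB and page-fault savings can be made concrete with a little arithmetic. The sketch below is only a back-of-the-envelope model: it assumes 4 KiB base pages and counts how many translations (or first-touch faults) are needed to cover a 2 MiB region at each mapping size. The helper name `translations_needed` is made up for illustration.

```python
KIB = 1024
MIB = 1024 * KIB


def translations_needed(region_bytes: int, mapping_bytes: int) -> int:
    """Mappings needed to cover a region, rounding up to whole mappings."""
    return -(-region_bytes // mapping_bytes)  # integer ceiling division


region = 2 * MIB
for label, size in (
    ("4 KiB PTE", 4 * KIB),
    ("64 KiB contiguous PTE", 64 * KIB),
    ("2 MiB PMD", 2 * MIB),
):
    # One TLB entry (or one fault) per mapping in this simplified model.
    print(f"{label:>22}: {translations_needed(region, size)}")
```

Real TLB behaviour depends on the hardware and on whether the kernel can actually use the larger mapping, so these numbers are upper bounds on the savings rather than measurements.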

Filesystems that support large folios include afs, bcachefs, erofs (for non-compressed files), and xfs. A filesystem declares support by calling mapping_set_large_folios() on the address space; when mapping_large_folio_support() returns true, the page cache may allocate large folios and insert them into the mapping's xarray.
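As a rough illustration of how that capability flag gates allocation, here is a simplified userspace model of choosing a folio order for a given page-cache index. It is not kernel code: `pick_folio_order`, its parameters, and the default `max_order=9` are assumptions made for this sketch. The only behaviour it models is that an unsupported mapping always gets order 0 and that a folio must be naturally aligned to its own size within the file.

```python
def lowest_set_bit(n: int) -> int:
    """Position of the least significant set bit of a positive integer."""
    return (n & -n).bit_length() - 1


def pick_folio_order(index: int, requested_order: int,
                     large_folio_support: bool, max_order: int = 9) -> int:
    """Pick a folio order for page-cache index `index` (toy model).

    max_order is a hypothetical cap chosen for this example.
    """
    if not large_folio_support:
        return 0  # mapping never opted in: single pages only
    order = min(requested_order, max_order)
    # A folio of 2**order pages must start at a multiple of 2**order;
    # if the index is misaligned, shrink to the largest order that fits.
    if index & ((1 << order) - 1):
        order = lowest_set_bit(index)
    return order
```

For example, with support enabled, a request for order 4 at index 16 keeps order 4, while the same request at index 8 falls back to order 3.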

For anonymous pages, several patch series have been contributed. Ryan Roberts (Arm) contributed three of them: multi-size THP (mTHP), which allows large folios of various sizes to be allocated on page fault; Transparent Contiguous PTEs for ARM64, which sets the CONT bit so that 16 contiguous PTEs can share a single TLB entry; and a swap-out mTHP series that avoids splitting large folios during reclaim unless they are already partially unmapped. A swap-in series from OPPO (Chuanhua Han, Barry Song) swaps large folios in directly, which preserves the mTHP benefits on swap-heavy Android and embedded workloads.
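To illustrate the idea of allocating "various sized" folios on fault, the following sketch models one plausible selection rule: try the largest enabled order whose naturally aligned range around the faulting address fits entirely inside the VMA, and fall back to smaller orders otherwise. This is a simplified model under stated assumptions (4 KiB base pages, hypothetical function and parameter names); the real fault path also has to check that the covered PTEs are empty and that the allocation succeeds, which is omitted here.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB base pages


def pick_fault_order(addr: int, vma_start: int, vma_end: int,
                     enabled_orders: set[int]) -> int:
    """Return the largest enabled order usable at `addr`, else 0 (toy model)."""
    for order in sorted(enabled_orders, reverse=True):
        size = 1 << (PAGE_SHIFT + order)
        start = addr & ~(size - 1)  # naturally aligned start of the folio
        if start >= vma_start and start + size <= vma_end:
            return order
    return 0  # fall back to a single base page


# Order 4 is 16 pages = 64 KiB, the size one ARM64 contiguous-PTE block covers.
print(pick_fault_order(0x1A000, 0x10000, 0x30000, {2, 4}))
print(pick_fault_order(0x1A000, 0x11000, 0x30000, {2, 4}))
```

In the first call the 64 KiB aligned range fits in the VMA, so order 4 is chosen; in the second the VMA starts mid-block, so the model falls back to order 2 (16 KiB).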

Additional work covers mTHP-friendly compression in zsmalloc/zram (Tangquan Zheng); the TAO allocator optimization (Yu Zhao), which abstracts memory into 4 KiB and large-folio zones to improve allocation and compaction; per-order mTHP allocation and swap-out counters (Barry Song); a debugfs interface for splitting a folio to any lower order (Zi Yan); NUMA-balancing support for multi-size THP (Baolin Wang); and an enhancement to MADV_FREE lazy freeing that avoids splitting folios (Lance Yang).
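The benefit of compressing at a larger granularity can be demonstrated in userspace. The sketch below is not how zram or zsmalloc work internally; it uses Python's zlib as a stand-in compressor and synthetic data in which every 4 KiB page shares half of its content with its neighbours. It compares compressing 16 pages one by one against compressing the same 64 KiB as a single unit.

```python
import random
import zlib

PAGE = 4096
PAGES_PER_FOLIO = 16  # 16 x 4 KiB = 64 KiB

rng = random.Random(0)  # fixed seed so the data is reproducible


def random_bytes(n: int) -> bytes:
    return bytes(rng.getrandbits(8) for _ in range(n))


# Synthetic folio: each page = 2 KiB shared content + 2 KiB unique content.
shared = random_bytes(PAGE // 2)
folio = b"".join(shared + random_bytes(PAGE // 2)
                 for _ in range(PAGES_PER_FOLIO))


def compressed_per_page(data: bytes) -> int:
    """Total size when every 4 KiB page is compressed independently."""
    return sum(len(zlib.compress(data[i:i + PAGE]))
               for i in range(0, len(data), PAGE))


def compressed_whole(data: bytes) -> int:
    """Size when the whole buffer is compressed in one call."""
    return len(zlib.compress(data))


print("per-page:", compressed_per_page(folio))
print("whole   :", compressed_whole(folio))
```

Compressing page by page cannot exploit redundancy across pages and pays the per-call setup cost 16 times, which is the intuition behind both the ratio and the CPU claims. How large the gain is on real workloads depends on the data and the compressor.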

The article also notes OPPO’s deployment of dynamic large pages (mainly 64 KiB leveraging CONT‑PTE) in production kernels since 2023, showing performance and user‑experience gains, and outlines future directions: broader file‑system support, reliable allocation guarantees similar to TAO, mainline swap‑in support, hardware‑offload compression, zswap large‑folio support, swap‑fragmentation solutions, balancing performance gains against memory fragmentation, and handling user‑space partial unmapping of large folios.

Code example showing the new per‑order mTHP stats files:

anon_alloc anon_alloc_fallback anon_swpout anon_swpout_fallback
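Counters like these can be collected with a few lines of userspace code. The sketch below assumes the counters are exposed as one small text file per counter under `/sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats/`; it reads whatever files are present rather than hard-coding names, since the exact set of counters may differ between the patch series and a given kernel version. The function name `read_mthp_stats` is made up for this example.

```python
from pathlib import Path

SYSFS_THP = Path("/sys/kernel/mm/transparent_hugepage")


def read_mthp_stats(base: Path = SYSFS_THP) -> dict[str, dict[str, int]]:
    """Return {"hugepages-64kB": {"counter_name": value, ...}, ...}.

    Sizes without a stats directory are skipped; a missing base
    directory simply yields an empty result.
    """
    stats: dict[str, dict[str, int]] = {}
    for size_dir in sorted(Path(base).glob("hugepages-*kB")):
        stats_dir = size_dir / "stats"
        if not stats_dir.is_dir():
            continue
        stats[size_dir.name] = {
            f.name: int(f.read_text().strip())
            for f in sorted(stats_dir.iterdir())
            if f.is_file()
        }
    return stats
```

Calling `read_mthp_stats()` on a kernel with these counters and comparing, for each size, the allocation counter against its fallback counter gives a quick view of how often large-folio allocation is succeeding.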
