Linux Shared Memory (shmem) Deep Dive: Architecture, Implementation, and Practice
Linux's shmem subsystem provides pages that are a hybrid of anonymous and file-backed memory. These pages underpin several shared-memory scenarios: parent-child communication, IPC, tmpfs, Android ashmem, and memfd. Kernel users create shmem files through APIs such as shmem_file_setup, page faults are resolved through the page cache and the swap cache, and a dedicated reclaim path swaps shmem pages out when memory is tight.
In Linux, each process's virtual address space is divided into kernel space and user space. Kernel space is shared across all processes, whereas each process's user space is isolated from the others. Linux therefore provides shared memory mechanisms so that processes can share data. This article explores the Linux shared memory (shmem) subsystem and examines how shmem pages bridge the gap between anonymous pages and file-backed pages.
Application Scenarios: shmem is used in several scenarios: (1) shared anonymous mappings for communication between a parent and its child processes; (2) IPC shared memory for sharing between arbitrary processes; (3) tmpfs as an in-memory filesystem; (4) ashmem, Android's anonymous shared memory; (5) memfd for creating anonymous files that can be shared. Key kernel APIs include shmem_kernel_file_setup, shmem_file_setup, shmem_zero_setup, and shmem_read_mapping_page_gfp.
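Scenario (1) can be observed from user space. The following is a minimal sketch, assuming a Unix system where Python's mmap.mmap(-1, ...) creates a shared anonymous mapping (its default on Unix); the function name share_value_with_child is hypothetical and chosen only for illustration.

```python
import mmap
import os
import struct


def share_value_with_child(value: int) -> int:
    """Child writes `value` into a shared anonymous mapping; parent reads it back."""
    # fileno=-1 asks for an anonymous mapping. On Unix the default flag is
    # MAP_SHARED, so the mapping stays shared across fork().
    buf = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        try:
            # Child: the first write faults in a page of the shared mapping.
            struct.pack_into("<Q", buf, 0, value)
        finally:
            os._exit(0)
    os.waitpid(pid, 0)
    # Parent: reads the value the child stored in the same physical page.
    (seen,) = struct.unpack_from("<Q", buf, 0)
    buf.close()
    return seen


if __name__ == "__main__":
    print(share_value_with_child(42))
```

With a private mapping the parent would not see the child's write; the shared mapping is what makes the write visible across the fork.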
Page Types and Characteristics: shmem pages combine features of anonymous and file-backed pages. Like anonymous pages, they carry the PG_swapbacked flag, which makes them eligible for swap. Like file pages, they belong to a file and live in its page cache, with inode->i_mapping->a_ops = &shmem_aops. Despite this hybrid nature, shmem pages are placed on the anonymous LRU list, because they are reclaimed by swapping rather than by writing back to a backing file.
Memory Sharing Principle: Taking memfd as an example, sharing works as follows: (1) one process creates a file descriptor with memfd_create; (2) it passes the fd to other processes over a Unix socket (the fd number may differ in each process, but every copy refers to the same file); (3) each process maps the file with mmap; (4) the first write access triggers a page fault, and the kernel allocates a physical page, adds it to the file's page cache, and maps it into the faulting process's virtual address space; (5) when another process later accesses the same offset, its page fault finds the existing page in the page cache and maps it, so both processes share the same physical page.
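Page Fault Handling: The workflow is: (1) look up the page cache at the faulting offset; (2) if the slot holds a swap entry, the page was swapped out earlier, so look it up in the swap cache; (3) on a swap cache miss, read the page back from the swap device; (4) if the slot is empty, allocate a new folio and add it to the page cache; (5) add the folio to the anonymous LRU; (6) map it into the process's page tables.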
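The lookup order can be made concrete with a toy model. The sketch below is not kernel code: ToyShmemFile, SwapEntry, and the dictionaries standing in for the page cache, swap cache, and swap device are all hypothetical, and the page-table mapping step is left out.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # illustrative; the real page size depends on the architecture


@dataclass(frozen=True)
class SwapEntry:
    """Stand-in for a swap entry stored in a page cache slot."""
    slot: int


@dataclass
class ToyShmemFile:
    page_cache: dict = field(default_factory=dict)   # offset -> bytearray or SwapEntry
    swap_cache: dict = field(default_factory=dict)   # SwapEntry -> bytearray
    swap_device: dict = field(default_factory=dict)  # SwapEntry -> bytes
    anon_lru: list = field(default_factory=list)     # offsets of resident pages

    def fault(self, offset: int):
        """Resolve a fault at `offset`; return (page, how_it_was_found)."""
        entry = self.page_cache.get(offset)
        if isinstance(entry, bytearray):
            return entry, "page cache hit"
        if isinstance(entry, SwapEntry):
            # The page was swapped out: try the swap cache, then the device.
            page = self.swap_cache.pop(entry, None)
            how = "swap cache hit"
            if page is None:
                page = bytearray(self.swap_device.pop(entry))
                how = "swapped in"
        else:
            # Never touched before: allocate a zero-filled page.
            page = bytearray(PAGE_SIZE)
            how = "allocated"
        self.page_cache[offset] = page
        if offset not in self.anon_lru:
            self.anon_lru.append(offset)
        return page, how


if __name__ == "__main__":
    f = ToyShmemFile()
    print(f.fault(0)[1])
    print(f.fault(0)[1])
```

Keeping the swap entry in the page cache slot is what lets a later fault from any process find where the data went.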
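Page Reclamation: Reclaiming a shmem page involves: isolating it from the LRU, acquiring the page lock, removing all page table mappings via rmap, allocating a swap entry and adding the page to the swap cache, replacing the page's slot in the page cache with the swap entry, writing the page to the swap device, removing it from the swap cache once writeback completes, releasing the page lock, and returning the page to the buddy system. The key difference from anonymous pages is where the swap entry is recorded: for an anonymous page, the original PTE is replaced with the swap entry, whereas for a shmem page the PTEs are simply cleared and the swap entry is stored in the page cache slot the page used to occupy.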
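The difference in where the swap entry ends up can be shown with a small toy model. This is an illustrative sketch only: page tables and the page cache are plain dictionaries, the tuple ("swap", slot) stands in for a swap entry, and both function names are hypothetical. Locking, LRU isolation, and the swap cache are omitted.

```python
def reclaim_shmem_page(page_cache, page_tables, offset, swap_device, slot):
    """Toy reclaim of the shmem page at `offset` of a file."""
    page = page_cache[offset]
    # rmap walk: clear every PTE that maps this page, in every process.
    for table in page_tables:
        for vaddr, mapped in table.items():
            if mapped is page:
                table[vaddr] = None
    # The swap entry is recorded in the page cache slot, not in the PTEs.
    page_cache[offset] = ("swap", slot)
    swap_device[slot] = bytes(page)


def reclaim_anon_page(page_table, vaddr, swap_device, slot):
    """Toy reclaim of an anonymous page mapped at `vaddr`."""
    page = page_table[vaddr]
    swap_device[slot] = bytes(page)
    # The PTE itself is replaced with the swap entry.
    page_table[vaddr] = ("swap", slot)


if __name__ == "__main__":
    page = bytearray(b"shared")
    cache = {0: page}
    proc_a, proc_b = {0x1000: page}, {0x8000: page}
    swap = {}
    reclaim_shmem_page(cache, [proc_a, proc_b], 0, swap, slot=3)
    print(cache, proc_a, proc_b, swap)
```

Because a shmem page may be mapped by many processes, or by none, recording the swap entry once in the file's page cache gives every later fault a single place to look.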
tmpfs: A temporary filesystem that keeps all of its data in memory, so the data is lost when the machine loses power. Its characteristics include fast data access, the ability to swap pages out to a swap device (such as zram) when memory runs low, and sparse allocation: reads of unwritten regions return zeros without allocating physical pages.
OPPO Kernel Craftsman
Sharing Linux kernel-related cutting-edge technology, technical articles, technical news, and curated tutorials