Reverse Mapping (RMAP) Mechanism in the Linux Kernel: Concepts, Data Structures, and Implementation
The article explains Linux kernel reverse‑mapping (RMAP), describing its purpose for efficient page‑reclaim, the underlying data structures such as vm_area_struct, anon_vma and anon_vma_chain, and shows how anonymous and file pages are tracked and unmapped through detailed code examples.
Reverse mapping (RMAP) is a kernel mechanism that maintains links from a physical page's struct page back to the virtual memory areas (VMAs) that map it. With those links the kernel can quickly find and update every page‑table entry that references a shared page, which is what makes reclaiming such pages fast.
Unlike forward mapping (virtual → physical), RMAP provides a physical‑to‑virtual mapping, allowing the kernel to locate all user‑space PTEs that map a given page, which is essential for page‑reclaim, migration, and KSM.
1. Overview
A physical page can be mapped by multiple processes, but a virtual page maps to only one physical page. When a page needs to be reclaimed, the kernel must find every process using it and break those mappings.
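The idea can be illustrated with a small user-space model. This is only a sketch under stated assumptions: toy_page, toy_pte, toy_map() and toy_unmap_all() are hypothetical names, not kernel APIs, and the model stores direct pointers to PTEs for brevity, whereas the kernel reaches the PTEs indirectly through the VMA-based structures described in the following sections.

```c
#include <stddef.h>

#define TOY_MAX_MAPPERS 8

/* Hypothetical stand-in for one page-table entry in some process. */
struct toy_pte {
    int present;           /* 1 while the mapping is live */
    unsigned long pfn;     /* physical frame the entry points at */
};

/* Hypothetical stand-in for struct page, with explicit reverse links. */
struct toy_page {
    unsigned long pfn;
    int mapcount;                              /* number of PTEs mapping this frame */
    struct toy_pte *mappers[TOY_MAX_MAPPERS];  /* reverse links: frame -> PTEs */
};

/* Install the forward mapping and record the reverse link. */
static int toy_map(struct toy_page *page, struct toy_pte *pte)
{
    if (page->mapcount >= TOY_MAX_MAPPERS)
        return -1;
    pte->present = 1;
    pte->pfn = page->pfn;
    page->mappers[page->mapcount++] = pte;
    return 0;
}

/* Reclaim path: follow the reverse links and clear every mapping. */
static int toy_unmap_all(struct toy_page *page)
{
    int cleared = 0;

    while (page->mapcount > 0) {
        struct toy_pte *pte = page->mappers[--page->mapcount];

        pte->present = 0;
        cleared++;
    }
    return cleared;
}
```

Without the mappers array, toy_unmap_all() would have to scan every page table in the system to find the entries that point at the frame, which is the cost that reverse mapping avoids.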
2. Core Data Structures
The main structures involved are:
VMA (vm_area_struct) – describes a contiguous region of a process’s address space.
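    /* Simplified: only the fields relevant to reverse mapping are shown */
    struct vm_area_struct {
        unsigned long vm_start;           /* first address of the region */
        unsigned long vm_end;             /* first address past the region */
        struct mm_struct *vm_mm;          /* address space the VMA belongs to */
        pgprot_t vm_page_prot;            /* access permissions */
        unsigned long vm_flags;
        struct list_head anon_vma_chain;  /* list of anon_vma_chain entries */
        struct anon_vma *anon_vma;        /* anon_vma used for this VMA's new pages */
    };
anon_vma – connects a struct page to the VMAs that map it and contains a red‑black tree of anon_vma_chain entries.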
    // Simplified definition
    struct anon_vma {
        struct anon_vma *root;
        struct rw_semaphore rwsem;
        atomic_t refcount;
        struct anon_vma *parent;
        struct rb_root_cached rb_root;
    };
anon_vma_chain – links a VMA to its anon_vma and is stored both in the VMA’s list and in the anon_vma’s RB‑tree.
    // Simplified definition
    struct anon_vma_chain {
        struct vm_area_struct *vma;
        struct anon_vma *anon_vma;
        struct list_head same_vma; /* locked by mmap_sem & page_table_lock */
        struct rb_node rb;         /* locked by anon_vma->rwsem */
    };
3. Anonymous Page Reverse Mapping
When a page fault creates the first anonymous page in a VMA, the kernel makes sure the VMA has an anon_vma (anon_vma_prepare() allocates one if it is missing) and inserts an anon_vma_chain into both the VMA’s list and the anon_vma’s RB‑tree. The new page’s mapping field is then pointed at that anon_vma. Together these links let rmap_walk_anon() later enumerate all VMAs that may reference the page.
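The linking step can be sketched in user space as follows. This is a minimal model, not kernel code: every toy_* name is hypothetical, plain singly linked lists stand in for both the VMA's list_head and the anon_vma's RB‑tree, and locking and reference counting are left out entirely.

```c
#include <stddef.h>
#include <stdlib.h>

struct toy_anon_vma;
struct toy_avc;

struct toy_vma {
    unsigned long vm_start, vm_end;
    struct toy_anon_vma *anon_vma;   /* set on the first anonymous fault */
    struct toy_avc *chain;           /* stands in for vma->anon_vma_chain */
};

struct toy_avc {
    struct toy_vma *vma;
    struct toy_anon_vma *anon_vma;
    struct toy_avc *same_vma;        /* next entry on the VMA's list */
    struct toy_avc *same_anon_vma;   /* next entry in the anon_vma (an RB-tree in the kernel) */
};

struct toy_anon_vma {
    struct toy_avc *head;
};

struct toy_page {
    struct toy_anon_vma *anon_vma;   /* stands in for the page's mapping field */
};

/* Rough analogue of anon_vma_prepare(): allocate and link once per VMA. */
static int toy_prepare(struct toy_vma *vma)
{
    struct toy_anon_vma *av;
    struct toy_avc *avc;

    if (vma->anon_vma)
        return 0;
    av = calloc(1, sizeof(*av));
    avc = calloc(1, sizeof(*avc));
    if (!av || !avc) {
        free(av);
        free(avc);
        return -1;
    }
    avc->vma = vma;
    avc->anon_vma = av;
    avc->same_vma = vma->chain;      /* insert into the VMA's list */
    vma->chain = avc;
    avc->same_anon_vma = av->head;   /* insert into the anon_vma's collection */
    av->head = avc;
    vma->anon_vma = av;
    return 0;
}

/* Anonymous fault: make sure the VMA is linked, then tag the new page. */
static int toy_anon_fault(struct toy_vma *vma, struct toy_page *page)
{
    if (toy_prepare(vma))
        return -1;
    page->anon_vma = vma->anon_vma;
    return 0;
}

/* Analogue of the walk: page -> anon_vma -> chain entries -> VMAs. */
static int toy_count_vmas(const struct toy_page *page)
{
    int n = 0;

    for (const struct toy_avc *avc = page->anon_vma->head; avc; avc = avc->same_anon_vma)
        n++;
    return n;
}
```

Note that every anonymous page faulted into the same VMA shares one anon_vma in this model, so the per-page cost of reverse mapping is a single pointer.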
    // Fork path: duplicating the parent's VMAs and linking anon_vma (simplified)
    static __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) {
        ...
        tmp = vm_area_dup(mpnt);           /* copy the parent's VMA */
        tmp->vm_mm = mm;                   /* the copy belongs to the child */
        if (anon_vma_fork(tmp, mpnt))      /* set up the child's reverse-mapping links */
            ...
        __vma_link_rb(mm, tmp, rb_link, rb_parent);
        retval = copy_page_range(mm, oldmm, mpnt);  /* copy the page-table entries */
        ...
    }
During fork, anon_vma_fork() creates the necessary anon_vma_chain entries for the child’s VMAs, preserving the reverse‑mapping links.
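What anon_vma_fork() sets up can be modelled roughly as two steps: the child VMA is first linked to every anon_vma the parent VMA is linked to (so pages shared after fork remain reachable from the child), and it then receives an anon_vma of its own for pages it allocates later. The sketch below is a user-space approximation with hypothetical toy_* names; a fixed array replaces the anon_vma_chain list, and the kernel's locking, reference counting and anon_vma reuse logic are omitted.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_LINKS 8

struct toy_anon_vma {
    struct toy_anon_vma *root;     /* top of the fork hierarchy */
    struct toy_anon_vma *parent;   /* anon_vma of the parent process's VMA */
};

struct toy_vma {
    struct toy_anon_vma *anon_vma;              /* where this VMA's new pages are attached */
    struct toy_anon_vma *links[TOY_MAX_LINKS];  /* stands in for the anon_vma_chain list */
    int nr_links;
};

static int toy_anon_vma_fork(struct toy_vma *child, const struct toy_vma *parent)
{
    struct toy_anon_vma *av;

    if (!parent->anon_vma)
        return 0;                  /* parent never faulted anonymous pages */
    if (parent->nr_links >= TOY_MAX_LINKS)
        return -1;                 /* no room for the child's own link */

    /* Step 1: inherit a link to every anon_vma the parent VMA is linked to. */
    for (int i = 0; i < parent->nr_links; i++)
        child->links[i] = parent->links[i];
    child->nr_links = parent->nr_links;

    /* Step 2: give the child its own anon_vma for pages it creates later. */
    av = calloc(1, sizeof(*av));
    if (!av)
        return -1;
    av->root = parent->anon_vma->root;
    av->parent = parent->anon_vma;
    child->anon_vma = av;
    child->links[child->nr_links++] = av;
    return 0;
}
```

After the call, a walk that starts from one of the parent's anon_vmas can still reach the child VMA, while pages the child faults in after a copy-on-write break hang off the child's own anon_vma and are not searched for in the parent.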
4. RMAP Application
Typical kernel scenarios that use RMAP include:
kswapd page‑reclaim, which must unmap all user PTEs of an anonymous page.
Page migration, which also needs to break every user mapping.
The central function is try_to_unmap(), which walks the reverse‑mapping structures and calls try_to_unmap_one() for each VMA.
    /* Simplified, from an older kernel; the signature and return codes differ in recent versions */
    int try_to_unmap(struct page *page, enum ttu_flags flags) {
        struct rmap_walk_control rwc = {
            .rmap_one = try_to_unmap_one,          /* called once per VMA */
            .arg = (void *)flags,
            .done = page_not_mapped,               /* stop once no mapping is left */
            .anon_lock = page_lock_anon_vma_read,
        };
        int ret = rmap_walk(page, &rwc);
        if (ret != SWAP_MLOCK && !page_mapped(page))
            ret = SWAP_SUCCESS;
        return ret;
    }
Depending on the page type, rmap_walk() dispatches to rmap_walk_anon(), rmap_walk_file(), or rmap_walk_ksm().
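The callback-driven walk can be sketched in user space as below. This is an illustrative model only: the toy_* types and functions are hypothetical, the list of candidate VMAs is a plain array rather than an anon_vma or address_space lookup, and the callbacks return bool instead of the SWAP_* codes used in the kernel code above.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_vma {
    unsigned long vm_start;
    bool mapped;                  /* does this VMA still have a PTE for the page? */
};

struct toy_page {
    struct toy_vma **vmas;        /* VMAs found through the reverse mapping */
    size_t nr_vmas;
};

struct toy_walk_control {
    void *arg;
    /* Per-VMA action; returning false aborts the walk. */
    bool (*rmap_one)(struct toy_page *page, struct toy_vma *vma, void *arg);
    /* Optional early exit, checked after each VMA. */
    bool (*done)(struct toy_page *page);
};

static void toy_rmap_walk(struct toy_page *page, struct toy_walk_control *rwc)
{
    for (size_t i = 0; i < page->nr_vmas; i++) {
        if (!rwc->rmap_one(page, page->vmas[i], rwc->arg))
            break;
        if (rwc->done && rwc->done(page))
            break;
    }
}

/* Clear one mapping and count it in the int that arg points to. */
static bool toy_unmap_one(struct toy_page *page, struct toy_vma *vma, void *arg)
{
    int *count = arg;

    (void)page;
    if (vma->mapped) {
        vma->mapped = false;
        (*count)++;
    }
    return true;
}

static bool toy_page_not_mapped(struct toy_page *page)
{
    for (size_t i = 0; i < page->nr_vmas; i++)
        if (page->vmas[i]->mapped)
            return false;
    return true;
}

/* Returns true when no mapping of the page is left after the walk. */
static bool toy_try_to_unmap(struct toy_page *page, int *nr_unmapped)
{
    struct toy_walk_control rwc = {
        .arg = nr_unmapped,
        .rmap_one = toy_unmap_one,
        .done = toy_page_not_mapped,
    };

    toy_rmap_walk(page, &rwc);
    return toy_page_not_mapped(page);
}
```

Keeping the per-VMA action in a callback is what lets the same walker serve reclaim, migration and other users: only rmap_one and done change between them.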
5. Supplementary Topics
5.1 KSM Reverse Mapping
KSM (Kernel Samepage Merging) merges identical anonymous pages across processes. For KSM pages, the kernel uses struct rmap_item entries linked via the page’s stable_node to track all VMA references.
5.2 File Page Reverse Mapping
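File‑backed pages use the file’s address_space (its i_mmap interval tree) to locate VMAs. The virtual address of a file page within a VMA can be computed as vma->vm_start + ((page->index - vma->vm_pgoff) << PAGE_SHIFT), because page->index and vm_pgoff are both counted in pages while vm_start is a byte address. With that address the kernel can find and unmap the corresponding PTEs.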
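The address calculation can be written out as a small stand-alone function. This is a sketch under assumptions: toy_vma and toy_vma_address() are hypothetical names, the page size is fixed at 4 KiB through TOY_PAGE_SHIFT, and the range check simply rejects file offsets that fall outside the VMA.

```c
#include <stdbool.h>

#define TOY_PAGE_SHIFT 12   /* assumes 4 KiB pages */

struct toy_vma {
    unsigned long vm_start;  /* first byte address of the mapping */
    unsigned long vm_end;    /* first byte address past the mapping */
    unsigned long vm_pgoff;  /* file offset of vm_start, in pages */
};

/*
 * Translate a page's offset within the file (in pages) into the virtual
 * address at which this VMA maps it. Returns false if the VMA does not
 * cover that part of the file.
 */
static bool toy_vma_address(const struct toy_vma *vma, unsigned long pgoff,
                            unsigned long *addr)
{
    unsigned long a;

    if (pgoff < vma->vm_pgoff)
        return false;
    a = vma->vm_start + ((pgoff - vma->vm_pgoff) << TOY_PAGE_SHIFT);
    if (a >= vma->vm_end)
        return false;
    *addr = a;
    return true;
}
```

For example, a four-page VMA at 0x400000 that maps the file starting at page 16 places file page 18 two pages into the mapping, at 0x402000.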
Overall, RMAP provides the kernel with an efficient way to trace from a physical page back to every virtual mapping, which is crucial for memory‑reclaim, migration, and shared‑memory mechanisms.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.