Understanding Page Fault Handling and Virtual Memory Management in the uCore Kernel
This article explains how Linux-like operating systems use virtual memory and the MMU to map virtual addresses to physical memory, describes the data structures (vma_struct and mm_struct) used by uCore, details the page‑fault handling flow, classifies fault types, and shows how these mechanisms affect system performance.
When a program accesses a page that is not currently resident in physical memory, a page-fault exception is triggered; the operating system must handle the exception and, if the access is legitimate, make the page resident, which may mean reading it from disk. The uCore kernel implements this crucial part of virtual memory management through the do_pgfault function.
Before diving into page faults, it is important to understand the memory-management architecture that Linux and uCore share. Processes do not access physical memory directly; instead, the Memory Management Unit (MMU) translates virtual addresses to physical addresses, allowing each process to believe it has a large, contiguous address space while the actual physical memory is limited.
In uCore, the virtual-memory layout of a process is described by two key data structures:
    struct vma_struct {
        struct mm_struct *vm_mm;
        uintptr_t vm_start;      // start address of the VMA
        uintptr_t vm_end;        // end address of the VMA
        uint32_t vm_flags;       // permission flags
        list_entry_t list_link;  // sorted by start address
    };
Each vma_struct represents one contiguous virtual-memory region, aligned to the page size; the regions of a process do not overlap. Permission flags are defined as:
    #define VM_READ  0x00000001  // region may be read
    #define VM_WRITE 0x00000002  // region may be written
    #define VM_EXEC  0x00000004  // region may be executed
The higher‑level structure mm_struct aggregates all VMAs belonging to a process and holds the page‑directory pointer:
    struct mm_struct {
        list_entry_t mmap_list;         // list of VMAs
        struct vma_struct *mmap_cache;  // fast-path cache
        pde_t *pgdir;                   // page-directory table
        int map_count;                  // number of VMAs
        void *sm_priv;                  // swap manager private data
    };
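To make the relationship between the two structures concrete, here is a minimal user-space sketch of looking up the VMA that covers an address. It is not uCore's code: uCore links VMAs through list_entry_t, whereas this sketch assumes a plain singly linked list, and the names vma, mm, next, and find_vma_sketch are hypothetical. It also assumes that vm_end is exclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VM_READ  0x00000001u
#define VM_WRITE 0x00000002u
#define VM_EXEC  0x00000004u

/* Simplified stand-ins for vma_struct and mm_struct (hypothetical layout). */
struct vma {
    uintptr_t vm_start;  /* inclusive start address */
    uintptr_t vm_end;    /* exclusive end address (assumption of this sketch) */
    uint32_t vm_flags;   /* combination of VM_READ, VM_WRITE, VM_EXEC */
    struct vma *next;    /* next VMA, sorted by vm_start */
};

struct mm {
    struct vma *mmap_list;   /* head of the sorted VMA list */
    struct vma *mmap_cache;  /* most recently matched VMA */
};

/* Return the VMA containing addr, or NULL if no VMA covers it. */
struct vma *find_vma_sketch(struct mm *mm, uintptr_t addr) {
    assert(mm != NULL);
    struct vma *hit = mm->mmap_cache;
    /* Fast path: consecutive lookups often land in the same region. */
    if (hit != NULL && hit->vm_start <= addr && addr < hit->vm_end) {
        return hit;
    }
    for (hit = mm->mmap_list; hit != NULL; hit = hit->next) {
        if (addr < hit->vm_start) {
            return NULL;  /* list is sorted, so no later VMA can match */
        }
        if (addr < hit->vm_end) {
            mm->mmap_cache = hit;  /* remember for the next lookup */
            return hit;
        }
    }
    return NULL;
}
```

Because the list is sorted by start address, the walk can stop as soon as it passes the address, and the cache spares the walk entirely when successive lookups hit the same region.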
During kernel initialization, functions such as pmm_init, vmm_init, ide_init, and swap_init set up physical-memory management, virtual-memory management, the disk driver, and the swap subsystem. The page-fault handling path in uCore follows:
trap → trap_dispatch → pgfault_handler → do_pgfault
The do_pgfault routine examines the faulting address stored in the CR2 register and the error code to determine whether the address lies within a valid VMA and whether the access permissions are satisfied. If the address is valid but unmapped, a free physical page is allocated, the page table is updated, the TLB is flushed, and execution resumes at the faulting instruction. If the address is outside any VMA, the kernel treats it as an illegal access and terminates the process.
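The permission check described above can be sketched in a few lines. This is a simplified illustration rather than uCore's do_pgfault: it assumes the fault address has already been matched to a VMA, it uses the x86 page-fault error-code bits (bit 0 present, bit 1 write, bit 2 user mode), and the names pf_action and check_fault are hypothetical. Copy-on-write is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define VM_READ  0x00000001u
#define VM_WRITE 0x00000002u
#define VM_EXEC  0x00000004u

/* x86 page-fault error-code bits (low three bits). */
#define PF_PRESENT 0x1u  /* 0: page not present, 1: protection violation */
#define PF_WRITE   0x2u  /* 0: read access, 1: write access */
#define PF_USER    0x4u  /* 0: supervisor mode, 1: user mode */

enum pf_action {
    PF_MAP_PAGE,  /* legal access to an unmapped page: allocate and map */
    PF_REJECT     /* access not permitted */
};

/* Decide how to handle a fault whose address lies inside a VMA
 * with the given vm_flags. */
enum pf_action check_fault(uint32_t error_code, uint32_t vm_flags) {
    /* Only the three flags from the article are modeled here. */
    assert((vm_flags & ~(VM_READ | VM_WRITE | VM_EXEC)) == 0);

    if (error_code & PF_PRESENT) {
        /* The page is mapped, so the hardware refused the access itself.
         * A kernel with copy-on-write would inspect this case further;
         * this sketch simply rejects it. */
        return PF_REJECT;
    }
    if (error_code & PF_WRITE) {
        /* Write to an unmapped page: the VMA must be writable. */
        return (vm_flags & VM_WRITE) ? PF_MAP_PAGE : PF_REJECT;
    }
    /* Read or instruction fetch from an unmapped page. */
    return (vm_flags & (VM_READ | VM_EXEC)) ? PF_MAP_PAGE : PF_REJECT;
}
```

For example, check_fault(PF_WRITE | PF_USER, VM_READ) returns PF_REJECT, because a user-mode write to a read-only region is not a legal access even though the address itself is valid.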
Page faults are classified into three categories:
Hard (major) page fault – the required page is not in memory and must be read from disk or swap.
Soft (minor) page fault – the page is already in memory (e.g., shared library or mmap) but the mapping is missing.
Invalid page fault – the address is illegal (e.g., null-pointer dereference); on Linux the process receives SIGSEGV.
The handling logic branches accordingly: invalid addresses end in a segmentation fault and termination of the process; valid addresses trigger demand paging, swap-in, or copy-on-write (COW) actions depending on the page's state and permission flags.
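The three categories can be expressed as a small decision function. This is a conceptual sketch only: the page_state values and the name classify_fault are hypothetical, and a real kernel derives this information from the page-table entry and the VMA rather than from a ready-made enum.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the data for the faulting page currently lives (hypothetical). */
enum page_state {
    PAGE_NEVER_TOUCHED,   /* anonymous page with no contents yet */
    PAGE_RESIDENT,        /* data is in RAM, only the mapping is missing */
    PAGE_SWAPPED_OUT,     /* data must be read back from disk or swap */
    PAGE_SHARED_READONLY  /* mapped read-only and shared: COW candidate */
};

enum fault_kind { FAULT_INVALID, FAULT_MINOR, FAULT_MAJOR };

/* Classify a fault from the VMA lookup result, the permission check,
 * and the state of the page behind the faulting address. */
enum fault_kind classify_fault(bool in_vma, bool access_allowed,
                               enum page_state state) {
    if (!in_vma || !access_allowed) {
        return FAULT_INVALID;  /* illegal address or forbidden access */
    }
    switch (state) {
    case PAGE_SWAPPED_OUT:
        return FAULT_MAJOR;    /* needs disk I/O */
    case PAGE_NEVER_TOUCHED:   /* demand paging: zero-fill a new frame */
    case PAGE_RESIDENT:        /* just install the missing mapping */
    case PAGE_SHARED_READONLY: /* COW: copy the frame in memory */
        return FAULT_MINOR;    /* no disk I/O required */
    }
    assert(!"unknown page state");
    return FAULT_INVALID;
}
```

The point of the split is that only the swapped-out case touches the disk; demand-zero and copy-on-write faults are resolved entirely in memory and therefore count as minor.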
Performance impact varies: hard faults involve costly disk I/O and can dramatically degrade throughput, while soft faults only require updating page‑table entries and are relatively cheap. Excessive hard faults, such as those caused by insufficient RAM in a database workload, can become a bottleneck.
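An illustrative C program demonstrates demand paging: malloc usually hands out virtual address space without physical pages behind it, so the first write to a page of the buffer that has not been touched before triggers a page fault, prompting the kernel to allocate a physical page and establish the mapping. Whether this particular write faults depends on the allocator, since a small request may be served from heap memory that is already mapped.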
    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        // malloc typically reserves virtual address space only
        int *ptr = (int *)malloc(1024 * sizeof(int));
        if (ptr == NULL) {
            perror("malloc failed");
            return 1;
        }
        // The first write may trigger a page fault if the page is not yet mapped
        ptr[0] = 100;
        printf("Value at ptr[0]: %d\n", ptr[0]);
        free(ptr);
        return 0;
    }
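One way to observe such faults on a Linux host is to compare the minor-fault counter reported by getrusage before and after touching freshly mapped memory. The sketch below is an illustration under that assumption, not part of uCore; it uses an anonymous mmap instead of malloc so that the pages are known to be untouched, and the helper names minor_faults_so_far and faults_from_first_touch are hypothetical.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor (soft) faults charged to this process so far, or -1 on error. */
long minor_faults_so_far(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) {
        return -1;
    }
    return ru.ru_minflt;
}

/* Map npages of anonymous memory, write one byte to each page, and
 * return the number of minor faults counted during the writes
 * (-1 on failure). */
long faults_from_first_touch(size_t npages) {
    assert(npages > 0);
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        return -1;
    }
    size_t len = npages * (size_t)page;
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) {
        return -1;
    }
    volatile unsigned char *buf = mem;  /* volatile keeps the writes */
    long before = minor_faults_so_far();
    for (size_t i = 0; i < npages; i++) {
        buf[i * (size_t)page] = 1;      /* first write to each page */
    }
    long after = minor_faults_so_far();
    munmap(mem, len);
    if (before < 0 || after < 0) {
        return -1;
    }
    return after - before;
}
```

Calling faults_from_first_touch(64) and printing the result gives a rough count of the faults caused by first-touch allocation. The exact number depends on the kernel and its configuration, so it should be read as an indication rather than a fixed value; the ru_majflt field of the same structure reports major faults.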
In summary, page‑fault handling is the cornerstone of on‑demand paging, swap management, and memory protection in modern operating systems, and understanding its mechanisms is essential for kernel developers and performance engineers.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.