Understanding Linux Memory Mapping (mmap): API, Implementation, and Use Cases
This article explains Linux memory mapping (mmap): its purpose, API parameters, mapping types, internal kernel implementation, page‑fault handling, and copy‑on‑write semantics. It also covers practical use cases and includes a complete Objective‑C example that maps a file and modifies it through the mapping.
Overview
Memory mapping is an OS technique that maps a file or device directly into a process's address space, so the process can read and write the data as if it were ordinary memory. This removes the need for explicit read/write calls. With a shared mapping, stores reach the underlying file when the kernel writes the dirty pages back or when the process calls msync.
Typical scenarios include handling large files, inter‑process communication, and improving I/O efficiency in network programming.
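To make the idea of "writing the file as if it were memory" concrete, here is a minimal sketch in C. The helper name write_via_mapping and the scratch path under /tmp are assumptions made for illustration; only the POSIX calls themselves (mkstemp, ftruncate, mmap, msync, munmap, pread) are standard.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: writes `msg` into a scratch file through a MAP_SHARED
 * mapping, then reads the file back with pread() into `out`.
 * Returns 0 on success, -1 on any error. */
int write_via_mapping(const char *msg, char *out, size_t out_size) {
    char path[] = "/tmp/mmap_demo_XXXXXX";    /* assumed scratch location */
    size_t len = strlen(msg);
    if (len == 0 || len >= out_size) return -1;

    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                             /* file vanishes once fd is closed */

    int rc = -1;
    if (ftruncate(fd, (off_t)len) == 0) {     /* the file must cover the mapping */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p != MAP_FAILED) {
            memcpy(p, msg, len);              /* a plain store, no write() call */
            msync(p, len, MS_SYNC);           /* push dirty pages to the file */
            munmap(p, len);
            if (pread(fd, out, len, 0) == (ssize_t)len) {
                out[len] = '\0';
                rc = 0;
            }
        }
    }
    close(fd);
    return rc;
}
```

Calling write_via_mapping("hello", buf, sizeof buf) should leave "hello" in buf: the bytes stored through the mapping are the bytes a later pread returns from the file.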
1. mmap API
void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);
The function creates a new mapping and returns its starting virtual address, or MAP_FAILED on error. If addr is non‑NULL, the kernel treats it as a hint and places the mapping at or near that address; the address is mandatory only when MAP_FIXED is set. If addr is NULL, the kernel chooses a free region.
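length specifies the size of the region. prot defines read/write/execute permissions. flags selects the sharing mode (MAP_SHARED or MAP_PRIVATE) and other attributes. A mapping is file‑backed when fd is a valid open descriptor (fd >= 0) and anonymous when flags includes MAP_ANONYMOUS, in which case fd is conventionally -1. offset is the starting position within the file and must be a multiple of the page size.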
Combining fd and flags yields four mapping types: shared file, private file, shared anonymous, and private anonymous.
2. Implementation Details
The mmap workflow consists of three main steps:
Obtain an unmapped virtual area with get_unmapped_area.
Set appropriate vm_flags based on file‑backed vs. anonymous and shared vs. private.
Call mmap_region to allocate a vm_area_struct (VMA) and link it into the process's VMA index (a red‑black tree in older kernels; kernels from 6.1 onward use a maple tree).
The kernel does not allocate physical pages at this point; it only records the process's demand for memory. Actual pages are provided lazily on a page‑fault.
3. Page‑Fault Handling
When a process touches a virtual address that has no valid page‑table entry, the CPU raises a page fault and the kernel enters the architecture's fault handler (do_page_fault on x86). It locates the VMA covering the address, checks access permissions, and then calls handle_mm_fault, which eventually invokes handle_pte_fault.
handle_pte_fault distinguishes several cases:
If the PTE is not present and is pte_none, the page has never been populated: anonymous mappings go to do_anonymous_page, file‑backed mappings to do_linear_fault (do_fault in newer kernels).
If the PTE is not present but encodes a swap entry, it calls do_swap_page to swap the page back in.
If the PTE is present and the fault is a write to a write‑protected page, such as a COW page, it calls do_wp_page to perform copy‑on‑write.
For file‑backed mappings set up through generic_file_mmap, the VMA's vm_ops points to generic_file_vm_ops, whose fault method is filemap_fault. This routine finds the page in the page cache, reading it from the file first if necessary.
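The cases that handle_pte_fault distinguishes can be summarised as a small decision function. What follows is a toy user-space model, not kernel code: struct toy_pte, enum fault_action, and classify_fault are hypothetical names, and the real handler deals with further cases (migration entries, NUMA hints, and so on) that are left out here.

```c
#include <stdbool.h>

/* Toy model of the dispatch described above; NOT kernel code.
 * Every name in this block is hypothetical. */
enum fault_action {
    FAULT_ANON_PAGE,   /* do_anonymous_page: supply a zero-filled page */
    FAULT_FILE_PAGE,   /* file-backed: go through the VMA's fault method */
    FAULT_SWAP_IN,     /* do_swap_page: bring the page back from swap */
    FAULT_COW,         /* do_wp_page: copy the page before writing */
    FAULT_NOTHING      /* the PTE already permits the access */
};

struct toy_pte {
    bool present;      /* page is resident and mapped */
    bool none;         /* entry is completely empty (never populated) */
    bool writable;     /* hardware write permission */
};

enum fault_action classify_fault(struct toy_pte pte, bool vma_has_file, bool is_write) {
    if (!pte.present) {
        if (pte.none)
            return vma_has_file ? FAULT_FILE_PAGE : FAULT_ANON_PAGE;
        return FAULT_SWAP_IN;      /* non-empty but not present: swap entry */
    }
    if (is_write && !pte.writable)
        return FAULT_COW;          /* write to a write-protected page */
    return FAULT_NOTHING;
}
```

The order of the checks mirrors the text: presence first, then empty versus swap entry, and only for present entries the write-protection test.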
4. Copy‑On‑Write (COW)
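During fork, the kernel does not copy private writable pages. Parent and child map the same physical pages, with the page‑table entries marked read‑only in both processes. When either process writes to such a page, a page fault occurs and do_wp_page gives the writer its own private copy, so the two processes diverge only on the pages they actually modify.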
5. Example Code (Objective‑C)
// ViewController.m
// TestCode
// Created by zhangdasen on 2020/5/24.
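#import "ViewController.h"
#import <sys/mman.h>   // mmap, munmap
#import <sys/stat.h>   // fstat, struct stat
#import <fcntl.h>      // open, O_RDWR
#import <unistd.h>     // ftruncate, close

// Declared up front because viewDidLoad calls ProcessFile before its definition.
int MapFile(const char *inPathName, void **outDataPtr, size_t *outDataLength, size_t appendSize);
void ProcessFile(const char *inPathName);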
@interface ViewController ()
@end
@implementation ViewController
- (void)viewDidLoad {
    [super viewDidLoad];
    NSString *path = [NSHomeDirectory() stringByAppendingPathComponent:@"test.data"];
    NSLog(@"path: %@", path);
    NSString *str = @"test str2";
    [str writeToFile:path atomically:YES encoding:NSUTF8StringEncoding error:nil];
    ProcessFile(path.UTF8String);
    NSString *result = [NSString stringWithContentsOfFile:path encoding:NSUTF8StringEncoding error:nil];
    NSLog(@"result:%@", result);
}
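// Opens the file, grows it by appendSize bytes, and maps the whole file read/write.
// On success *outDataPtr is the mapping base and *outDataLength the ORIGINAL file size.
int MapFile(const char *inPathName, void **outDataPtr, size_t *outDataLength, size_t appendSize) {
    int outError = 0;
    struct stat statInfo;

    *outDataPtr = NULL;
    *outDataLength = 0;

    int fileDescriptor = open(inPathName, O_RDWR);
    if (fileDescriptor < 0) {
        return errno;
    }

    if (fstat(fileDescriptor, &statInfo) != 0) {
        outError = errno;
    } else if (ftruncate(fileDescriptor, statInfo.st_size + (off_t)appendSize) != 0) {
        // The file must be grown first: touching mapped pages past end-of-file raises SIGBUS.
        outError = errno;
    } else {
        size_t mapLength = (size_t)statInfo.st_size + appendSize;
        void *mapping = mmap(NULL, mapLength,
                             PROT_READ | PROT_WRITE,
                             MAP_FILE | MAP_SHARED,
                             fileDescriptor, 0);
        if (mapping == MAP_FAILED) {
            outError = errno;   // *outDataPtr stays NULL on failure
        } else {
            *outDataPtr = mapping;
            *outDataLength = (size_t)statInfo.st_size;
        }
    }

    // The mapping remains valid after the descriptor is closed.
    close(fileDescriptor);
    return outError;
}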
void ProcessFile(const char *inPathName) {
    size_t dataLength;
    void *dataPtr;
    const char *appendStr = " append_key2";
    size_t appendSize = strlen(appendStr);

    if (MapFile(inPathName, &dataPtr, &dataLength, appendSize) == 0) {
        // Write at the old end of the file, but keep dataPtr as the mapping base:
        // munmap must receive the page-aligned address that mmap returned.
        memcpy((char *)dataPtr + dataLength, appendStr, appendSize);
        munmap(dataPtr, dataLength + appendSize);
    }
}
@end
The example demonstrates mapping a file, appending data via the mapped region, and then unmapping.
6. Kernel Data Structures Involved
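Key structures include file, dentry, inode, and address_space. The inode's i_mapping points to an address_space that indexes the file's cached pages (a radix tree in older kernels, an XArray in newer ones); together these form the page cache. Every shared mapping of a file resolves to the same page‑cache pages, which is why such mappings see the same physical memory.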
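The consequence of all shared mappings going through one page cache can be seen without looking at kernel structures: two independent mappings of the same file observe each other's stores immediately. The sketch below uses a hypothetical helper name and an assumed scratch path under /tmp.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: maps the first page of one scratch file twice with
 * MAP_SHARED, writes `value` through the first mapping, and returns what the
 * second mapping reads (-1 on error). */
int read_through_second_mapping(unsigned char value) {
    char path[] = "/tmp/pagecache_demo_XXXXXX";   /* assumed scratch location */
    size_t len = (size_t)sysconf(_SC_PAGESIZE);

    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);

    int result = -1;
    if (ftruncate(fd, (off_t)len) == 0) {
        unsigned char *a = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        unsigned char *b = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (a != MAP_FAILED && b != MAP_FAILED) {
            a[0] = value;        /* store through the first mapping */
            result = b[0];       /* load through the second mapping */
        }
        if (a != MAP_FAILED) munmap(a, len);
        if (b != MAP_FAILED) munmap(b, len);
    }
    close(fd);
    return result;
}
```

The two mappings have different virtual addresses, yet the load returns the stored value because both addresses resolve to the same cached page of the file.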
Swap management is represented by struct swap_info_struct, which tracks each swap device, its slot count, and its usage map. A swap entry (swp_entry_t) encodes the swap device index and the offset within that device, allowing the kernel to locate a swapped‑out page.
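The idea of packing a device index and an offset into one word can be illustrated with a toy encoding. The layout below is an assumption chosen for clarity (5 bits of device index at the top of a 64-bit word) and is not the kernel's actual bit layout, which is architecture-dependent; all names carry a toy_ prefix to mark them as hypothetical.

```c
#include <stdint.h>

/* Toy encoding, NOT the kernel's real layout: the top TOY_TYPE_BITS bits hold
 * the swap device index, the remaining bits hold the page offset. */
#define TOY_TYPE_BITS   5
#define TOY_TYPE_SHIFT  (64 - TOY_TYPE_BITS)
#define TOY_OFFSET_MASK ((UINT64_C(1) << TOY_TYPE_SHIFT) - 1)

typedef struct { uint64_t val; } toy_swp_entry_t;

/* Packs a device index (must be < 32 here) and an offset into one entry. */
toy_swp_entry_t toy_swp_entry(unsigned type, uint64_t offset) {
    toy_swp_entry_t e = {
        ((uint64_t)type << TOY_TYPE_SHIFT) | (offset & TOY_OFFSET_MASK)
    };
    return e;
}

/* Extracts the device index. */
unsigned toy_swp_type(toy_swp_entry_t e) {
    return (unsigned)(e.val >> TOY_TYPE_SHIFT);
}

/* Extracts the offset within the device. */
uint64_t toy_swp_offset(toy_swp_entry_t e) {
    return e.val & TOY_OFFSET_MASK;
}
```

An entry built with toy_swp_entry(3, 12345) decodes back to device 3 and offset 12345, which is all a non-present PTE needs to carry for the swap-in path to find the page.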
Conclusion
Linux’s mmap mechanism provides a powerful, lazy‑loaded way to access file or anonymous memory, enabling efficient I/O, inter‑process communication, and memory‑conserving techniques such as copy‑on‑write. Understanding the API, kernel pathways, and page‑fault handling is essential for systems programmers and performance‑critical application developers.
Deepin Linux
Research areas: Windows & Linux platforms, C/C++ backend development, embedded systems and Linux kernel, etc.