Understanding Linux Memory Mapping (mmap): Concepts, APIs, and Source‑Code Analysis

This article explains the fundamentals of memory‑mapped files in Linux, covering the mmap/munmap interfaces, mapping types, protection flags, error handling, virtual‑memory structures, kernel implementation details, driver examples, and practical use‑cases such as shared memory and high‑performance I/O.

Deepin Linux

Nov 27, 2023

Memory mapping is an operating‑system technique that maps a file’s contents into a process’s virtual address space, allowing the process to read and write the file directly via pointers without invoking traditional read/write system calls.

Two mapping types exist: file‑backed mapping, which maps a specific region of a file, and anonymous mapping, which provides zero‑filled memory with no underlying file. Mappings are created in page‑size units (typically 4 KB): the file offset must be a multiple of the page size, the length is rounded up to whole pages, and a start address passed with MAP_FIXED must be page‑aligned.
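The alignment rules can be made concrete with a few helpers. This is a minimal sketch; the function names page_size, page_floor, and page_ceil are illustrative and not part of any system API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Page size reported by the running system (commonly 4096 bytes). */
static long page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    return ps;
}

/* Round a file offset down to the start of the page that contains it. */
static off_t page_floor(off_t offset)
{
    off_t ps = (off_t)page_size();
    return offset - (offset % ps);
}

/* Round a length up to a whole number of pages. */
static size_t page_ceil(size_t length)
{
    size_t ps = (size_t)page_size();
    return (length + ps - 1) / ps * ps;
}
```

To map bytes starting at an arbitrary file position pos, a caller would pass page_floor(pos) as the offset and then skip the first pos - page_floor(pos) bytes of the returned region.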

The primary system calls are mmap() and munmap():

#include <sys/mman.h>
void *mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset);
int munmap(void *addr, size_t length);

Parameters include the start address (a hint unless MAP_FIXED is given), the length, protection flags (PROT_READ, PROT_WRITE, PROT_EXEC, PROT_NONE), mapping flags (MAP_SHARED, MAP_PRIVATE, MAP_ANONYMOUS, MAP_FIXED, MAP_POPULATE, etc.), and the file descriptor and offset of the region to map. A successful mmap() returns a pointer to the mapped region; on failure it returns MAP_FAILED, that is (void *) -1, and sets errno. munmap() returns 0 on success and -1 on error.
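As an illustration of the call sequence, the following sketch maps a file read‑only, scans it through a pointer, and unmaps it. The helper name count_newlines is hypothetical, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count '\n' bytes in a file by mapping it read-only.
 * Returns the count, or -1 if any step fails. */
static long count_newlines(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    long n = 0;
    for (size_t i = 0; i < len; i++)   /* plain pointer access, no read() */
        if (p[i] == '\n')
            n++;

    if (munmap(p, len) < 0)
        return -1;
    return n;
}
```

Note the comparison against MAP_FAILED rather than NULL: a failed mmap() does not return a null pointer.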

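When a mapping is read‑only and a write occurs, the kernel raises SIGSEGV. Accessing a mapped page that lies entirely beyond the end of the file triggers SIGBUS, whereas an access past the end of the file but still inside the file's last page succeeds: the remaining bytes of that page read as zero, and writes to them are never carried through to the file. Both signals can be caught with custom handlers.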
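A handler for these signals can be sketched as follows. The example provokes SIGSEGV by writing to a read‑only anonymous page and recovers with sigsetjmp()/siglongjmp(). The names probe_readonly_write and on_fault are illustrative, and jumping out of a fault handler is shown for demonstration only, not as a recommended recovery strategy.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;
static volatile sig_atomic_t last_signal;

/* Record which signal arrived, then unwind to the sigsetjmp() point. */
static void on_fault(int sig)
{
    last_signal = sig;
    siglongjmp(fault_env, 1);
}

/* Write one byte to a PROT_READ page.
 * Returns the signal number caught, 0 if the write unexpectedly
 * succeeded, or -1 if the mapping could not be created. */
static int probe_readonly_write(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);

    volatile char *page = mmap(NULL, (size_t)ps, PROT_READ,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;

    struct sigaction sa, old_segv, old_bus;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);

    int caught;
    if (sigsetjmp(fault_env, 1) == 0) {   /* 1: also restore the signal mask */
        page[0] = 'x';                    /* faults: the page is read-only */
        caught = 0;
    } else {
        caught = last_signal;             /* reached via siglongjmp() */
    }

    sigaction(SIGSEGV, &old_segv, NULL);  /* put the previous handlers back */
    sigaction(SIGBUS, &old_bus, NULL);
    munmap((void *)page, (size_t)ps);
    return caught;
}
```

On Linux the call is expected to return SIGSEGV, since the write violates the page's protection rather than touching a page with no backing store.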

Linux manages virtual memory per process using two key structures: struct mm_struct (the memory descriptor) and struct vm_area_struct (the virtual memory area). The mm_struct holds overall address‑space information, while each vm_area_struct describes a contiguous region with its start/end addresses, protection flags, and optional file backing.
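User space cannot read these structures directly, but the kernel exports one line per virtual memory area in /proc/self/maps. The sketch below looks up the region that contains a given address; the helper name find_region is hypothetical, and only the first three fields of each line (start, end, permissions) are parsed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Find the /proc/self/maps entry whose range contains addr.
 * Returns 1 and fills start, end and perms (for example "rw-p") on success,
 * 0 if no region contains addr, -1 if the file cannot be opened. */
static int find_region(const void *addr, unsigned long *start,
                       unsigned long *end, char perms[5])
{
    assert(start != NULL && end != NULL && perms != NULL);

    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    unsigned long a = (unsigned long)(uintptr_t)addr;
    char line[512];
    int found = 0;

    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long lo, hi;
        char p[5];

        /* Each line begins "start-end perms ...", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &lo, &hi, p) != 3)
            continue;
        if (a >= lo && a < hi) {          /* the end address is exclusive */
            *start = lo;
            *end = hi;
            memcpy(perms, p, sizeof p);
            found = 1;
            break;
        }
    }
    fclose(f);
    return found;
}
```

Passing the address of a local variable should report the stack region with read and write permission; passing a pointer returned by mmap() shows the region created for that mapping.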

The core implementation resides in do_mmap(), which validates the parameters, selects an unmapped address range, and computes the flags for the new region. It then calls mmap_region(), which creates the actual vm_area_struct, handles file references, shared‑write checks, and accounting, and links the VMA into the process's address space. For file‑backed mappings, the file system's mmap method (e.g., ext4_file_mmap()) installs the VMA's operations table, whose callbacks such as .fault, .map_pages, and .page_mkwrite serve later page faults.

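Anonymous mappings use MAP_ANONYMOUS, have no backing file, and are zero‑filled on first access; mapping /dev/zero is the older way to obtain the same effect. Shared anonymous mappings stay shared between parent and child across fork(), while private mappings employ copy‑on‑write. Pages of either kind can be swapped out.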
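The difference between shared and private anonymous mappings shows up across fork(). In this sketch a child process writes 42 into a mapped int, and the function reports what the parent sees afterwards. The function name parent_view_after_child_write is illustrative.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS under strict -std= modes */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* sharing must be MAP_SHARED or MAP_PRIVATE.
 * Returns the value the parent reads after the child stored 42,
 * or -1 if mmap(), fork() or waitpid() fails. */
static int parent_view_after_child_write(int sharing)
{
    assert(sharing == MAP_SHARED || sharing == MAP_PRIVATE);

    int *cell = mmap(NULL, sizeof *cell, PROT_READ | PROT_WRITE,
                     sharing | MAP_ANONYMOUS, -1, 0);
    if (cell == MAP_FAILED)
        return -1;
    *cell = 0;                      /* anonymous memory starts zeroed anyway */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(cell, sizeof *cell);
        return -1;
    }
    if (pid == 0) {                 /* child: modify the mapping and exit */
        *cell = 42;
        _exit(0);
    }

    int seen = -1;
    if (waitpid(pid, NULL, 0) == pid)
        seen = *cell;               /* parent: read after the child is done */

    munmap(cell, sizeof *cell);
    return seen;
}
```

With MAP_SHARED the parent reads 42, because both processes reference the same pages. With MAP_PRIVATE it reads 0, because the child's write went to its own copy‑on‑write page.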

Practical examples include a character‑device driver that allocates a kernel buffer and exposes it to user space via remap_pfn_range(), and user‑space programs that mmap a regular file, demonstrate SIGSEGV/SIGBUS handling, and show that writes past the end of the file but within its last mapped page affect only the in‑memory page while the file size remains unchanged.
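The last of these behaviours is easy to reproduce. The sketch below creates a 10‑byte file, maps one full page of it with MAP_SHARED, stores a byte at offset 100, and then reports the file size. The function name and the choice of offset are illustrative assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns the size of the file after writing past its end through a
 * shared mapping, or -1 if any step fails. */
static long size_after_write_past_eof(const char *path)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 100);               /* offset 100 must fall in the first page */

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "0123456789", 10) != 10) {    /* the file is 10 bytes long */
        close(fd);
        return -1;
    }

    char *p = mmap(NULL, (size_t)ps, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[100] = 'X';                   /* past EOF, inside the last page: no signal */
    msync(p, (size_t)ps, MS_SYNC);  /* flush; the byte at 100 is not written out */
    munmap(p, (size_t)ps);

    struct stat st;
    long size = (fstat(fd, &st) == 0) ? (long)st.st_size : -1;
    close(fd);
    return size;
}
```

The reported size stays at 10. Growing a file through a mapping requires extending it first, for example with ftruncate(), and then mapping the larger length.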

Typical application scenarios are inter‑process communication through shared memory, high‑throughput file I/O by eliminating extra copies, and efficient handling of large data sets such as databases or media processing.
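For the file I/O case, a copy routine built on two mappings might look like the sketch below. The name copy_file_mmap is hypothetical, the whole file is mapped at once for brevity, and a production version would likely copy large files in windows. The data moves with a single memcpy() between the two mappings, without passing through a separate user‑space buffer as a read()/write() loop would.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy src to dst through two mappings. Returns 0 on success, -1 on error. */
static int copy_file_mmap(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    int rc = -1;
    struct stat st;
    size_t len;
    void *from, *to;

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }

    if (fstat(in, &st) < 0)
        goto done;
    len = (size_t)st.st_size;
    if (len == 0) {                 /* nothing to map; dst is already empty */
        rc = 0;
        goto done;
    }
    if (ftruncate(out, st.st_size) < 0)   /* dst must be long enough to map */
        goto done;

    from = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in, 0);
    if (from == MAP_FAILED)
        goto done;
    to = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, out, 0);
    if (to == MAP_FAILED) {
        munmap(from, len);
        goto done;
    }

    memcpy(to, from, len);          /* MAP_SHARED: the stores reach the file */
    rc = 0;

    munmap(to, len);
    munmap(from, len);
done:
    close(in);
    close(out);
    return rc;
}
```

The destination uses MAP_SHARED because writes to a MAP_PRIVATE mapping are never carried through to the underlying file.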
