Understanding Linux I/O, Zero‑Copy, DMA, Direct I/O and mmap
This article explains the principles of Linux I/O, the need for zero‑copy, the role of page and buffer caches, various data‑transfer mechanisms such as DMA, Direct I/O and mmap, and compares their performance and usage scenarios.
1. Overview
Zero‑copy refers to techniques that reduce the number of times the CPU copies data during I/O; to understand where those copies come from, one must first grasp the I/O data flow and the layered structure of user space, kernel space and physical devices.
2. Division of Linux I/O
Linux I/O consists of two main parts: disk I/O and network I/O. The I/O path is further divided vertically into three layers: user space, kernel space and the physical device.
3. Disk I/O
3.1 Buffered I/O
When a process reads, data is copied from the disk to kernel space and then to user space – this is standard (buffered) I/O. The kernel caches frequently accessed data in the page cache to avoid repeated disk accesses.
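The read path above can be sketched in Python. This is a minimal illustration, not code from the original article: the function names `buffered_read` and `buffered_read_demo` and the chunk size are assumptions chosen for the example.

```python
import os
import tempfile


def buffered_read(path, chunk_size=64 * 1024):
    """Read a whole file with plain read(2) calls (standard buffered I/O)."""
    chunks = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            # Each os.read() is one read(2) system call. The kernel serves it
            # from the page cache (filling the cache from disk on a miss) and
            # then copies the bytes into memory owned by this process.
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            chunks.append(chunk)
    finally:
        os.close(fd)
    return b"".join(chunks)


def buffered_read_demo():
    """Write a temporary file, read it back, and return the byte count."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 200_000)
        path = tmp.name
    try:
        return len(buffered_read(path))
    finally:
        os.unlink(path)
```

A second call to `buffered_read` on the same file would normally be served from the page cache without touching the disk, although the copy from kernel space to user space still happens on every call.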
3.2 Page cache vs. Buffer cache
Historically Linux had two caches: the page cache (file‑level, in page‑sized units, typically 4 KB) built on the filesystem, and the buffer cache (block‑level, typically 1 KB) built on the block device. Since the 2.4 kernel the two have been unified, so file data is cached only once, in the page cache.
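On a running system, the two cache figures are still reported separately in /proc/meminfo as the `Cached` and `Buffers` fields. The sketch below parses them; the helper names `parse_meminfo` and `cache_usage` are made up for this example, and the interpretation of the two fields in the comments is a simplification.

```python
def parse_meminfo(text):
    """Return {field: value} for lines such as 'Cached:   123456 kB'."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def cache_usage(path="/proc/meminfo"):
    """Report page-cache and buffer figures in kB (Linux only)."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return {
        "Cached": info.get("Cached"),    # file data held in the page cache
        "Buffers": info.get("Buffers"),  # cached raw block-device data
    }
```

Calling `cache_usage()` before and after reading a large file usually shows `Cached` growing by roughly the file size, provided memory is not under pressure.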
3.3 Consistency and safety
Data can be lost if a process crashes while data resides in application or C library caches, if the kernel crashes before a sync operation, or even after sync if the disk’s write cache has not been flushed.
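Each of these failure windows corresponds to one layer that has to be flushed. A minimal sketch, assuming a hypothetical helper named `durable_write`:

```python
import os
import tempfile


def durable_write(path, data):
    """Write data and push it through each cache layer in turn."""
    with open(path, "wb") as f:
        f.write(data)         # data may still sit in Python's user-space buffer
        f.flush()             # write(2): data is now in the kernel page cache
        os.fsync(f.fileno())  # fsync(2): ask the kernel to write it to the device


def durable_write_demo():
    """Write a temporary file durably and return what is read back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        durable_write(path, b"important record\n")
        with open(path, "rb") as f:
            return f.read()
    finally:
        os.unlink(path)
```

Without `flush()`, a process crash can lose the data; without `fsync()`, a kernel crash or power failure can. Whether `fsync()` also forces the drive's own write cache to stable media depends on the filesystem, its mount options and the device.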
3.4 Disk cache
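Disks contain their own RAM cache, split into read and write caches. In write‑back mode the drive acknowledges a write as soon as the data reaches its cache; write‑through mode effectively disables the write cache, so that data is written to the platter before the controller acknowledges success.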
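Recent Linux kernels expose the cache mode the block layer assumes for each device in /sys/block/<device>/queue/write_cache, which reads either "write back" or "write through". The sketch below collects these values; the function name `disk_write_cache_modes` is an assumption, and on older kernels the file may simply be absent, in which case the result is empty.

```python
import glob
import os


def disk_write_cache_modes(sys_block="/sys/block"):
    """Map each block device name to its reported write-cache mode."""
    modes = {}
    pattern = os.path.join(sys_block, "*", "queue", "write_cache")
    for path in glob.glob(pattern):
        # Path layout: <sys_block>/<device>/queue/write_cache
        device = path.split(os.sep)[-3]
        try:
            with open(path) as f:
                modes[device] = f.read().strip()
        except OSError:
            continue  # unreadable entry: skip it rather than fail
    return modes
```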
4. Data‑transfer mechanisms (DMA)
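There are four mechanisms for moving data between a device such as a disk and main memory:
Programmed I/O (PIO) – the CPU polls the I/O port and moves the data itself, so it wastes cycles busy‑waiting and is used inefficiently.
Interrupt‑driven I/O – the device interrupts the CPU for each unit of data transferred (a byte or a word); the CPU no longer polls but still copies every unit, consuming CPU cycles.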
DMA – a DMA controller moves whole blocks of data with only start/end interrupts, greatly reducing CPU work.
Channel I/O – an advanced form of DMA that can control multiple devices in parallel.
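The difference in CPU involvement between the first three mechanisms can be made concrete with a toy count. This is a deliberately simplified model, not a measurement: the word size, block size and number of polls per word are assumed parameters, and DMA is charged one completion interrupt per block.

```python
def cpu_involvement(total_bytes, word_size=4, block_size=4096, polls_per_word=10):
    """Toy model of the CPU work needed to transfer total_bytes."""
    words = -(-total_bytes // word_size)    # ceiling division
    blocks = -(-total_bytes // block_size)
    return {
        # PIO: the CPU polls a status register and copies every word itself.
        "pio": {"polls": words * polls_per_word, "interrupts": 0,
                "words_copied_by_cpu": words},
        # Interrupt-driven: no polling, but one interrupt and one copy per word.
        "interrupt": {"polls": 0, "interrupts": words,
                      "words_copied_by_cpu": words},
        # DMA: the controller moves the data; the CPU sees one interrupt per block.
        "dma": {"polls": 0, "interrupts": blocks,
                "words_copied_by_cpu": 0},
    }
```

Under these assumptions, transferring 1 MiB costs 262144 interrupts with interrupt‑driven I/O but only 256 with DMA, and the CPU copies no data in the DMA case.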
Figures illustrating the data‑flow and timing of these mechanisms are included in the original article.
5. Direct I/O
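Direct I/O (the O_DIRECT flag to open(), available since Linux 2.4.10) bypasses the page cache; data is transferred directly between user buffers and the disk. Since kernel 2.6, the user buffer address, the file offset and the transfer size must be aligned to the device's logical block size (typically 512 bytes); earlier kernels required alignment to the filesystem's block size.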
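A minimal O_DIRECT read might look like the sketch below. The 4096‑byte `BLOCK` constant is an assumption that covers the common 512‑byte and 4096‑byte logical block sizes, the function names are hypothetical, and the code returns `None` where the platform or filesystem rejects O_DIRECT (some filesystems, such as tmpfs on older kernels, do).

```python
import mmap
import os
import tempfile

BLOCK = 4096  # assumed alignment unit, a multiple of common logical block sizes


def direct_read(path, nbytes=BLOCK):
    """Read the first nbytes of path with O_DIRECT, or return None if unsupported.

    nbytes is expected to be a multiple of BLOCK.
    """
    if not hasattr(os, "O_DIRECT"):
        return None  # platform without O_DIRECT
    try:
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    except OSError:
        return None  # filesystem refused the flag
    try:
        # An anonymous mmap is page-aligned, which satisfies the buffer
        # alignment requirement; a plain bytearray gives no such guarantee.
        buf = mmap.mmap(-1, nbytes)
        try:
            n = os.readv(fd, [buf])  # data goes from the device into buf
            return bytes(buf[:n])
        finally:
            buf.close()
    except OSError:
        return None  # e.g. EINVAL from an alignment or filesystem restriction
    finally:
        os.close(fd)


def direct_read_demo():
    """Create a one-block file, force it to disk, and read it with O_DIRECT."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"A" * BLOCK)
        tmp.flush()
        os.fsync(tmp.fileno())
        path = tmp.name
    try:
        return direct_read(path)
    finally:
        os.unlink(path)
```

The aligned buffer is the important design choice here: passing an unaligned buffer or an odd transfer size is the most common reason an O_DIRECT call fails with EINVAL.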
5.1 Advantages
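By skipping the copy between the page cache and the user buffer, Direct I/O reduces CPU overhead and memory bandwidth usage, which can significantly improve performance for certain workloads, typically applications such as databases that maintain their own cache.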
5.2 Disadvantages
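Improper use can degrade performance: without the page cache there is no kernel read‑ahead or write‑back, so every request goes to the disk and seek time increases, and misaligned requests are rejected; the overhead of managing aligned buffers can also be high.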
6. mmap
Memory‑mapped files map a file (or other object) into a process’s virtual address space, allowing the program to access file contents as ordinary memory without explicit read / write calls.
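The file offset of a mapping must be a multiple of the page size, and because the kernel maps whole pages, a length that is not a multiple of the page size is rounded up to the next page boundary; physical pages are allocated lazily, by the kernel's page‑fault handler on first access.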
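A short sketch of file access through a mapping, using Python's standard `mmap` module. The function name `mmap_roundtrip` and the file contents are invented for the example.

```python
import mmap
import os
import tempfile


def mmap_roundtrip():
    """Map a small file, read and modify it through memory, return both views."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"hello mmap world")
        path = tmp.name
    try:
        with open(path, "r+b") as f:
            # Length 0 maps the whole file (the file must not be empty).
            with mmap.mmap(f.fileno(), 0) as m:
                first = m[:5]      # plain memory access, no read() call
                m[0:5] = b"HELLO"  # in-place store; the length must match
                m.flush()          # msync(2): write dirty pages back
        with open(path, "rb") as f:
            return first, f.read()
    finally:
        os.unlink(path)
```

When mapping only part of a file, `mmap.mmap` takes an `offset` argument that must be a multiple of `mmap.ALLOCATIONGRANULARITY`, which equals the page size on Unix systems.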
6.1 Types of mapping
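Changes made through a shared mapping are visible to every process that maps the same file and are written back to the file, while private mappings use copy‑on‑write (COW) so that modifications stay local to the process and never reach the file.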
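The two mapping types can be compared directly. This is a sketch for Unix systems only, since the `flags` argument of `mmap.mmap` is not available on Windows; the function name `shared_vs_private` is an assumption.

```python
import mmap
import os
import tempfile


def shared_vs_private():
    """Write through a private and a shared mapping and inspect the file."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"original")
        path = tmp.name
    try:
        with open(path, "r+b") as f:
            # Private mapping: the store triggers copy-on-write, so only this
            # mapping sees the change.
            with mmap.mmap(f.fileno(), 0, flags=mmap.MAP_PRIVATE) as private:
                private[0:8] = b"PRIVATE!"
                seen_in_private = private[:8]
            with open(path, "rb") as check:
                after_private = check.read()

            # Shared mapping: the store modifies the page-cache pages that
            # back the file, so the file itself changes.
            with mmap.mmap(f.fileno(), 0, flags=mmap.MAP_SHARED) as shared:
                shared[0:8] = b"SHARED!!"
                shared.flush()
            with open(path, "rb") as check:
                after_shared = check.read()
        return seen_in_private, after_private, after_shared
    finally:
        os.unlink(path)
```

The file still reads `b"original"` after the private write and `b"SHARED!!"` after the shared one.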
6.2 Performance
mmap reduces copies between kernel and user space but incurs page‑fault handling and page‑table updates; on modern hardware, the reduced copy cost may be outweighed by these overheads.
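Whether mmap wins is best settled by measurement on the target system. The sketch below times a checksum over the same file using read() and using a mapping; the helper names, the file size and the choice of CRC32 as the workload are all assumptions, and the timings depend heavily on hardware and cache state, so no expected numbers are given.

```python
import mmap
import os
import tempfile
import time
import zlib


def crc_with_read(path, chunk_size=1 << 20):
    """Checksum a file via read(2); each chunk is copied into user space."""
    crc = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            crc = zlib.crc32(chunk, crc)
    return crc


def crc_with_mmap(path):
    """Checksum a file via a read-only mapping; pages are faulted in on access."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return zlib.crc32(m)


def compare(size=8 << 20):
    """Return {method: (crc, seconds)} for a temporary file of the given size."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(size))
        path = tmp.name
    try:
        results = {}
        for name, func in (("read", crc_with_read), ("mmap", crc_with_mmap)):
            start = time.perf_counter()
            crc = func(path)
            results[name] = (crc, time.perf_counter() - start)
        return results
    finally:
        os.unlink(path)
```

Both methods must produce the same checksum; only the elapsed times differ, and which one is faster can change with file size and access pattern.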
7. Summary
The article uses a marriage analogy to compare buffered I/O (wife fetching money for husband), Direct I/O (wife fetching money and handing it directly), and mmap (wife allowing husband to take money directly from her wallet). It also contrasts CPU‑copy with DMA‑copy, illustrating how DMA lets the bank deliver money without the wife’s involvement.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.