Fundamentals · 9 min read

Understanding the Linux File I/O Stack: VFS, Filesystem, Block Layer, and SCSI

This article explains the Linux file I/O stack: it traces the route from user space to hardware, details the roles of the VFS, the filesystem, the block layer, and the SCSI driver, and then examines the page cache and the kernel code paths for both buffered and direct I/O.


The article describes the file I/O stack in three steps: define a clear I/O route, understand the purpose of each node along it, and walk the kernel code path.

The kernel I/O path from user space to hardware follows VFS → Filesystem → Block layer → SCSI layer. All file and network I/O in Linux ultimately passes through the VFS layer.

VFS layer: Provides a generic abstraction for files, exposing common structures such as file, inode, and dentry and defining unified APIs that specific filesystems implement.

Filesystem layer: Implements the actual storage strategy, mapping the abstract file concept to physical block devices and handling data placement using interfaces like address_space_operations.

Block layer: Abstracts hardware block devices and implements I/O scheduling (e.g., the CFQ, Deadline, and NOOP schedulers of older kernels) along with optimizations such as request merging via the elevator algorithm.

SCSI layer: Acts as the final translator to the disk hardware, converting block I/O requests into device-specific commands.

The article then discusses the page cache, showing the address_space_operations struct with callbacks for buffered write, read, and writeback, and provides a minimal Minix filesystem example implementing .write_begin, .write_end, .writepage, and .readpage.
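For reference, the ops table the article describes looks roughly like the sketch below, modeled on the Minix filesystem of a v3.x-era kernel. This is a non-runnable kernel fragment; the field names follow struct address_space_operations of that period and have changed in newer kernels.

```c
/* Sketch modeled on fs/minix/inode.c in older (v3.x-era) kernels;
 * not meant to compile outside the kernel tree. */
static const struct address_space_operations minix_aops = {
    .readpage    = minix_readpage,    /* fill a page cache page from disk  */
    .writepage   = minix_writepage,   /* write one dirty page back to disk */
    .write_begin = minix_write_begin, /* find and prepare the target page  */
    .write_end   = generic_write_end, /* commit the copy, mark page dirty  */
};
```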

A typical write system call follows this stack: SYSCALL_DEFINE3(write) → vfs_write → do_sync_write → generic_file_aio_write → generic_file_buffered_write → generic_perform_write, where generic_perform_write allocates a page, copies user data, and marks the page dirty.

The mapping from file offset to block address is performed by a function like minix_get_block , which creates buffer heads linking pages to physical blocks.

Dirty page writeback can be triggered by time, by the amount of dirty data, or by an explicit sync call, and is handled by kworker threads invoking the filesystem's .writepage or .writepages callbacks.

For Direct I/O, the path skips the page cache: SYSCALL_DEFINE3(write) → vfs_write → do_sync_write → generic_file_aio_write → generic_file_direct_write → direct_IO, requiring filesystem support.

Finally, the article summarizes the stack and key points in a bullet list, reinforcing the roles of each layer and the mechanisms for buffered and direct I/O.

Tags: kernel, Linux, page cache, VFS, Filesystem, I/O Stack
Written by

Sohu Tech Products

A knowledge-sharing platform for Sohu's technology products. As a leading Chinese internet brand with media, video, search, and gaming services and over 700 million users, Sohu continuously drives tech innovation and practice. We’ll share practical insights and tech news here.
