Understanding Linux Kernel Readahead: Concepts, Benefits, Drawbacks, and Code Analysis
This article explains the design background, performance benefits, potential drawbacks, synchronous and asynchronous mechanisms, key data structures, operational principles, illustrative examples, and critical code paths of Linux kernel file readahead, providing a comprehensive technical overview for developers and system engineers.
Design Background – File access is often sequential; after reading a range [A, B], the next likely range is [B+1, B+N]. Prefetching the next range into RAM reduces costly disk I/O and improves performance.
Benefits of Readahead
Avoids costly synchronous disk I/O by reading data into the page cache ahead of time, so subsequent reads are served from memory.
Improves storage stack and device processing efficiency by merging consecutive I/O requests.
Reduces load from hard and soft interrupts.
Prevents storage stack congestion that can delay read responses.
Minimizes mechanical head movement on HDDs by keeping accesses sequential.
Drawbacks of Readahead
For random reads, prefetched data may never be used, wasting bandwidth and memory.
Excessive prefetching can increase I/O load.
Large prefetch windows can raise memory pressure.
Synchronous vs Asynchronous Readahead
Synchronous readahead is triggered when the requested page is missing from the page cache: the kernel reads multiple pages (some needed immediately, some for future use) and returns right after submitting the BIO, without waiting for the pages to become up‑to‑date. Asynchronous readahead is triggered while the requested data is already cached: every page it fetches is solely for future use, since the triggering read does not need that data at the moment.
Key Data Structure
/* Track a single file's readahead state */
struct file_ra_state {
pgoff_t start; /* where readahead started */
unsigned int size; /* # of readahead pages */
unsigned int async_size;/* launch async readahead when only this many pages remain */
unsigned int ra_pages; /* maximum readahead window */
unsigned int mmap_miss; /* cache miss stat for mmap accesses */
loff_t prev_pos; /* last read() position */
};

The size field represents the current window size, while async_size indicates how many pages remain before triggering asynchronous readahead.
Operational Principle
When a read requests N pages, the kernel may prefetch M pages (M > N). A PageReadahead marker is set on one of the prefetched pages; encountering this marker signals that the readahead window is depleting and a new asynchronous prefetch should be launched.
If the access pattern is sequential, the window is expanded (typically 2× or 4×, capped by ra_pages) and additional pages are fetched. For random accesses, only the pages required by the read are fetched, and the readahead window is not altered.
Example Walkthrough
The article walks through a sequence of 4 KB reads covering pages 0‑7, followed by a random seek to page 108, showing how the readahead window is created, expanded, and finally bypassed for the random read. Diagrams illustrate the window size and the position of the PageReadahead marker at each step.
Critical Code Analysis
The core functions involved are:
generic_file_buffered_read → page_cache_sync_readahead (synchronous)
generic_file_buffered_read → page_cache_async_readahead (asynchronous)
Both call ondemand_readahead, which computes the prefetch size and invokes ra_submit.
static unsigned long
ondemand_readahead(struct address_space *mapping,
struct file_ra_state *ra, struct file *filp,
bool hit_readahead_marker, pgoff_t offset,
unsigned long req_size)
{
struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
unsigned long max_pages = ra->ra_pages;
unsigned long add_pages;
pgoff_t prev_offset;
/* ... logic to decide sequential vs random, adjust ra->size, ra->async_size ... */
return ra_submit(ra, mapping, filp);
}

The function expands the readahead window when the offset matches the expected sequential pattern or when a readahead marker is hit. It also handles initial reads, oversize reads, and fallback to simple on‑demand reads for random access.
Optimization Notes
The article notes a kernel bug where pages added by a failed prefetch remain in the page cache marked PageError, causing subsequent reads to fall back to single‑page reads and degrade performance. This was fixed in kernel 5.18 (see LKML link).
Coolpad Technology Team