Understanding Linux Kernel Readahead: Concepts, Benefits, Drawbacks, and Code Analysis
This article explains the design background, performance benefits, potential drawbacks, synchronous and asynchronous mechanisms, key data structures, operational principles, illustrative examples, and critical code paths of Linux kernel file readahead, providing a comprehensive technical overview for developers and system engineers.
Design Background – File access is often sequential; after reading a range [A, B], the next likely range is [B+1, B+N]. Prefetching the next range into RAM reduces costly disk I/O and improves performance.
Benefits of Readahead
Avoids costly synchronous disk I/O by reading data into the page cache ahead of time, so subsequent reads are served from memory.
Improves storage stack and device processing efficiency by merging consecutive I/O requests.
Reduces load from hard and soft interrupts.
Prevents storage stack congestion that can delay read responses.
Minimizes mechanical head movement on HDDs by keeping accesses sequential.
Drawbacks of Readahead
For random reads, prefetched data may never be used, wasting bandwidth and memory.
Excessive prefetching can increase I/O load.
Large prefetch windows can raise memory pressure.
Synchronous vs Asynchronous Readahead
Synchronous readahead is triggered when the requested page is missing from the page cache: the kernel reads multiple pages (some needed immediately, some for future use) and returns right after submitting the BIO, without waiting for the pages to become up‑to‑date. Asynchronous readahead is triggered while the requested data is already cached: every page it fetches is solely for future use, since the triggering read does not need that data at the moment.
Key Data Structure
/* Track a single file's readahead state */
struct file_ra_state {
pgoff_t start; /* where readahead started */
unsigned int size; /* # of readahead pages */
unsigned int async_size;/* launch async readahead when only this many pages remain */
unsigned int ra_pages; /* maximum readahead window */
unsigned int mmap_miss; /* cache miss stat for mmap accesses */
loff_t prev_pos; /* last read() position */
};

The size field represents the current window size, while async_size indicates how many pages remain before triggering asynchronous readahead.
Operational Principle
When a read requests N pages, the kernel may prefetch M pages (M > N). A PageReadahead marker is set on one of the prefetched pages; encountering this marker signals that the readahead window is depleting and a new asynchronous prefetch should be launched.
If the access pattern is sequential, the window is expanded (typically 2× or 4×, capped by ra_pages) and additional pages are fetched. For random accesses, only the pages required by the read are fetched, and the readahead window is not altered.
Example Walkthrough
The article walks through a sequence of 4 KB reads covering pages 0‑7, followed by a random seek to page 108, showing how the readahead window is created, expanded, and finally bypassed for the random read. Diagrams illustrate the window size and the position of the PageReadahead marker at each step.
Critical Code Analysis
The core functions involved are:
generic_file_buffered_read → page_cache_sync_readahead (synchronous)
generic_file_buffered_read → page_cache_async_readahead (asynchronous)
Both call ondemand_readahead, which computes the prefetch size and invokes ra_submit.
static unsigned long
ondemand_readahead(struct address_space *mapping,
struct file_ra_state *ra, struct file *filp,
bool hit_readahead_marker, pgoff_t offset,
unsigned long req_size)
{
struct backing_dev_info *bdi = inode_to_bdi(mapping->host);
unsigned long max_pages = ra->ra_pages;
unsigned long add_pages;
pgoff_t prev_offset;
/* ... logic to decide sequential vs random, adjust ra->size, ra->async_size ... */
return ra_submit(ra, mapping, filp);
}

The function expands the readahead window when the offset matches the expected sequential pattern or when a readahead marker is hit. It also handles initial reads, oversize reads, and fallback to simple on‑demand reads for random access.
Optimization Notes
The article notes a kernel bug where pages added by a failed prefetch remain in the page cache marked PageError, causing subsequent reads to fall back to single‑page reads and degrade performance. This was fixed in kernel 5.18 (see LKML link).
Coolpad Technology Team