
Understanding Linux PageCache: How the OS Accelerates File Reads and Writes

PageCache is a kernel-managed memory cache that keeps disk data in RAM, turning repeated file reads and writes into pure memory accesses. This article explains how it works, how its size adapts dynamically to available memory, and demonstrates the speedup with a simple large-file read experiment on Linux.


Storage Media Performance Gap

Different storage devices in a computer have vastly different speeds; a mechanical hard drive can be hundreds of times slower than RAM for read/write operations. Directly reading data from disk therefore hurts responsiveness, so operating systems introduce the PageCache mechanism to bridge this gap.

PageCache works like CPU caches (L1/L2/L3) but is implemented in software at the OS level rather than hardware.

What Is PageCache?

PageCache consists of memory pages whose contents mirror physical blocks on disk. RAM that processes are not using for their own code and data can be handed to PageCache, so its size grows and shrinks dynamically with available memory.

Because it uses any idle memory, PageCache can expand to occupy all free RAM, and it can also shrink when the system needs to free memory.
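The kernel exposes the current PageCache size through /proc/meminfo, so this dynamic sizing is easy to watch. A minimal, Linux-specific sketch:

```shell
# "Cached" in /proc/meminfo reports the current PageCache size
# (page-cache pages, excluding swap cache), in kilobytes.
awk '/^Cached:/ {print $2, $3}' /proc/meminfo
```

Re-running this after reading a large file shows the Cached value growing; under memory pressure it shrinks again as the kernel reclaims cache pages.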

File Read

When a process issues a read system call, the kernel first checks whether the requested data resides in PageCache. If it does, the kernel returns the data directly from memory – a cache hit. If not, a cache miss occurs, the kernel performs a disk I/O operation, reads the data, and fills the corresponding page in PageCache for future accesses.

Only the pages that are actually accessed are cached; for example, a four‑page file may have only the first page stored in PageCache if that page is the one frequently read.
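This per-page behavior can be sketched with a partial read. The file path below is illustrative, and the optional fincore tool (from util-linux) is only an assumption about what is installed on your system:

```shell
# Create a 4 MiB file, then touch only its first 1 MiB.
# Only the pages actually read (plus any kernel read-ahead) are
# pulled into PageCache; the rest stays on disk only.
dd if=/dev/zero of=/tmp/pagedemo bs=1M count=4 2>/dev/null
dd if=/tmp/pagedemo of=/dev/null bs=1M count=1 2>/dev/null

# If util-linux's fincore is installed, it reports how many of the
# file's pages are currently resident in PageCache:
#   fincore /tmp/pagedemo

wc -c < /tmp/pagedemo   # the file itself is still 4 MiB on disk
rm /tmp/pagedemo
```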

File Write

When a process calls write, the kernel typically follows a write-back policy: the data is first written to PageCache and the pages are marked dirty, while the actual disk write is deferred. This makes the write appear as a fast, pure-memory operation, and a file-copy progress bar often reflects the amount of data placed into PageCache rather than the real disk-write progress.
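The gap between the cached write and the real disk write can be made visible with dd: conv=fsync forces the data onto the storage device before dd exits. A sketch, assuming /tmp sits on a real filesystem (on a tmpfs mount both runs are memory-speed):

```shell
# Buffered write: dd returns as soon as the data is in PageCache,
# so the reported throughput is close to RAM speed.
dd if=/dev/zero of=/tmp/wbdemo bs=1M count=64 2>&1 | tail -1

# Synchronous write: conv=fsync flushes to disk before dd exits,
# so the reported throughput reflects the storage device.
dd if=/dev/zero of=/tmp/wbdemo bs=1M count=64 conv=fsync 2>&1 | tail -1

rm /tmp/wbdemo
```

dd prints its summary line on stderr, hence the 2>&1 redirection before tail.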

Dirty pages are periodically flushed to disk by the kernel, synchronizing the in‑memory cache with the persistent storage.
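How often the kernel wakes up to write dirty pages back is tunable; on Linux the interval and thresholds live under /proc/sys/vm:

```shell
# Flusher wake-up interval, in hundredths of a second
# (default 500, i.e. every 5 seconds).
cat /proc/sys/vm/dirty_writeback_centisecs

# Dirty pages older than this (centiseconds) are written out
# on the next flusher run.
cat /proc/sys/vm/dirty_expire_centisecs

# When dirty data exceeds this percentage of memory, background
# writeback kicks in.
cat /proc/sys/vm/dirty_background_ratio
```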

Experimental Verification

On a Linux machine, we created a 1 GB file filled with random data:

dd if=/dev/urandom of=testfile bs=1M count=1024

Then we flushed dirty pages with sync and dropped the PageCache (requires root; writing 1 to drop_caches frees clean page-cache pages):

sync && echo 1 > /proc/sys/vm/drop_caches

Reading the file for the first time forces a disk I/O:

$ time cat testfile > /dev/null
real    0m6.176s
user    0m0.028s
sys     0m0.731s

The output shows that the read took about 6 seconds, dominated by disk latency. After this read, the file resides in PageCache. Reading it a second time yields:

$ time cat testfile > /dev/null
real    0m0.309s
user    0m0.011s
sys     0m0.298s

The second read completes in roughly 0.3 seconds, nearly 20× faster, because it is served entirely from memory.

Conclusion

PageCache provides a substantial performance boost for repeated file accesses and sequential reads of large files. It is essential in scenarios such as compiling code, reading configuration files, video editing, and processing database logs, where the same data is accessed multiple times.

Performance · Memory Management · Linux · File I/O · PageCache
Written by IT Services Circle

Delivering cutting-edge internet insights and practical learning resources. We're a passionate and principled IT media platform.