Understanding Linux File System: Disk Structure, Inodes, and Performance Optimizations
This article explains the physical structure of mechanical disks, partitioning strategies, inode and block allocation, and how filename length and directory layout affect Linux file system performance, providing practical experiments and tips for improving disk I/O.
In 2012 I wrote an article titled "Linux File System Ten Questions". It is no longer available on the official site, so I am reposting the full content here.
The article is organized around ten common questions about Linux file systems, covering topics such as random‑read performance tricks, space usage of empty files and directories, where filenames are stored, maximum filename length, impact of long names on performance, maximum number of files per directory, actual disk space taken by a 1 KB file, the real amount of data read when requesting only 2 bytes, and ways to improve disk I/O speed.
1. Disk Composition and Partitioning
We first discuss the physical structure of a mechanical hard disk, which consists of platters (disk faces), heads, tracks, cylinders, and sectors. The article includes a diagram of these components and shows how the fdisk command can reveal the number of heads, cylinders, sectors per track, and sector size (e.g., 255 heads, 3263 cylinders, 63 sectors/track, 512‑byte sectors), resulting in a calculated capacity of about 26.8 GB.
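The capacity figure follows directly from the geometry reported by fdisk. A minimal sketch of that arithmetic, using the example values quoted above (the function name is only illustrative):

```python
# Geometry values taken from the fdisk example in the text.
HEADS = 255
CYLINDERS = 3263
SECTORS_PER_TRACK = 63
SECTOR_SIZE = 512  # bytes per sector


def chs_capacity_bytes(heads, cylinders, sectors_per_track, sector_size):
    """Capacity implied by a cylinder/head/sector geometry."""
    return heads * cylinders * sectors_per_track * sector_size


capacity = chs_capacity_bytes(HEADS, CYLINDERS, SECTORS_PER_TRACK, SECTOR_SIZE)
print(f"{capacity} bytes, about {capacity / 1e9:.1f} GB")
```

The result is expressed in decimal gigabytes (10^9 bytes), which is the unit that matches the 26.8 GB figure.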
Two partitioning schemes are compared: (1) dividing the disk by platters and (2) dividing by cylinders. The second scheme reduces seek time because the head moves within a smaller range of tracks, which is why operating systems adopt cylinder‑based partitioning. The article explains the three steps of a disk I/O operation (seek, rotation, transfer) and gives the formula: IO time = seek time + rotation latency + transfer time. It also provides typical values for a 10,000 RPM disk: rotation latency of up to 6 ms (the time of one full revolution) and seek time of 3‑15 ms.
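The formula can be tried out with the typical values quoted above. The following is a rough sketch only: the transfer time is a placeholder assumption that does not come from the article, and the function names are illustrative.

```python
def full_rotation_ms(rpm):
    """Time of one full platter revolution, i.e. the worst-case rotation latency."""
    return 60_000 / rpm


def io_time_ms(seek_ms, rotation_ms, transfer_ms):
    """IO time = seek time + rotation latency + transfer time."""
    return seek_ms + rotation_ms + transfer_ms


# Assumption for illustration only: the article gives no transfer time.
ASSUMED_TRANSFER_MS = 0.1

worst_rotation = full_rotation_ms(10_000)   # one full revolution at 10,000 RPM
best_case = io_time_ms(3, 0.0, ASSUMED_TRANSFER_MS)
worst_case = io_time_ms(15, worst_rotation, ASSUMED_TRANSFER_MS)

print(f"full rotation: {worst_rotation} ms")
print(f"best case:  {best_case:.1f} ms")
print(f"worst case: {worst_case:.1f} ms")
```

Under these assumptions the mechanical parts (seek and rotation) dominate the total, which is why reducing head movement matters so much for random reads.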
Images of fdisk output illustrate that the OS indeed uses the cylinder‑based approach.
2. Directories and Files
Creating an empty directory and an empty file and then listing them with ls -l shows that the directory occupies 4096 bytes while the file shows 0 bytes. The article asks why the directory takes 4 KB, why an empty file shows 0 bytes, and where the metadata (name, permissions, etc.) is stored.
Running df -i before and after creating an empty file reveals that an inode is consumed. Using dumpe2fs shows that each inode is 256 bytes on this system, confirming that an empty file actually occupies one inode (256 bytes) even though its data block size is 0.
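The same observation can be reproduced without ls or df by reading the metadata with os.lstat. This is a small sketch, not the author's original experiment; the file and directory names are made up, and the sizes reported for the directory depend on the file system that hosts the temporary directory.

```python
import os
import tempfile


def describe(path):
    """Return inode number, logical size, and allocated bytes for a path."""
    st = os.lstat(path)
    return {
        "inode": st.st_ino,
        "size": st.st_size,
        # On Linux, st_blocks counts 512-byte units regardless of block size.
        "allocated": st.st_blocks * 512,
    }


with tempfile.TemporaryDirectory() as base:
    empty_file = os.path.join(base, "empty.txt")   # hypothetical name
    empty_dir = os.path.join(base, "empty_dir")    # hypothetical name
    open(empty_file, "w").close()
    os.mkdir(empty_dir)
    file_info = describe(empty_file)
    dir_info = describe(empty_dir)

print("empty file:", file_info)
print("empty dir: ", dir_info)
```

Both entries receive their own inode number even though the empty file has a logical size of 0, which matches what df -i shows.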
Similarly, a newly created empty directory consumes one inode plus one 4 KB block (the block size of the filesystem). Experiments that create many files with names of different lengths demonstrate that filenames are stored in the directory's blocks, and that the number of blocks a directory needs depends on the total length of the filenames it holds, not only on the number of files.
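To get a feel for why name length matters, the sketch below estimates how many blocks a directory needs. It assumes the classic ext2/3/4 linear directory entry layout (an 8-byte header followed by the name, padded to a 4-byte boundary, with entries never crossing a block boundary). It ignores hashed directory indexes, checksum tails, and slack left by deleted entries, so treat the numbers as an approximation rather than what a real file system will report.

```python
def dirent_size(name, header=8, align=4):
    """Approximate on-disk size of one directory entry (assumed ext-style layout)."""
    name_len = len(name.encode("utf-8"))
    return (header + name_len + align - 1) // align * align


def estimate_dir_blocks(names, block_size=4096):
    """Greedy estimate of the blocks needed to hold the given filenames."""
    blocks = 1
    used = dirent_size(".") + dirent_size("..")  # every directory has these two
    for name in names:
        size = dirent_size(name)
        if used + size > block_size:
            # Entries do not span blocks, so start a new one.
            blocks += 1
            used = 0
        used += size
    return blocks


short_names = [f"{i:032d}" for i in range(100)]   # 100 names of 32 bytes
long_names = [f"{i:0255d}" for i in range(100)]   # 100 names of 255 bytes

print("short names:", estimate_dir_blocks(short_names), "block(s)")
print("long names: ", estimate_dir_blocks(long_names), "block(s)")
```

With the same number of files, the long names need several times as many directory blocks as the short ones under this model.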
The article notes that Linux limits filenames to 255 bytes, and that a single directory holding a very large number of files degrades performance because the OS must read many directory blocks during operations such as ls. It recommends keeping the number of files per directory below ten thousand.
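Note that the limit is counted in bytes, not characters, so names with multi-byte characters hit it sooner. A minimal sketch of a length check, assuming names are encoded as UTF-8 (the helper names are illustrative; os.pathconf is used only to show how the limit of the current directory can be queried on Unix-like systems):

```python
import os

NAME_MAX_BYTES = 255  # limit quoted in the text


def name_length_bytes(name):
    """Length of a filename in bytes, assuming UTF-8 encoding."""
    return len(name.encode("utf-8"))


def fits_name_limit(name, limit=NAME_MAX_BYTES):
    """True if the encoded name fits within the filename length limit."""
    return name_length_bytes(name) <= limit


print(fits_name_limit("a" * 255))   # ASCII: one byte per character
print(fits_name_limit("a" * 256))
print(fits_name_limit("文" * 85))    # three bytes per character in UTF-8
print(fits_name_limit("文" * 86))

# Query the limit for the current directory where the platform supports it.
if hasattr(os, "pathconf"):
    print("NAME_MAX here:", os.pathconf(".", "PC_NAME_MAX"))
```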
Creating a file that contains only a single space shows that the OS allocates a full 4 KB data block for any non‑empty file, because the block is the smallest unit of allocation. For the same reason, reading just 2 bytes still triggers a read of at least one 4 KB block. The article relates this to the principle of locality and explains why pre‑allocating space for large files can improve sequential I/O performance.
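The rounding to whole blocks can be sketched as follows. The helper assumes a 4 KB block size as in the article; the value actually reported by os.stat depends on the file system hosting the temporary file (some store tiny files differently), so the measured figure is printed rather than assumed.

```python
import os
import tempfile


def allocated_bytes(size, block_size=4096):
    """Space consumed when data is stored in whole blocks (ceiling division)."""
    return -(-size // block_size) * block_size


for size in (0, 1, 1024, 4096, 4097):
    print(f"{size:>5} bytes of data -> {allocated_bytes(size):>5} bytes on disk")

# Compare with what the local file system reports for a one-space file.
with tempfile.TemporaryDirectory() as base:
    path = os.path.join(base, "one_space.txt")  # hypothetical name
    with open(path, "w") as f:
        f.write(" ")
    st = os.stat(path)
    logical_size = st.st_size
    reported_on_disk = st.st_blocks * 512  # st_blocks is in 512-byte units

print("logical size:", logical_size, "reported on disk:", reported_on_disk)
```

Under the 4 KB assumption, a 1 KB file and a 1-byte file cost the same amount of disk space.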
3. Closing Remarks
The author emphasizes that block size and inode size are set when formatting the disk and should be chosen based on workload: larger blocks for big files, smaller blocks for many tiny files. Monitoring inode usage is crucial because a filesystem can run out of inodes even when free space remains.
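Inode usage can be watched with df -i, or programmatically. The following sketch uses os.statvfs; the function name and the choice of "/" as the path are assumptions for illustration, and some file systems report zero total inodes, in which case the percentage is shown as 0.

```python
import os


def inode_usage(path="/"):
    """Return (total, used, percent_used) inodes for the file system holding path."""
    st = os.statvfs(path)
    total = st.f_files          # total inodes
    used = total - st.f_ffree   # f_ffree is the number of free inodes
    percent = 100 * used / total if total else 0.0
    return total, used, percent


total, used, percent = inode_usage("/")
print(f"inodes: {used}/{total} used ({percent:.1f}%)")
```

A check like this can be run alongside free-space monitoring, since the two can run out independently.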
A thought question is posed about why compressing a directory before copying large numbers of small files speeds up the transfer.
Finally, the author reflects that deep knowledge of operating‑system internals (the "inner skill") remains valuable over time, whereas many high‑level development tools become obsolete quickly.
Refining Core Development Skills
Fei has over 10 years of development experience at Tencent and Sogou. Through this account, he shares his deep insights on performance.