
Why cp Copies a 100GB File Instantly: Sparse Files and Inode Basics

An unexpectedly fast cp of a 100 GB file leads to the concept of sparse files, whose logical size differs from their physical disk usage, and to how file systems use inodes, block allocation, and multi‑level indexing to manage storage efficiently.

Efficient Ops

Thoughts Triggered by cp

A colleague was shocked that copying a 100 GB file with <code>cp</code> finished in less than a second. <code>ls -lh</code> confirmed the file size, but the copy speed seemed impossible for a typical SATA disk.

<code># ls -lh
-rw-r--r-- 1 root root 100G Mar  6 12:22 test.txt</code>

Timing the copy showed:

<code># time cp ./test.txt ./test.txt.cp

real 0m0.107s
user 0m0.008s
sys 0m0.085s</code>

A SATA drive can write at about 150 MB/s, so copying 100 GB should take roughly 11 minutes, not a fraction of a second.
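The estimate is plain division. A minimal sketch, assuming a sustained sequential write rate of 150 MiB/s and ignoring caches and seek time (the helper name is hypothetical):

```python
def estimate_copy_seconds(size_bytes, bytes_per_second):
    """Time to write size_bytes at a constant throughput."""
    return size_bytes / bytes_per_second


size = 107374182400        # 100 GiB, the size of test.txt in bytes
rate = 150 * 1024 ** 2     # assumed sustained write rate: 150 MiB/s
seconds = estimate_copy_seconds(size, rate)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} min")
# prints: 683 s, about 11.4 min
```

A measured 0.107 s is therefore several thousand times faster than the disk could possibly write 100 GB.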

Running <code>du -sh ./test.txt</code> reported only 2.0M (about 2 MB), indicating that the apparent size does not reflect the actual disk usage.

<code># du -sh ./test.txt
2.0M ./test.txt</code>

The <code>stat</code> command showed:

<code># stat ./test.txt
  File: ./test.txt
  Size: 107374182400 Blocks: 4096   IO Block: 4096 regular file
  ...</code>

Key observations:

Size is the logical file size in bytes (what users see).

Blocks represents the disk space actually allocated. It is counted in 512‑byte units, so 4096 × 512 B = 2 MB, which matches the du output.
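The same two numbers can be read programmatically. A minimal sketch, assuming Linux, where `st_blocks` is counted in 512‑byte units regardless of the file system's own block size (the function names are hypothetical):

```python
import os


def disk_usage(path):
    """Return (logical_bytes, allocated_bytes) for path."""
    st = os.stat(path)
    # st_size is the logical length of the file.
    # st_blocks counts allocated 512-byte units (assumption: Linux semantics).
    return st.st_size, st.st_blocks * 512


def looks_sparse(path):
    """Heuristic: fewer bytes are allocated than the logical size claims."""
    logical, allocated = disk_usage(path)
    return allocated < logical
```

Calling `disk_usage("./test.txt")` on the file from the session above would be expected to return the Size value and 512 times the Blocks value. `looks_sparse` is only a heuristic: file systems with transparent compression can also report less allocated space than the logical size.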

This discrepancy calls for a closer look at how file systems work.

File System Basics

A file system is simply a container for storing data, analogous to a luggage storage service: the file name is the label, metadata is the tag, the file itself is the luggage, the storage room is the disk, and the overall management mechanism is the file system.

Space management involves dividing the disk into fixed‑size blocks (typically 4 KB). Data is stored in these blocks, and an <code>inode</code> records which blocks belong to a file.

inode / block concepts

An inode contains metadata and an array of block pointers. The 12 direct pointers each hold one block number, so with 4 KB blocks they cover the first 48 KB of a file. Larger files use indirect pointers:

Direct index (12 pointers)

Single indirect (points to a block that holds more pointers)

Double indirect

Triple indirect

Capacity calculations (with 4 KB blocks and 4‑byte block pointers, one block holds 1024 pointers):

Direct: 12 × 4 KB = 48 KB

Single indirect: 1024 × 4 KB ≈ 4 MB

Double indirect: 1024 × 4 MB ≈ 4 GB

Triple indirect: 1024 × 4 GB ≈ 4 TB

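Thus, with this pointer scheme, a single file on a typical ext2 file system can address up to about 4 TB of data (other limits in the on‑disk format can make the practical maximum smaller).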
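The capacity figures can be reproduced with a few lines of arithmetic. This is a sketch under the assumptions of 4 KB blocks, 4‑byte block pointers, and 12 direct pointers; the function names are hypothetical and do not come from any file system code:

```python
BLOCK_SIZE = 4096      # bytes per block (assumption: 4 KB blocks)
POINTER_SIZE = 4       # bytes per block pointer (assumption: 32-bit block numbers)
DIRECT_POINTERS = 12   # direct pointers stored in the inode


def index_capacities():
    """Bytes addressable through each level of the pointer tree."""
    per_block = BLOCK_SIZE // POINTER_SIZE  # 1024 pointers fit in one block
    return {
        "direct": DIRECT_POINTERS * BLOCK_SIZE,
        "single": per_block * BLOCK_SIZE,
        "double": per_block ** 2 * BLOCK_SIZE,
        "triple": per_block ** 3 * BLOCK_SIZE,
    }


def level_for_offset(offset):
    """Name the index level that maps the block containing a byte offset."""
    block = offset // BLOCK_SIZE
    per_block = BLOCK_SIZE // POINTER_SIZE
    levels = [
        ("direct", DIRECT_POINTERS),
        ("single", per_block),
        ("double", per_block ** 2),
        ("triple", per_block ** 3),
    ]
    for name, count in levels:
        if block < count:
            return name
        block -= count  # skip the blocks covered by this level
    raise ValueError("offset is beyond what triple indirection can address")
```

For example, `level_for_offset(48 * 1024)` lands on the first block reached through the single indirect pointer, because the 12 direct pointers cover only bytes 0 to 49151.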

Why cp Is So Fast

The observed file is a sparse file: its logical size is 100 GB, but only about 2 MB of blocks are actually allocated, as du and stat showed. Regions that were never written do not allocate physical blocks and read back as zeros.
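A sparse file is easy to produce: seek past the end of the file and write only where data is needed. A minimal sketch, assuming a file system that supports holes (the helper names are hypothetical):

```python
def make_sparse_file(path, logical_size, payload=b"data"):
    """Create a file of logical_size bytes whose only written bytes are at the end."""
    if logical_size < len(payload):
        raise ValueError("logical_size must be at least len(payload)")
    with open(path, "wb") as f:
        # Seeking forward without writing leaves a hole before the payload.
        f.seek(logical_size - len(payload))
        f.write(payload)


def read_at(path, offset, length):
    """Read length bytes starting at offset."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Reading from the hole, for instance `read_at(path, 0, 4096)`, returns zero bytes even though no block may ever have been allocated for that range. On a file system without hole support the file is still correct, it simply occupies its full size on disk.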

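When copying such a file, <code>cp</code> skips the holes and copies only the allocated blocks, so the operation finishes quickly. GNU <code>cp</code> detects sparse source files by default (<code>--sparse=auto</code>).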

Key point: The file size stored in the inode is just a metadata attribute; actual disk usage depends on the number of allocated blocks.
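The idea of copying only the allocated regions can be sketched with the `SEEK_DATA` and `SEEK_HOLE` options of `lseek`. This is an illustration of the technique, not how `cp` is implemented; it assumes Linux and a Python build that exposes `os.SEEK_DATA`, and `copy_sparse` is a hypothetical name:

```python
import errno
import os


def copy_sparse(src_path, dst_path, chunk=1 << 20):
    """Copy only the data extents of src_path; holes stay holes in dst_path."""
    src = os.open(src_path, os.O_RDONLY)
    dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        size = os.fstat(src).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next region that actually holds data.
                start = os.lseek(src, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # only a hole remains up to EOF
                    break
                raise
            # The data region ends where the next hole begins.
            end = os.lseek(src, start, os.SEEK_HOLE)
            pos = start
            while pos < end:
                buf = os.pread(src, min(chunk, end - pos), pos)
                if not buf:  # file shrank while copying
                    break
                os.pwrite(dst, buf, pos)
                pos += len(buf)
            offset = end
        # Set the logical size so that a trailing hole is preserved.
        os.ftruncate(dst, size)
    finally:
        os.close(src)
        os.close(dst)
```

On a file system that does not track holes, `SEEK_DATA` and `SEEK_HOLE` report the whole file as one data region, so the function degrades to an ordinary full copy.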

Summary

File systems achieve efficient storage by:

Dividing the disk into fixed‑size blocks.

Using inodes to map a file to its blocks.

Allocating blocks lazily, allowing sparse files where logical size exceeds physical usage.

This three‑step approach explains why copying a seemingly huge file can be instantaneous.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
