
How Ext4 Stores Data: Inodes, Extents, and Directory Indexes Explained

This article explains how the ext4 file system organizes data using inodes, direct and indirect blocks, extents, and directory indexing, detailing the structures, code definitions, and performance implications for both small and large files.

Ops Development Stories

Data Placement in ext2/3 and ext4

In ext2 and ext3, the first 12 block pointers in an inode, i_block[0] through i_block[11], store data block locations directly. Larger files use indirect blocks: i_block[12] points to a block that holds further block numbers (single indirect), while i_block[13] and i_block[14] point to double and triple indirect blocks, respectively.
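
As a rough illustration of this mapping, the sketch below classifies a logical block index into the direct, single, double, or triple indirect range. It assumes 4-byte block numbers, so a block of block_size bytes holds block_size / 4 pointers. The names classify_block, map_level, and NDIR_BLOCKS are hypothetical and are not taken from the kernel.

```c
#include <stdint.h>

/* Hypothetical helper, not kernel code: which part of i_block[] serves a
 * given logical block in the ext2/ext3 indirect scheme? */
enum map_level { MAP_DIRECT, MAP_SINGLE, MAP_DOUBLE, MAP_TRIPLE, MAP_TOO_BIG };

#define NDIR_BLOCKS 12u  /* i_block[0..11] are direct pointers */

static enum map_level classify_block(uint64_t lblk, uint32_t block_size)
{
    /* Assumption: each on-disk block number is 4 bytes wide. */
    uint64_t ptrs = block_size / 4;

    if (lblk < NDIR_BLOCKS)
        return MAP_DIRECT;              /* i_block[0..11] */
    lblk -= NDIR_BLOCKS;
    if (lblk < ptrs)
        return MAP_SINGLE;              /* via i_block[12] */
    lblk -= ptrs;
    if (lblk < ptrs * ptrs)
        return MAP_DOUBLE;              /* via i_block[13] */
    lblk -= ptrs * ptrs;
    if (lblk < ptrs * ptrs * ptrs)
        return MAP_TRIPLE;              /* via i_block[14] */
    return MAP_TOO_BIG;                 /* beyond what the scheme can address */
}
```

With 4 KiB blocks, logical block 12 is the first one that costs an extra read of an indirect block, and each further level adds one more read before the data block is reached.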

Extents in ext4

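An extent describes a contiguous run of blocks with a starting block and a length, instead of listing every block number individually. A single extent can cover up to 32,768 blocks, which is 128 MiB with 4 KiB blocks, so a large file that is laid out contiguously needs only a handful of extents. This reduces fragmentation and improves read/write performance.

[Figure: extent tree]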

Inode Metadata

An inode stores metadata such as owner, permissions, timestamps, and pointers to data blocks. In ext4 the default inode size is 256 bytes. The structure of an ext4 inode is shown below:

<code>struct ext4_inode {
  __le16 i_mode;      /* File mode */
  __le16 i_uid;       /* Low 16 bits of Owner Uid */
  __le32 i_size_lo;   /* Size in bytes */
  __le32 i_atime;     /* Access time */
  __le32 i_ctime;     /* Inode Change time */
  __le32 i_mtime;     /* Modification time */
  __le32 i_dtime;     /* Deletion Time */
  __le16 i_gid;       /* Low 16 bits of Group Id */
  __le16 i_links_count;/* Links count */
  __le32 i_blocks_lo; /* Blocks count */
  __le32 i_flags;     /* File flags */
  union {
    struct { __le32 l_i_version; } linux1;
    struct { __u32 h_i_translator; } hurd1;
    struct { __u32 m_i_reserved1; } masix1;
  } osd1;              /* OS dependent 1 */
  __le32 i_block[EXT4_N_BLOCKS]; /* Pointers to blocks */
  __le32 i_generation;/* File version (for NFS) */
  __le32 i_file_acl_lo;/* File ACL */
  __le32 i_size_high;
  __le32 i_obso_faddr;/* Obsoleted fragment address */
  union {
    struct {
      __le16 l_i_blocks_high;
      __le16 l_i_file_acl_high;
      __le16 l_i_uid_high;
      __le16 l_i_gid_high;
      __le16 l_i_checksum_lo;
      __le16 l_i_reserved;
    } linux2;
    struct {
      __le16 h_i_reserved1;
      __u16 h_i_mode_high;
      __u16 h_i_uid_high;
      __u16 h_i_gid_high;
      __u32 h_i_author;
    } hurd2;
    struct {
      __le16 h_i_reserved1;
      __le16 m_i_file_acl_high;
      __u32 m_i_reserved2[2];
    } masix2;
  } osd2;              /* OS dependent 2 */
  __le16 i_extra_isize;
  __le16 i_checksum_hi;/* crc32c(uuid+inum+inode) BE */
  __le32 i_ctime_extra;/* extra Change time (nsec<<2|epoch) */
  __le32 i_mtime_extra;/* extra Modification time */
  __le32 i_atime_extra;/* extra Access time */
  __le32 i_crtime;    /* File Creation time */
  __le32 i_crtime_extra;/* extra File Creation time */
  __le32 i_version_hi;/* high 32 bits for 64‑bit version */
  __le32 i_projid;    /* Project ID */
};</code>
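
Several values in this structure are split across two fields. The sketch below shows how the file size could be reassembled from i_size_lo and i_size_high, and how a timestamp could be decoded from its base field and its *_extra field, following the (nsec<<2|epoch) layout noted in the struct comments. It assumes the fields have already been converted from little-endian to host byte order. The names inode_size, decode_time, and decoded_time are hypothetical.

```c
#include <stdint.h>

/* Hypothetical host-side view of one decoded ext4 timestamp. */
struct decoded_time {
    int64_t  sec;   /* seconds since the Unix epoch */
    uint32_t nsec;  /* nanoseconds within that second */
};

/* Combine i_size_lo and i_size_high into a 64-bit byte count. */
static uint64_t inode_size(uint32_t size_lo, uint32_t size_high)
{
    return ((uint64_t)size_high << 32) | size_lo;
}

/* Decode a base field such as i_mtime plus its i_mtime_extra companion.
 * The low 2 bits of 'extra' extend the seconds value beyond 32 bits;
 * the upper 30 bits hold nanoseconds. */
static struct decoded_time decode_time(uint32_t base, uint32_t extra)
{
    struct decoded_time t;

    /* The base field is treated as a signed 32-bit seconds value. */
    t.sec  = (int64_t)(int32_t)base + ((int64_t)(extra & 0x3u) << 32);
    t.nsec = extra >> 2;
    return t;
}
```

Inodes smaller than 256 bytes do not have room for the *_extra fields, so a real reader would check i_extra_isize before using them.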

Regular File Storage Format

The data block area stores file contents. The constants defining block pointers are:

<code>#define EXT4_NDIR_BLOCKS    12
#define EXT4_IND_BLOCK      EXT4_NDIR_BLOCKS
#define EXT4_DIND_BLOCK     (EXT4_IND_BLOCK + 1)
#define EXT4_TIND_BLOCK     (EXT4_DIND_BLOCK + 1)
#define EXT4_N_BLOCKS       (EXT4_TIND_BLOCK + 1)</code>
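
To make the arithmetic explicit, the following sketch repeats the constants and checks at compile time that they yield 15 slots, so that i_block occupies 60 bytes of the inode. The typedef i_block_area is a hypothetical stand-in for the real field, using uint32_t in place of __le32.

```c
#include <stdint.h>

#define EXT4_NDIR_BLOCKS    12
#define EXT4_IND_BLOCK      EXT4_NDIR_BLOCKS
#define EXT4_DIND_BLOCK     (EXT4_IND_BLOCK + 1)
#define EXT4_TIND_BLOCK     (EXT4_DIND_BLOCK + 1)
#define EXT4_N_BLOCKS       (EXT4_TIND_BLOCK + 1)

/* Hypothetical stand-in for the i_block field: 15 slots of 32 bits each. */
typedef uint32_t i_block_area[EXT4_N_BLOCKS];

/* 12 direct slots plus one slot each for single, double, triple indirect. */
_Static_assert(EXT4_N_BLOCKS == 15, "expected 15 block slots");
_Static_assert(sizeof(i_block_area) == 60, "i_block should span 60 bytes");
```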

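In ext4, the same 60-byte i_block area can instead hold the root of an extent tree: one ext4_extent_header (12 bytes) followed by up to four ext4_extent entries (12 bytes each). When a file needs no more than four extents, eh_depth is 0, meaning the inode itself is the leaf node. Once a file needs more extents than fit in the inode, the tree splits: the extents move into separate blocks, the inode holds ext4_extent_idx entries that point to those blocks, and eh_depth increases.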

<code>struct ext4_extent_header {
  __le16 eh_magic;   /* magic number */
  __le16 eh_entries; /* number of valid entries */
  __le16 eh_max;     /* capacity of entries */
  __le16 eh_depth;   /* tree depth */
  __le32 eh_generation; /* generation of the tree */
};

struct ext4_extent {
  __le32 ee_block;   /* first logical block covered */
  __le16 ee_len;     /* number of blocks covered */
  __le16 ee_start_hi;/* high 16 bits of physical block */
  __le32 ee_start_lo;/* low 32 bits of physical block */
};

struct ext4_extent_idx {
  __le32 ei_block;   /* logical block covered */
  __le32 ei_leaf_lo; /* pointer to next level leaf */
  __le16 ei_leaf_hi; /* high 16 bits of leaf block */
  __u16  ei_unused;
};</code>
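
The sketch below shows one way a leaf's extents could be used to translate a logical block into a physical block. It assumes the fields are already in host byte order and that every extent is initialized (ee_len of at most 32768). The names extent, extent_start, and map_block are hypothetical, and the linear scan is a simplification for the few entries that fit in an inode.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-order mirror of ext4_extent (assumption: already byte-swapped). */
struct extent {
    uint32_t ee_block;     /* first logical block covered */
    uint16_t ee_len;       /* number of blocks covered */
    uint16_t ee_start_hi;  /* high 16 bits of physical block */
    uint32_t ee_start_lo;  /* low 32 bits of physical block */
};

/* Reassemble the 48-bit physical start block from its two halves. */
static uint64_t extent_start(const struct extent *ex)
{
    return ((uint64_t)ex->ee_start_hi << 32) | ex->ee_start_lo;
}

/* Translate logical block 'lblk' using the n extents of one leaf.
 * Returns 0 when no extent covers it, which this sketch treats as a hole. */
static uint64_t map_block(const struct extent *ex, size_t n, uint32_t lblk)
{
    for (size_t i = 0; i < n; i++) {
        if (lblk >= ex[i].ee_block &&
            lblk - ex[i].ee_block < ex[i].ee_len)
            return extent_start(&ex[i]) + (lblk - ex[i].ee_block);
    }
    return 0;
}
```

When eh_depth is greater than 0, the same idea applies one level up: the ext4_extent_idx entries are searched by ei_block to choose which block to read next.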

Directory and Filename Storage Format

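Directories are also files with inodes. Their data blocks contain ext4_dir_entry (or ext4_dir_entry_2) structures, which map filenames to inode numbers. The two versions differ in how the name length and file type are stored: ext4_dir_entry uses a 16-bit name_len, while ext4_dir_entry_2 splits those 16 bits into an 8-bit name_len and an 8-bit file_type.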

<code>struct ext4_dir_entry {
  __le32 inode;   /* inode number */
  __le16 rec_len; /* entry length */
  __le16 name_len;/* name length */
  char   name[EXT4_NAME_LEN]; /* file name */
};

struct ext4_dir_entry_2 {
  __le32 inode;   /* inode number */
  __le16 rec_len; /* entry length */
  __u8   name_len;/* name length */
  __u8   file_type;/* file type */
  char   name[EXT4_NAME_LEN]; /* file name */
};

enum {
  EXT4_FT_UNKNOWN,
  EXT4_FT_REG_FILE,
  EXT4_FT_DIR,
  EXT4_FT_CHRDEV,
  EXT4_FT_BLKDEV,
  EXT4_FT_FIFO,
  EXT4_FT_SOCK,
  EXT4_FT_SYMLINK,
  EXT4_FT_MAX
};</code>
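
A linear lookup in one directory block can be sketched as follows. The code walks the block using rec_len to jump from entry to entry, reading fields at the offsets implied by ext4_dir_entry_2 (inode at 0, rec_len at 4, name_len at 6, name at 8). It assumes a block size below 64 KiB so that rec_len can be read as a plain length, and it treats entries with inode 0 as unused. The helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian values from a raw on-disk buffer. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Scan one directory block for 'name'.
 * Returns the inode number, or 0 if the name is not in this block. */
static uint32_t dir_block_lookup(const uint8_t *blk, size_t blk_size,
                                 const char *name)
{
    size_t want = strlen(name);
    size_t off = 0;

    while (off + 8 <= blk_size) {
        const uint8_t *de = blk + off;
        uint32_t inode   = get_le32(de);
        uint16_t rec_len = get_le16(de + 4);
        uint8_t name_len = de[6];

        /* Stop on an entry that cannot be valid rather than run past the block. */
        if (rec_len < 8 || off + rec_len > blk_size)
            break;
        if (inode != 0 && name_len == want &&
            8 + (size_t)name_len <= rec_len &&
            memcmp(de + 8, name, want) == 0)
            return inode;
        off += rec_len;  /* rec_len leads to the next entry */
    }
    return 0;
}
```

Because names are stored without a terminating NUL, the comparison relies on name_len rather than on string functions that expect one.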

When a directory contains few entries, a linear scan is sufficient. For directories with thousands of entries, ext4 uses the dir_index feature, which introduces dx_root and dx_entry structures to index directory blocks by filename hash, so a lookup scans one leaf block instead of every block in the directory.

<code>/* dx_entry is defined first because dx_root embeds an array of it. */
struct dx_entry {
  __le32 hash;  /* lowest filename hash covered by this block */
  __le32 block; /* directory block holding entries from that hash on */
};

struct dx_root {
  struct fake_dirent dot;
  char dot_name[4];
  struct fake_dirent dotdot;
  char dotdot_name[4];
  struct dx_root_info {
    __le32 reserved_zero;
    u8    hash_version;
    u8    info_length; /* 8 */
    u8    indirect_levels;
    u8    unused_flags;
  } info;
  struct dx_entry entries[0];
};
</code>

Lookup steps: compute the filename hash, binary-search the dx_entry array for the block whose hash range covers it, read that block, and scan its entries for a matching name. Apart from reading the index itself, this reduces I/O from potentially every directory block to a single leaf block.
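
The binary-search step can be sketched as below. The sketch assumes a sorted array in which each entry's hash is the lowest hash stored in its block, with entry 0 acting as the catch-all for the smallest hashes. It leaves out details of the real on-disk format, such as the count and limit values stored alongside the entries and the handling of hash collisions that spill into the next block. The names dx_ent and dx_find_block are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Host-order mirror of dx_entry (assumption: already byte-swapped). */
struct dx_ent {
    uint32_t hash;   /* lowest hash covered by 'block' */
    uint32_t block;  /* directory block number */
};

/* Return the block of the last entry whose hash is <= target.
 * Requires n >= 1; entry 0 covers every hash below e[1].hash. */
static uint32_t dx_find_block(const struct dx_ent *e, size_t n,
                              uint32_t target)
{
    size_t lo = 1, hi = n;  /* search e[1..n-1]; e[0] is the fallback */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (e[mid].hash <= target)
            lo = mid + 1;   /* answer is at mid or to its right */
        else
            hi = mid;       /* answer is left of mid */
    }
    return e[lo - 1].block;
}
```

The search only selects a candidate block. The name comparison still happens in the linear scan of that block, since different names can share a hash.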

Disadvantages of ext‑style File Systems

The main drawback is that ext file systems lay out their structural metadata, such as inode tables and bitmaps, at format time rather than allocating it dynamically, so the number of inodes is fixed when the file system is created. Formatting very large disks (tens of terabytes) can be slow, even though runtime performance remains acceptable. Every file system makes different trade-offs, so choose based on your specific needs.

Written by

Ops Development Stories

Maintained by a like‑minded team, covering both operations and development. Topics span Linux ops, DevOps toolchain, Kubernetes containerization, monitoring, log collection, network security, and Python or Go development. Team members: Qiao Ke, wanger, Dong Ge, Su Xin, Hua Zai, Zheng Ge, Teacher Xia.
