How Journal File Systems Prevent Data Loss After Crashes
Journal file systems protect against data corruption caused by power loss or crashes by recording each write operation as a transaction in a dedicated log and committing the change only after the log entry is safely on disk; after a crash, the log is replayed to restore consistency.
The key problem a file system must solve is preventing data corruption after power loss or system crashes. Such failures cause damage because file writes are not atomic; they involve both user data and metadata (superblock, inode bitmap, inode, data block bitmap), so an interruption can leave the system inconsistent.
A simplified write sequence involves:
1. Allocate a data block by marking it used in the data block bitmap.
2. Add a pointer to that block in the file's inode.
3. Write the user data into the block.
If any step is interrupted, various inconsistencies arise:
Step 2 completed, step 3 not: the file thinks it owns the block, but the block contains garbage.
Step 2 completed, step 1 not: metadata says the block is free while the file has claimed it, leading to possible double allocation.
Step 1 completed, step 2 not: a block is allocated but unused, wasting space.
Step 3 completed, step 2 not: user data is written but the file does not reference the block.
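The four failure cases above can be sketched as a small simulation. This is a hypothetical model, not real file-system code: the step names (`bitmap`, `inode`, `data`) and the `diagnose` helper are invented, and a "crash" is modeled as an arbitrary subset of the three writes reaching disk.

```python
# The three on-disk writes a file append involves (names are illustrative).
STEPS = ("bitmap", "inode", "data")

def diagnose(persisted):
    """Classify the on-disk state when only `persisted` writes survive a crash."""
    p = set(persisted)
    if p == set(STEPS) or not p:
        return "consistent"  # all-or-nothing is safe; the problem is partial writes
    if "inode" in p and "data" not in p:
        return "file points at a garbage block"
    if "inode" in p and "bitmap" not in p:
        return "block in use but marked free (double allocation risk)"
    if p == {"bitmap"}:
        return "block allocated but unreachable (space leak)"
    if "data" in p and "inode" not in p:
        return "data written but no file references it"
    return "inconsistent"

print(diagnose(("inode",)))          # file points at a garbage block
print(diagnose(("bitmap",)))         # block allocated but unreachable (space leak)
```

Note that only the empty set and the full set are "consistent": that all-or-nothing property is exactly the atomicity a journal provides.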
Journal file systems were created to solve these issues. Before performing the actual write, the system records each step as a transaction in a dedicated log (write‑ahead logging). Only after the log is safely stored does the system proceed to write metadata and user data to disk (checkpoint). If a crash occurs, the log is replayed on the next mount to restore consistency.
Because a log entry may be larger than the disk’s atomic write size (typically 512 bytes), each entry is terminated with a special end‑marker. Only entries with a valid end‑marker are considered complete; incomplete entries are discarded, ensuring the log contains only whole transactions.
Log space is limited and reused cyclically, so logs are often called circular logs. The journal workflow consists of:
Journal write – record the transaction in the log.
Journal commit – write the end‑marker after the log entry is safely stored.
Checkpoint – perform the real write of metadata and user data.
Free – reclaim the log space.
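The four phases above can be sketched as a toy circular journal. Everything here is a simplified model with invented names (`CircularJournal`, `journal_write`, and so on); a real implementation works with disk blocks, not Python dictionaries.

```python
from collections import deque

class CircularJournal:
    """Toy model of the journal-write / commit / checkpoint / free cycle."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.log = deque()   # pending transactions, in log order
        self.disk = {}       # the blocks' final on-disk locations

    def journal_write(self, txid, blocks):
        # Phase 1: record the transaction in the (bounded, circular) log.
        if len(self.log) >= self.capacity:
            raise RuntimeError("journal full: checkpoint first")
        self.log.append({"txid": txid, "blocks": blocks, "committed": False})

    def journal_commit(self, txid):
        # Phase 2: the end-marker is on disk; the transaction is now durable.
        for tx in self.log:
            if tx["txid"] == txid:
                tx["committed"] = True

    def checkpoint(self):
        # Phases 3 and 4: apply committed transactions to their final
        # locations, then free their log space for reuse.
        while self.log and self.log[0]["committed"]:
            tx = self.log.popleft()
            self.disk.update(tx["blocks"])

j = CircularJournal(capacity=2)
j.journal_write(1, {"inode#12": "points to block 7"})
j.journal_commit(1)
j.checkpoint()
print(j.disk)   # the metadata update has reached its final location
```

A crash before `journal_commit` loses only the uncommitted entry; a crash after commit but before `checkpoint` is recovered by replaying the log, which is why reclaiming log space must wait until the checkpoint completes.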
When both metadata and user data are logged (Data Journaling), each write is performed twice, which can halve performance, especially for large files. An alternative, Metadata (or Ordered) Journaling, logs only metadata; user data is written first, then the log, guaranteeing that a valid log implies valid user data. Most file systems, such as Linux EXT3, support both modes.
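On Linux, ext3 and ext4 expose this choice as a mount option. The commands below are illustrative alternatives (device and mount-point paths are examples, not from the source):

```shell
# Pick exactly one mode per mount; these lines are alternatives.
mount -o data=journal   /dev/sdb1 /mnt/data   # data journaling: data and metadata both logged
mount -o data=ordered   /dev/sdb1 /mnt/data   # ordered: data written before its metadata commits (the default)
mount -o data=writeback /dev/sdb1 /mnt/data   # metadata-only journaling, no data-ordering guarantee
```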
Reference: Crash Consistency – FSCK and Journaling.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, who regularly publish widely read original technical articles. We focus on operations transformation and aim to accompany you throughout your operations career, growing together.