Implementing Log Snapshotting in Raft: A Step‑by‑Step Guide
This article is a step-by-step tutorial on adding log snapshotting to a Raft-based distributed key-value store, written as part of a project based on MIT 6.824. It explains the motivation, walks through the snapshot mechanism, and presents Go code for generating, transferring, applying, and persisting snapshots to bound log growth and improve performance.
In Raft, each node keeps an ever-growing log. Snapshotting replaces old log entries with a compact snapshot of the state machine at a specific index, saving storage and reducing the amount of data that must be transmitted to lagging followers.
The snapshot mechanism has five steps:
1. Generate a snapshot on the leader.
2. Transfer the snapshot to followers.
3. Apply the snapshot to the follower's state machine.
4. Update the log to discard entries covered by the snapshot.
5. Persist the snapshot for crash recovery.
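The code later in the article calls helper methods such as `rl.idx()` and `rl.size()` that the excerpts don't show. Here is a minimal sketch of what those helpers might look like, assuming the layout the article describes: `tailLog[0]` is a dummy entry sitting at `snapLastIdx`, so logical index `i` maps to `tailLog[i - snapLastIdx]`. The `at` helper is an illustrative addition, not from the article.

```go
package main

import "fmt"

// LogEntry mirrors the entry type used by the article's RaftLog.
type LogEntry struct {
	Term    int
	Command interface{}
}

// RaftLog as defined in the article: tailLog[0] is a dummy entry at
// snapLastIdx that only carries snapLastTerm.
type RaftLog struct {
	snapLastIdx  int
	snapLastTerm int
	snapshot     []byte
	tailLog      []LogEntry
}

// size returns one past the last logical index stored in the log.
func (rl *RaftLog) size() int {
	return rl.snapLastIdx + len(rl.tailLog)
}

// idx maps a logical (global) log index to a position in tailLog.
// Valid logical indexes are [snapLastIdx, snapLastIdx+len(tailLog)-1].
func (rl *RaftLog) idx(logicIdx int) int {
	if logicIdx < rl.snapLastIdx || logicIdx >= rl.size() {
		panic(fmt.Sprintf("index %d out of [%d, %d]",
			logicIdx, rl.snapLastIdx, rl.size()-1))
	}
	return logicIdx - rl.snapLastIdx
}

// at fetches the entry at a logical index (hypothetical helper).
func (rl *RaftLog) at(logicIdx int) LogEntry {
	return rl.tailLog[rl.idx(logicIdx)]
}

func main() {
	rl := &RaftLog{
		snapLastIdx:  5,
		snapLastTerm: 2,
		tailLog: []LogEntry{
			{Term: 2}, // dummy entry at logical index 5
			{Term: 3}, // logical index 6
			{Term: 3}, // logical index 7
		},
	}
	fmt.Println(rl.size())     // 8
	fmt.Println(rl.idx(6))     // 1
	fmt.Println(rl.at(7).Term) // 3
}
```

Keeping all index translation inside these helpers means the rest of the Raft code can continue to think in global log indexes.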
RaftLog structure:

```go
type RaftLog struct {
	snapLastIdx  int
	snapLastTerm int

	// the snapshot covers indexes [1, snapLastIdx]
	snapshot []byte

	// tailLog[0] is a dummy entry at `snapLastIdx` that only carries snapLastTerm;
	// the entries in (snapLastIdx, snapLastIdx+len(tailLog)-1] hold real data
	tailLog []LogEntry
}
```

Snapshot creation method (`Raft.Snapshot`) with safety checks:
```go
func (rf *Raft) Snapshot(index int, snapshot []byte) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	LOG(rf.me, rf.currentTerm, DSnap, "Snap on %d", index)
	if index > rf.commitIndex {
		LOG(rf.me, rf.currentTerm, DSnap, "Couldn't snapshot before CommitIdx: %d>%d", index, rf.commitIndex)
		return
	}
	if index <= rf.log.snapLastIdx {
		LOG(rf.me, rf.currentTerm, DSnap, "Already snapshot in %d<=%d", index, rf.log.snapLastIdx)
		return
	}
	rf.log.doSnapshot(index, snapshot)
	// todo persist
}
```

The `doSnapshot` implementation updates the snapshot metadata and rebuilds the tail log after discarding the entries the snapshot now covers:
```go
func (rl *RaftLog) doSnapshot(index int, snapshot []byte) {
	// translate the logical index into a tailLog position first,
	// before snapLastIdx is overwritten below
	idx := rl.idx(index)

	rl.snapLastTerm = rl.tailLog[idx].Term
	rl.snapLastIdx = index
	rl.snapshot = snapshot

	// rebuild the tail log: a dummy entry carrying snapLastTerm,
	// followed by the entries after the snapshot point
	newLog := make([]LogEntry, 0, rl.size()-rl.snapLastIdx)
	newLog = append(newLog, LogEntry{Term: rl.snapLastTerm})
	newLog = append(newLog, rl.tailLog[idx+1:]...)
	rl.tailLog = newLog
}
```

The leader sends a snapshot when a follower's nextIndex falls behind the leader's snapshot point:
```go
func (rf *Raft) startReplication(term int) bool {
	// ...
	prevIdx := rf.nextIndex[peer] - 1
	if prevIdx < rf.log.snapLastIdx {
		args := &InstallSnapshotArgs{
			Term:              rf.currentTerm,
			LeaderId:          rf.me,
			LastIncludedIndex: rf.log.snapLastIdx,
			LastIncludedTerm:  rf.log.snapLastTerm,
			Snapshot:          rf.log.snapshot,
		}
		LOG(rf.me, rf.currentTerm, DDebug, "-> S%d, InstallSnap, Args=%v", peer, args.String())
		go installOnPeer(peer, term, args)
		continue
	}
	// ...
}
```

The follower handles the InstallSnapshot RPC with term checks, duplicate-snapshot detection, installation, and a signal to the apply goroutine:
```go
func (rf *Raft) InstallSnapshot(args *InstallSnapshotArgs, reply *InstallSnapshotReply) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	LOG(rf.me, rf.currentTerm, DDebug, "<- S%d, RecvSnap, Args=%v", args.LeaderId, args.String())

	// align the terms
	reply.Term = rf.currentTerm
	if args.Term < rf.currentTerm {
		LOG(rf.me, rf.currentTerm, DSnap, "<- S%d, Reject Snap, Higher Term, T%d>T%d", args.LeaderId, rf.currentTerm, args.Term)
		return
	}
	if args.Term >= rf.currentTerm {
		rf.becomeFollowerLocked(args.Term)
	}

	// ignore a snapshot that is no newer than the local one
	if rf.log.snapLastIdx >= args.LastIncludedIndex {
		LOG(rf.me, rf.currentTerm, DSnap, "<- S%d, Reject Snap, Already installed, Last: %d>=%d", args.LeaderId, rf.log.snapLastIdx, args.LastIncludedIndex)
		return
	}

	// install the snapshot and wake the apply goroutine
	rf.log.installSnapshot(args.LastIncludedIndex, args.LastIncludedTerm, args.Snapshot)
	rf.applyCond.Signal()
}
```

The apply loop distinguishes between normal log entries and a pending snapshot, sending the appropriate ApplyMsg and updating lastApplied and commitIndex:
```go
// inside applicationTicker()
if snapPendingApply {
	rf.applyCh <- ApplyMsg{
		SnapshotValid: true,
		Snapshot:      rf.log.snapshot,
		SnapshotIndex: rf.log.snapLastIdx,
		SnapshotTerm:  rf.log.snapLastTerm,
	}
} else {
	// send normal log entries
}
```

Persistence: after creating or installing a snapshot, the Raft state and the snapshot byte slice are saved together via `persister.Save`, so both survive crashes.
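The article doesn't show the persistence code itself, so here is only a sketch: a toy in-memory `Persister` with the `Save(raftstate, snapshot)` shape described above, with the standard library's `encoding/gob` standing in for the lab's `labgob`, and an illustrative `PersistedState` whose field names are assumptions rather than the author's.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Toy in-memory stand-in for the lab's Persister; the real one
// durably stores both blobs and returns them after a restart.
type Persister struct {
	raftstate []byte
	snapshot  []byte
}

// Save stores the Raft state and the snapshot together, atomically
// from the caller's point of view.
func (p *Persister) Save(raftstate, snapshot []byte) {
	p.raftstate = raftstate
	p.snapshot = snapshot
}

type LogEntry struct {
	Term    int
	Command string
}

// PersistedState holds the fields Raft must recover after a crash:
// the usual term/vote plus the snapshot metadata and the tail log.
// (Field names are illustrative assumptions.)
type PersistedState struct {
	CurrentTerm  int
	VotedFor     int
	SnapLastIdx  int
	SnapLastTerm int
	TailLog      []LogEntry
}

func encodeState(s PersistedState) []byte {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func decodeState(data []byte) PersistedState {
	var s PersistedState
	if err := gob.NewDecoder(bytes.NewReader(data)).Decode(&s); err != nil {
		panic(err)
	}
	return s
}

func main() {
	p := &Persister{}
	state := PersistedState{
		CurrentTerm: 4, VotedFor: 1,
		SnapLastIdx: 5, SnapLastTerm: 2,
		TailLog: []LogEntry{{Term: 2}, {Term: 3, Command: "put x=1"}},
	}
	// persist state and snapshot together, as the article describes
	p.Save(encodeState(state), []byte("kv-state-at-index-5"))

	// crash recovery: read both blobs back
	got := decodeState(p.raftstate)
	fmt.Println(got.SnapLastIdx, got.SnapLastTerm, string(p.snapshot))
	// prints "5 2 kv-state-at-index-5"
}
```

Saving the two blobs in one call matters: persisting them separately could leave a state on disk whose `snapLastIdx` points at a snapshot that was never written.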
Rare Earth Juejin Tech Community