
Understanding Git Internals: Objects, Trees, Commits, Tags, and Packfiles

This article explains how Git stores data: the layout of the .git directory, the four object types (blob, tree, commit, tag), how objects are hashed and organized on disk, and how Git packs objects to save space. Along the way it answers why a second commit of a modified file stores the full file rather than a diff.

Xueersi Online School Tech Team

Git is a powerful tool used daily in software development, and this article explores its internal mechanisms.

When a file is modified and committed, Git does not store a diff; it records the complete new contents as a blob object. The workflow has three stages: files are edited in the working directory, git add writes their contents as blobs and records them in the index (staging area), and git commit creates a commit object that records the resulting tree, the author, and the message.
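To make the "complete file as a blob" point concrete, here is a minimal Python sketch of how a blob's ID is derived, assuming the default SHA-1 object format. The helper name blob_id is hypothetical and not part of Git; it mirrors what git hash-object computes for a file's contents.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>" plus a NUL byte) followed by
    # the full file contents; no diff against earlier versions is involved.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two versions of the same file produce two independent, complete blobs.
v1 = blob_id(b"version 1\n")
v2 = blob_id(b"version 2\n")
print(v1)
print(v2)
print(v1 != v2)
```

Because the ID depends only on the contents, two files with identical contents share one blob, while any edit, however small, produces a new one.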

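Running git init creates a .git directory containing configuration, hooks, and, most importantly, the objects and refs directories and the HEAD file; the index file appears once something has been staged. The objects directory stores data under a two-level, hash-based path, e.g. .git/objects/ab/cdef... : the first two hexadecimal characters of the object's hash name the subdirectory and the remaining characters name the file. This keeps any single directory from accumulating too many entries, which some filesystems limit or handle slowly.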
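The two-level layout can be sketched in Python as follows. This is an illustration under stated assumptions, not Git's implementation: it assumes SHA-1 object IDs and zlib-compressed loose objects, writes into a temporary directory rather than a real repository, and the function name write_loose_object is hypothetical.

```python
import hashlib
import os
import tempfile
import zlib


def write_loose_object(objects_dir: str, obj_type: str, content: bytes) -> str:
    """Store content as a loose object under objects_dir and return its path."""
    # The stored form is "<type> <size>\0" followed by the content.
    store = obj_type.encode("ascii") + b" " + str(len(content)).encode("ascii") + b"\0" + content
    oid = hashlib.sha1(store).hexdigest()
    # First two hex characters: subdirectory. Remaining 38: file name.
    path = os.path.join(objects_dir, oid[:2], oid[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    if not os.path.exists(path):  # identical content is stored only once
        with open(path, "wb") as f:
            f.write(zlib.compress(store))
    return path


with tempfile.TemporaryDirectory() as tmp:
    path = write_loose_object(tmp, "blob", b"version 1\n")
    print(os.path.relpath(path, tmp))
```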

Git defines four object types:

Blob: stores file contents only. Example:

$ echo 'version 1' | git hash-object -w --stdin
83baae61804e65cc73a7201a7252750c76066a30

Tree: records directory structure. Each entry holds a mode, a name, and the hash of a blob (a file) or of another tree (a subdirectory). Created with git write-tree after staging files.

Commit: points to a top-level tree and to zero or more parent commits, and records the author, committer, timestamps, and message. Created with the plumbing command git commit-tree or the higher-level git commit.

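Tag: an annotated tag object names another object, usually a commit, and records the tagger, date, and message. Created with git tag -a v1.0.0 -m "test tag". (A lightweight tag, created without -a, is only a ref and creates no object.)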
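The object types in the list above share one envelope: a "<type> <size>" header, a NUL byte, then a type-specific body. The Python sketch below builds a tree body and a commit body by hand to show how they reference each other. It assumes SHA-1 IDs and a flat tree containing only regular files (Git's ordering rule for subdirectories is not handled); the function names and the identity string are hypothetical, chosen only for illustration.

```python
import hashlib


def object_id(obj_type: str, body: bytes) -> str:
    """Hash an object body together with its '<type> <size>' header."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """Serialize (mode, name, hex_oid) entries; files only, sorted by name."""
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1].encode("utf-8")):
        # Each entry: "<mode> <name>", a NUL byte, then the 20-byte binary ID.
        body += f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(oid)
    return body


def commit_body(tree: str, parents, author: str, committer: str, message: str) -> bytes:
    """Serialize a commit: tree line, parent lines, identities, blank line, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}"]
    return ("\n".join(lines) + "\n\n" + message).encode("utf-8")


# Hypothetical identity and timestamp, for illustration only.
who = "A U Thor <author@example.com> 1700000000 +0000"
blob = object_id("blob", b"version 1\n")
tree = tree_body([("100644", "test.txt", blob)])
tree_oid = object_id("tree", tree)
commit = commit_body(tree_oid, [], who, who, "first commit\n")
print(tree_oid)
print(commit.decode("utf-8"))
```

The blob carries no file name; the name and mode live in the tree entry, and the commit refers to the tree only by its ID.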

Each object can be inspected with git cat-file (e.g., git cat-file -p <hash> to view contents, git cat-file -t <hash> to view type).
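As a rough picture of what such an inspection involves for a loose object, the following Python sketch decompresses the stored bytes and splits the header from the body. It is not how git cat-file is implemented and it ignores packed objects; parse_loose_object is a hypothetical helper, and the sample bytes are built in memory rather than read from a repository.

```python
import zlib


def parse_loose_object(raw: bytes):
    """Return (type, body) for the zlib-compressed bytes of a loose object."""
    data = zlib.decompress(raw)
    # The header "<type> <size>" ends at the first NUL byte.
    header, _, body = data.partition(b"\0")
    obj_type, size = header.split(b" ")
    if int(size) != len(body):
        raise ValueError("size in header does not match body length")
    return obj_type.decode("ascii"), body


# In-memory stand-in for the contents of a file under .git/objects/.
raw = zlib.compress(b"blob 10\0version 1\n")
obj_type, body = parse_loose_object(raw)
print(obj_type)  # analogous to: git cat-file -t <hash>
print(body)      # analogous to: git cat-file -p <hash>, for a blob
```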

Git initially stores objects as loose files, each compressed individually. When many loose objects accumulate, or when git gc is run, Git packs them into a binary packfile (.git/objects/pack/pack-*.pack) along with an index file (.idx) that maps object hashes to their positions in the pack. Packing reduces disk usage by delta-compressing similar objects; typically the newest version is stored in full for fast access, while older versions are stored as deltas against it.
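A packfile begins with a fixed 12-byte header: the signature PACK, a version number, and the number of objects, the last two as big-endian 32-bit integers. The Python sketch below reads only that header and does not decode the objects or deltas that follow. The function name is hypothetical, and the sample header is synthetic rather than taken from a real pack.

```python
import struct


def parse_pack_header(data: bytes):
    """Return (version, object_count) from the first 12 bytes of a packfile."""
    if len(data) < 12:
        raise ValueError("pack header needs at least 12 bytes")
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("not a packfile: bad signature")
    return version, count


# Synthetic header for illustration: version 2, three objects.
sample = struct.pack(">4sII", b"PACK", 2, 3)
version, count = parse_pack_header(sample)
print(version, count)
```

Against a real repository, the same function could be given the first 12 bytes of a file under .git/objects/pack/.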

Running git verify-pack -v .git/objects/pack/pack-*.idx shows the size of each packed object and, for objects stored as deltas, the delta depth and the base object.
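The per-object lines of that listing have five columns (object ID, type, size, size in the packfile, offset in the packfile), plus two more for deltified objects (delta depth and base object ID). A small Python parser, sketched under those assumptions, makes the delta relationships easy to query. The PackEntry type, the function name, and the sample text are all hypothetical; the IDs are placeholders and the numbers are not from a real repository.

```python
import re
from typing import List, NamedTuple, Optional


class PackEntry(NamedTuple):
    oid: str
    obj_type: str
    size: int
    size_in_pack: int
    offset: int
    depth: Optional[int]  # None when the object is stored in full
    base: Optional[str]   # base object ID when stored as a delta


# Accept 40-character (SHA-1) or 64-character (SHA-256) hex IDs.
_OID = re.compile(r"[0-9a-f]{40}(?:[0-9a-f]{24})?")


def parse_verify_pack(output: str) -> List[PackEntry]:
    """Parse the per-object lines of verify-pack -v output, skipping the rest."""
    entries = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) not in (5, 7) or not _OID.fullmatch(fields[0]):
            continue  # summary lines such as "non delta: ..." are ignored
        size, size_in_pack, offset = (int(x) for x in fields[2:5])
        is_delta = len(fields) == 7
        entries.append(PackEntry(
            fields[0], fields[1], size, size_in_pack, offset,
            int(fields[5]) if is_delta else None,
            fields[6] if is_delta else None,
        ))
    return entries


# Placeholder IDs and made-up sizes, for illustration only.
SAMPLE = "\n".join([
    "a" * 40 + " blob   20000 5000 12",
    "b" * 40 + " blob   10 20 5012 1 " + "a" * 40,
    "non delta: 1 object",
])
for entry in parse_verify_pack(SAMPLE):
    kind = "full" if entry.base is None else "delta of " + entry.base[:7]
    print(entry.oid[:7], entry.obj_type, entry.size_in_pack, kind)
```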

In summary, a second commit of a modified file stores the complete new file as a new blob. Git may later pack this blob together with others, typically keeping the older version as a delta against the newer one to save space.
