Understanding Git Internals: Repository Structure, Objects, and Core Commands
This article explains how Git works internally by describing the .git directory layout, the creation of blob, tree and commit objects during init, add, and commit operations, and how branches, HEAD, remote configuration, reflog, and diff commands manage version history.
Git is a distributed version‑control system whose behavior becomes clear once the internal structure of a repository is understood. Initialising a repository with git init creates a .git directory containing the files HEAD and config alongside sub‑directories such as objects, refs, hooks, and info.
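To make the layout concrete, here is a minimal Python sketch that imitates the skeleton git init lays down. It is an illustration, not Git's own implementation: the function name init_skeleton and the config values are assumptions, and real Git writes additional files (for example description and sample hooks).

```python
import tempfile
from pathlib import Path


def init_skeleton(root: Path, branch: str = "master") -> Path:
    """Create a bare-bones imitation of the layout `git init` produces."""
    git_dir = root / ".git"
    # Sub-directories: the object store, branch and tag refs, hooks, info.
    for sub in ("objects", "refs/heads", "refs/tags", "hooks", "info"):
        (git_dir / sub).mkdir(parents=True, exist_ok=True)
    # HEAD is a plain file holding a symbolic ref to the (not yet born) branch.
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{branch}\n")
    # config is also a plain file; these are typical core settings (assumed).
    (git_dir / "config").write_text(
        "[core]\n\trepositoryformatversion = 0\n\tfilemode = true\n\tbare = false\n"
    )
    return git_dir


with tempfile.TemporaryDirectory() as tmp:
    git_dir = init_skeleton(Path(tmp))
    for entry in sorted(git_dir.iterdir()):
        print("dir " if entry.is_dir() else "file", entry.name)
```

Running it lists HEAD and config as files and the rest as directories, which matches the distinction drawn above.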
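The .git/objects directory stores the objects that make up history: blob (file contents), tree (directory listings), and commit (snapshot metadata), plus a fourth type, tag, used for annotated tags. When git add is run, Git computes a hash (SHA‑1 by default) over a short header followed by the file content, writes a compressed blob object named after that hash, and records the path, mode, and blob hash in the index file (.git/index).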
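The hashing step can be reproduced outside Git. The sketch below assumes a SHA‑1 repository and the loose-object format (a header of the form "blob <size>" plus a NUL byte, then the content, zlib-compressed on disk). The helper names blob_id and write_loose_object are illustrative, not part of any Git API.

```python
import hashlib
import zlib
from pathlib import Path


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def write_loose_object(git_dir: Path, kind: bytes, body: bytes) -> str:
    """Store an object the way loose objects are laid out on disk."""
    store = kind + b" %d\x00" % len(body) + body
    oid = hashlib.sha1(store).hexdigest()
    # The first two hex digits name the directory, the remaining 38 the file.
    path = git_dir / "objects" / oid[:2] / oid[2:]
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(zlib.compress(store))
    return oid


print(blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the ID depends only on the content, two files with identical bytes share one blob regardless of their names; the filename lives in the index and later in a tree object.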
Executing git commit writes tree objects that reference the staged blobs (one tree per directory), then a commit object that points to the top-level tree and records parent commits, author, committer, and message. The new commit hash becomes the tip of the current branch. .git/HEAD normally holds a symbolic reference to that branch; in a detached-HEAD state it holds a commit hash directly.
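As a rough illustration of how a commit chains to a tree and the tree to blobs, the following sketch builds the raw bodies of both objects and hashes them. It assumes SHA‑1 and a flat directory containing only files (Git's ordering rule for sub-directories is not handled). The identity string and the helper names are hypothetical.

```python
import hashlib


def object_id(kind: bytes, body: bytes) -> str:
    """Hash an object body with Git's '<type> <size>' + NUL header."""
    return hashlib.sha1(kind + b" %d\x00" % len(body) + body).hexdigest()


def tree_body(entries: list) -> bytes:
    """entries: (mode, name, hex object id) tuples for files in one directory."""
    body = b""
    for mode, name, oid in sorted(entries, key=lambda e: e[1].encode()):
        # Each entry is '<mode> <name>', a NUL byte, then the 20 raw hash bytes.
        body += mode.encode() + b" " + name.encode() + b"\x00" + bytes.fromhex(oid)
    return body


def commit_body(tree: str, parents: list, ident: str, message: str) -> bytes:
    """Assemble the text of a commit object (no GPG signature, no extras)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {ident}", f"committer {ident}", "", message]
    return ("\n".join(lines) + "\n").encode()


readme = "3b18e512dba79e4c8300dd08aeb37f8e728b8dad"  # blob for b"hello world\n"
tree = object_id(b"tree", tree_body([("100644", "README", readme)]))
ident = "A U Thor <author@example.com> 1700000000 +0000"  # hypothetical identity
commit = object_id(b"commit", commit_body(tree, [], ident, "initial commit"))
print("tree  ", tree)
print("commit", commit)
```

The commit hash covers the tree hash, the parents, the identities, and the message, so changing any one of them yields a different commit ID.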
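Branches are lightweight pointers stored under .git/refs/heads. Creating a branch with git branch adds a new file containing the commit hash; switching branches updates HEAD and brings the index and working tree in line with the target commit. Deleting a branch removes its pointer but leaves the underlying objects in place, although objects that become unreachable may eventually be pruned by garbage collection.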
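Since a branch is just a small file, the pointer mechanics can be sketched with ordinary file operations. The example below is a simplification under stated assumptions: it only handles loose refs (refs that Git has packed into .git/packed-refs are ignored), it does not touch the index or working tree, and the function names and the placeholder commit ID are made up for illustration.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def create_branch(git_dir: Path, name: str, commit: str) -> None:
    """A branch is a file under refs/heads containing one commit hash."""
    ref = git_dir / "refs" / "heads" / name
    ref.parent.mkdir(parents=True, exist_ok=True)
    ref.write_text(commit + "\n")


def switch_head(git_dir: Path, name: str) -> None:
    """Point HEAD at a branch (the pointer update only)."""
    (git_dir / "HEAD").write_text(f"ref: refs/heads/{name}\n")


def resolve_head(git_dir: Path) -> Tuple[Optional[str], Optional[str]]:
    """Return (branch or None if detached, commit or None if unborn)."""
    head = (git_dir / "HEAD").read_text().strip()
    if not head.startswith("ref: "):
        return None, head  # detached HEAD: the file holds a commit hash
    ref = head[len("ref: "):]
    ref_file = git_dir / ref
    commit = ref_file.read_text().strip() if ref_file.exists() else None
    return ref[len("refs/heads/"):], commit


with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    git_dir.mkdir()
    commit = "a" * 40  # placeholder, not a real commit
    create_branch(git_dir, "master", commit)
    create_branch(git_dir, "feature", commit)
    switch_head(git_dir, "feature")
    print(resolve_head(git_dir))
    switch_head(git_dir, "master")
    (git_dir / "refs" / "heads" / "feature").unlink()  # deleting = removing the pointer
    print(sorted(p.name for p in (git_dir / "refs" / "heads").iterdir()))
```

Nothing in the object store is touched by any of these steps, which is why creating and deleting branches is cheap.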
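Remote repositories are linked via entries in .git/config under a [remote "origin"] section. Pushing with git push -u origin master transfers new objects, updates the branch on the remote, and updates the local remote-tracking ref .git/refs/remotes/origin/master. The -u flag also records origin/master as the upstream of the local branch in a [branch "master"] section of .git/config.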
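For a sense of what those config entries look like, the sketch below parses a sample config with Python's configparser. Two caveats: Git's config syntax is only INI-like, so configparser is a convenient approximation rather than a faithful parser, and the URL and the read_remotes helper are invented for the example.

```python
import configparser

# Sample of what .git/config might contain after adding a remote and
# pushing with -u; the URL is a placeholder.
SAMPLE_CONFIG = """\
[core]
\trepositoryformatversion = 0
[remote "origin"]
\turl = https://example.com/demo.git
\tfetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
\tremote = origin
\tmerge = refs/heads/master
"""


def read_remotes(config_text: str) -> dict:
    """Collect the settings of every [remote "<name>"] section."""
    # interpolation=None keeps '%' literal; strict=False tolerates repeated keys.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    remotes = {}
    for section in parser.sections():
        if section.startswith('remote "') and section.endswith('"'):
            name = section[len('remote "'):-1]
            remotes[name] = dict(parser[section])
    return remotes


for name, settings in read_remotes(SAMPLE_CONFIG).items():
    print(name, settings["url"], settings["fetch"])
```

The fetch line is the refspec that maps branches on the remote to remote-tracking refs under refs/remotes/origin.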
Git also records updates to HEAD and to local branch refs in reflog files under .git/logs, allowing recovery of lost commits. Commands such as git reflog, git diff, and git log rely on these internal structures to display changes, compare snapshots, and trace history.
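Reflog files are plain text, one line per ref update, so they can be read without Git. The parser below is a sketch that assumes the usual line shape (old hash, new hash, identity, timestamp, timezone, a tab, then a message). The sample hashes and identity are placeholders, and lines that do not match this shape are skipped.

```python
import re
from typing import List, NamedTuple


class ReflogEntry(NamedTuple):
    old: str
    new: str
    who: str
    timestamp: int
    tz: str
    message: str


LINE = re.compile(
    r"^([0-9a-f]{40}) ([0-9a-f]{40}) (.*) (\d+) ([+-]\d{4})\t(.*)$"
)


def parse_reflog(text: str) -> List[ReflogEntry]:
    """Parse reflog text (oldest entry first, as stored on disk)."""
    entries = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            entries.append(
                ReflogEntry(m[1], m[2], m[3], int(m[4]), m[5], m[6])
            )
    return entries


WHO = "A U Thor <author@example.com>"  # hypothetical identity
SAMPLE = (
    f"{'0' * 40} {'a' * 40} {WHO} 1700000000 +0000\tcommit (initial): first\n"
    f"{'a' * 40} {'b' * 40} {WHO} 1700000100 +0000\tcommit: second\n"
)

for entry in parse_reflog(SAMPLE):
    print(entry.new[:7], entry.message)
```

Each line keeps the hash a ref pointed at before the update, which is what makes it possible to find a commit again after a branch has been moved away from it.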
IT Architects Alliance
Discussion and exchange on system, internet, large‑scale distributed, high‑availability, and high‑performance architectures, as well as big data, machine learning, AI, and architecture adjustments with internet technologies. Includes real‑world large‑scale architecture case studies. Open to architects who have ideas and enjoy sharing.