Microsoft's Largest Git Repository and the Role of GVFS
The article describes how Microsoft migrated its Windows development codebase to a single 300 GB Git repository containing 3.5 million files, the challenges this scale created for Git, and how the Git Virtual File System (GVFS) was developed and open‑sourced to address performance issues.
Brian Harry, Microsoft's Vice President for Cloud Developer Services, leads the Team Services product (also available on‑premises as Team Foundation Server), which supports all of Microsoft's internal development teams and serves as the company's ADLM system. Unlike many managers, he frequently writes technical blog posts.
In late May, Harry published a blog post titled “The largest Git repo on the planet,” announcing that the Windows development team had migrated its codebase to Git. The result is claimed to be the world’s largest single Git repository, with about 3.5 million files and roughly 300 GB of data, and it is now used by over 3,500 of the team's 4,000‑plus engineers.
In the four months after the migration, the repository recorded more than 250,000 commits. On average it saw 8,421 pushes per day and about 2,500 pull requests with 6,600 code reviews per workday, while maintaining 4,352 active branches and triggering roughly 1,760 official builds daily.
Because Git was not originally designed for repositories of this magnitude, Microsoft created the Git Virtual File System (GVFS) to solve performance problems; GVFS was demonstrated at Microsoft’s Build conference, open‑sourced on GitHub, and is being added to the Git for Windows client with upcoming support for Linux and macOS.
Microsoft loves Git :-)
Harry also answered several technical questions from the audience:
When migrating from Source Depot to Git, historical commits were not transferred; the old SD archive must be consulted for legacy history.
With GVFS, Git still functions as a distributed version‑control system, but files are virtualized and fetched on demand, so working with a file that has not yet been fetched requires network connectivity.
Large monolithic repos can blur subsystem boundaries and create undesirable dependencies. Microsoft generally prefers many small repos, but the Windows OS codebase is too intertwined to split, so it is kept in one large repo, much as Google and Facebook do with their code.
Although Google’s Android repo is large, it is not a single Git repository; Windows’ repo remains the largest monolithic Git repo.
The massive repo does increase full rebuild times, but the team mitigates this with parallel builds and caching, and plans to write a dedicated blog post on build‑time optimization.
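The on-demand fetching mentioned in the GVFS answer above can be pictured with a small toy model. This is only a sketch under stated assumptions, not how GVFS itself is implemented: the class VirtualWorkingTree, the function fake_remote, and the file names are all hypothetical and exist purely for illustration. The idea shown is that every file is listed up front, contents are downloaded the first time a file is read, and files that were already read stay usable offline.

```python
from typing import Callable, Dict, Iterable, List


class VirtualWorkingTree:
    """Toy model of on-demand file fetching (not the real GVFS design)."""

    def __init__(self, paths: Iterable[str], fetch: Callable[[str], bytes]) -> None:
        self._paths = set(paths)              # every file name is known up front
        self._fetch = fetch                   # hypothetical network download
        self._local: Dict[str, bytes] = {}    # contents already stored locally
        self.online = True                    # flip to False to simulate no network

    def listdir(self) -> List[str]:
        # Listing needs no network access: only the names are required.
        return sorted(self._paths)

    def read(self, path: str) -> bytes:
        if path not in self._paths:
            raise FileNotFoundError(path)
        if path not in self._local:
            # First access: the contents must come from the server.
            if not self.online:
                raise ConnectionError(f"{path} has not been fetched yet")
            self._local[path] = self._fetch(path)
        return self._local[path]

    def hydrated(self) -> List[str]:
        # Files whose contents are available without the network.
        return sorted(self._local)


def fake_remote(path: str) -> bytes:
    # Stand-in for a server round trip (assumption for this sketch).
    return f"contents of {path}".encode()


tree = VirtualWorkingTree(["src/a.c", "src/b.c", "README.txt"], fake_remote)
tree.read("README.txt")             # first read triggers a fetch
tree.online = False                 # simulate losing connectivity
print(tree.read("README.txt"))      # still readable: already local
try:
    tree.read("src/a.c")            # never fetched, so this fails offline
except ConnectionError as err:
    print("offline:", err)
```

In this model only the files a developer actually touches are ever downloaded, which is why a very large repository can stay workable, and also why connectivity matters for files that have not been opened before.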
Microsoft is increasingly open about its tooling research; a related article titled “Git at scale: Technical Scale Challenges” discusses the company’s considerations and solutions for operating massive Git repositories.