
Facebook’s Scalable Continuous Delivery System: From Cherry‑Pick to Near Real‑Time Deployment

This article explains how Facebook engineered a massive‑scale continuous delivery pipeline—evolving from manual cherry‑pick releases to a near‑continuous deployment system that safely pushes hundreds of code changes per hour to both web and mobile users worldwide.

DevOps Cloud Academy
The software industry has adopted practices such as continuous integration, continuous delivery, agile development, DevOps, and test‑driven development to deliver code faster, safer, and with higher quality. All these methods share the goal of enabling developers to release small, incremental changes quickly and correctly.

Facebook’s development and deployment processes, referenced in chapters 7 and 12 of Continuous Delivery 2.0, are presented here in detail, illustrating their evolution and the ecosystem of rapid‑iteration techniques that power both web and mobile products.

Before 2017, Facebook used a simple trunk‑based branching strategy with cherry‑picks: the website was pushed three times a day, 500‑700 changes were cherry‑picked daily, and a new release branch was cut weekly to collect changes not yet cherry‑picked.

This system scaled from a handful of engineers in 2007 to thousands, and the speed of code delivery grew with team size. However, as the team expanded, manually coordinating daily and weekly deployments became unsustainable, with over 1,000 diffs per day and up to 10,000 diffs per week.

In April 2016 Facebook introduced a “near‑continuous deployment” mechanism for facebook.com, gradually rolling out to 0.1%, 1%, and 10% of traffic to validate tools and processes while ensuring the new system did not degrade user experience.

After almost a year of planning and development, by April 2017 all production servers were running code directly from the trunk, as illustrated in the visual from Continuous Delivery 2.0 (see Figure 7‑16).

The deployment workflow includes:

Automated internal test suites must pass before code can be committed to the trunk.

The code diff is pushed to internal users.

If everything is fine, the change is rolled out to 2% of production, quality signals are collected, and alerts are monitored.

Finally, the change is released to 100% of production, with the Flytrap tool aggregating user reports and alerting on anomalies.
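The staged rollout above (internal users, then 2% of production, then 100%, gated by quality signals) can be sketched as a simple state machine. This is an illustrative sketch, not Facebook's actual tooling; the stage names, error‑rate threshold, and `healthy()` helper are assumptions for the example.

```python
# Hypothetical sketch of a staged rollout gated by quality signals.
# Stages and the 1% error-rate threshold are illustrative assumptions.
STAGES = [
    ("internal", 0.0),   # employees only, no external traffic
    ("canary", 0.02),    # 2% of production servers
    ("full", 1.0),       # 100% of production
]

def healthy(error_rate: float, threshold: float = 0.01) -> bool:
    """Quality-signal check: a stage passes only if errors stay below threshold."""
    return error_rate < threshold

def roll_out(observed_error_rates: dict) -> tuple:
    """Advance stage by stage, halting at the first unhealthy signal.

    Returns (last stage reached, whether the rollout completed).
    """
    last = None
    for stage, _fraction in STAGES:
        last = stage
        if not healthy(observed_error_rates.get(stage, 0.0)):
            return last, False  # halt here; earlier stages stay deployed
    return last, True
```

A healthy run walks all the way to `("full", True)`; an elevated error rate at the canary stage halts the push at `("canary", False)` before the change reaches the remaining 98% of servers.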

Facebook’s Gatekeeper feature‑flag system separates feature releases from code versioning, allowing problematic features to be disabled instantly without a full rollback.
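The key property of a Gatekeeper‑style flag is that the code for a feature can ship dark and be switched off instantly, with no redeploy. A minimal in‑memory sketch follows; the class name, flag name, and API are illustrative assumptions (Facebook's real system also targets flags by user, region, and other dimensions).

```python
# Minimal sketch of a feature-flag system in the Gatekeeper style.
# Names and API are illustrative, not Facebook's actual interface.
class FeatureFlags:
    def __init__(self):
        self._flags = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        # Unknown flags default to off, so code can ship dark safely.
        return self._flags.get(name, False)

flags = FeatureFlags()
flags.set("new_composer", True)

def render_composer() -> str:
    # The code path exists in production either way; the flag picks one.
    if flags.is_enabled("new_composer"):
        return "new composer"
    return "legacy composer"

# Flipping the flag disables the feature instantly -- no rollback needed.
flags.set("new_composer", False)
```

Because the decision happens at runtime, reverting a problematic feature is a configuration change rather than a code deployment.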

Benefits of this near‑continuous release model include eliminating hot‑fixes, supporting global engineering teams with flexible rollout times, driving improvements in build, diff‑review, testing, capacity management, and traffic routing tools, and ultimately delivering a better, faster user experience.

Applying similar techniques to mobile, Facebook built a dedicated stack (Buck, Phabricator, React Native, Infer) and a three‑layer CI pipeline: build, static analysis, and testing. Each commit triggers builds for multiple products (Facebook, Messenger, Instagram, etc.) across all supported architectures, runs linters and Infer for code‑quality checks, and executes thousands of unit, integration, and end‑to‑end tests (Robolectric, XCTest, JUnit, WebDriver). Android alone sees 50,000‑60,000 builds per day.
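The three‑layer gate described above can be sketched as a short‑circuiting pipeline: each commit must pass build, then static analysis, then tests, and a failure at any layer stops the run. The stage functions below are placeholders, not real Buck or Infer invocations.

```python
# Illustrative sketch of a three-layer CI gate: build -> static analysis -> tests.
# Stage callables are placeholders for the real tool invocations.
def run_pipeline(commit: str, stages: list) -> list:
    """Run each (name, stage_fn) in order; stop at the first failure."""
    results = []
    for name, stage_fn in stages:
        ok = stage_fn(commit)
        results.append((name, ok))
        if not ok:
            break  # later layers never run on a failed commit
    return results

# Hypothetical stage implementations; in practice these would shell out
# to Buck (builds per product/architecture), linters + Infer, and the
# unit/integration/end-to-end test runners.
default_stages = [
    ("build", lambda c: True),
    ("static-analysis", lambda c: True),
    ("tests", lambda c: True),
]
```

Short‑circuiting matters at this volume: with tens of thousands of builds a day, skipping analysis and tests for commits that fail to build saves substantial machine time.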

Mobile release cadence improved from a four‑week cycle to a one‑week cycle, using the same branch/cherry‑pick model and canary releases to 1% of users, with a beta pool of about one million Android devices.

Despite a fifteen‑fold increase in the mobile engineering team and higher deployment frequency, productivity (lines of code per push) and defect rates remained stable, demonstrating that scaling did not compromise code quality.

Facebook’s central release engineering team continues to push improvements, sharing tools and best practices to enhance both developer and customer experiences.

Original Author: Chuck Rossi (Facebook Release Engineering Manager)

Original Title: Rapid release at massive scale

Publication Date: Aug 31, 2017
