Scaling Productivity on Microservices at Lyft – Part 1: History of Development and Testing Environments
This article details Lyft’s evolution from a monolithic PHP architecture to a micro‑service ecosystem. It describes the development and testing environments (Devbox, Onebox, and the pre‑release environment), the challenges of scaling them, and the strategies adopted to maintain productivity, including a Kubernetes migration and automated acceptance testing.
By the end of 2018, Lyft’s engineering team had finished splitting its original PHP monolith into a set of Python and Go micro‑services. Teams could operate independently and deploy rapidly, with hundreds of releases per day. The shift also introduced new concerns around language choice, service importance, and testing scalability.
The series is divided into four parts, beginning with the history of Lyft’s development and testing environments, followed by optimizations for faster local development, extensions to the service mesh for staging, and automated acceptance testing for release gating.
Part 1 – Development and Testing Environment History
In 2015, with about 100 engineers, Lyft relied on a single PHP monolith while a few micro‑services emerged for specific use cases. Anticipating growth, the team began building a Docker‑based container orchestration platform so that workloads could share infrastructure cost‑effectively.
Devbox – Local Development
Launched in early 2016, Devbox automated VM provisioning, package installation, and service startup, allowing developers to pull the latest image, create databases, and start an Envoy sidecar with a single command.
Onebox – Remote Development
Onebox runs Devbox on powerful EC2 instances (r3.4xlarge) to support multi‑service workloads, faster container image downloads, and integration testing on CI. It enables developers to define service dependencies in manifest.yaml and run temporary environments for PR testing.
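To make the idea of declared service dependencies concrete, here is a minimal sketch of how a tool might work out which services to start for a given service. It is not Lyft's implementation: the `MANIFESTS` mapping, the dependency edges in it, and the `resolve_dependencies` function are all hypothetical, and the mapping stands in for what each service's manifest.yaml might declare once parsed.

```python
from collections import deque

# Hypothetical, simplified stand-in for parsed manifest.yaml files:
# each service maps to the services it directly depends on.
# The edges below are illustrative assumptions, not Lyft's real graph.
MANIFESTS = {
    "api": ["driver_onboarding", "users"],
    "driver_onboarding": ["users", "mongodb"],
    "users": ["redis"],
    "mongodb": [],
    "redis": [],
}


def resolve_dependencies(service, manifests):
    """Return `service` plus all of its transitive dependencies.

    Services are listed in breadth-first order, each exactly once,
    so cycles in the dependency graph cannot cause an infinite loop.
    """
    seen = {service}
    order = []
    queue = deque([service])
    while queue:
        current = queue.popleft()
        order.append(current)
        for dep in manifests.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order


if __name__ == "__main__":
    # Everything a temporary environment would need to run "api".
    print(resolve_dependencies("api", MANIFESTS))
```

Even in this toy graph, starting one service pulls in four others. Every new edge can only lengthen the resolved list, which is why an environment built this way grows heavier as services are added.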
Integration Testing
Onebox’s cloud infrastructure makes it suitable for CI integration tests, but the growing number of services and dependencies has led to bloated test suites and longer execution times.
name: api
type: service
groups:
  - name: integration
    members:
      - driver_onboarding
      - users
tests:
  - name: integration
    group: integration
Pre‑release Environment
Lyft’s pre‑release environment mirrors production without real data, serving as the final staging ground before deployment. It also supports large‑scale traffic simulation to uncover bottlenecks during peak events.
To address scaling challenges, Lyft migrated its development environments to Kubernetes, re‑architecting Devbox, Onebox, and integration testing to be more sustainable for hundreds of micro‑services.
The article outlines three key workflows that must be supported:
Fast local development with quick unit tests and service launches.
Manual end‑to‑end testing in an isolated environment.
Automated acceptance testing for continuous delivery.
Future posts will dive deeper into each workflow, discussing tools, challenges, and lessons learned.
# 2013 (monolith), duration: 1 minute
def test_driver_approval():
    """
    Requires:
    - api
    """
    user = get_user()
    approve_driver(user)
    assert user.is_approved

# ------------------------------------------------------------
# 2015 (mostly monolithic, a few services), duration: 3 minutes
def test_driver_approval():
    """
    Requires:
    - api (monolith)
    - users
    - mongodb
    - driver_onboarding
    - redis
    """
    user = user_service.create_user()
    user = driver_onboarding_service.approve_driver(user)
    assert user.is_approved

# ------------------------------------------------------------
# 2018 (post‑decomp, microservices), duration: 20 minutes
def test_driver_approval__california():
    """
    Requires:
    - users
    - redis
    - experimentation
    - fraud
    - dynamodb
    - messaging
    - mongodb
    - driver_onboarding
    - email
    - dmv_checks
    - vehicles
    - payments
    """
    user = user_service.create_user()
    user = driver_onboarding_service.approve_driver(user)
    assert user.is_approved
Overall, Lyft’s journey illustrates the complexities of scaling a micro‑service architecture and the necessity of evolving development tooling, container orchestration, and testing strategies to sustain engineering productivity.