Youzan Testing Environment: Service Chain Isolation and Operational Practices
Youzan created a cost-effective multi-project testing environment by introducing a weakly isolated “service-chain” that propagates identifiers across RPC and REST calls, standardizing entry/exit points, automating provisioning, and integrating the isolated environments into CI/CD pipelines through cross-team collaboration and tooling.
In fast‑moving e‑commerce companies, the importance of a reliable testing environment cannot be overstated. This article shares Youzan's experience in building efficient, multi‑project testing environments, reducing machine costs, and establishing a standardized environment system through collaboration among framework, operations, development, testing, and product teams.
1. Background of Youzan Testing Environments
Youzan has operated several environments over time: dev (now deprecated), daily, qa, perf, and pre. The perf environment is dedicated to performance testing.
2. Principles of Multi‑Environment Implementation (Service Chain)
Youzan introduced a “service chain” (SC) solution to achieve weak isolation across environments, addressing resource contention and reducing operational machine costs.
2.1 Full‑Link Identifier Transmission (Weak Isolation)
The SC scheme transmits a service‑chain identifier through the entire call chain. Two main isolation approaches are compared:
Strong isolation: High isolation and low application intrusion, but high operational cost.
Weak isolation: Identifier transmission reduces machine cost, but incurs higher application intrusion.
Youzan chose weak isolation because its growing number of vertical businesses rely heavily on shared underlying services.
The design set four goals:
a. Isolate only applications that require changes.
b. Transmit full‑link identifiers to support future stress testing.
c. Keep application intrusion low, delegating most responsibilities to the framework and middleware.
d. Provide a convenient platform for creating isolated SC environments.
By default, a request follows the “basic environment”, which is the standard chain. At each hop, the request is routed to the “SC environment” instead when a node matching its SC identifier is found. Identifier transmission starts at the web request and propagates through all downstream services.
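The identifier transmission described above can be sketched roughly as follows. This is a minimal illustration, not Youzan's framework code: the header name `X-Service-Chain` and the functions `extract_sc`, `inject_sc`, and `handle_request` are hypothetical names chosen for the example.

```python
import contextvars

# Hypothetical header name; the article does not say what the real one is.
SC_HEADER = "X-Service-Chain"

# Holds the service-chain identifier for the request currently being handled.
_current_sc = contextvars.ContextVar("current_sc", default=None)


def extract_sc(incoming_headers):
    """Read the identifier from an incoming request and remember it."""
    sc = incoming_headers.get(SC_HEADER)
    _current_sc.set(sc)
    return sc


def inject_sc(outgoing_headers):
    """Copy the remembered identifier onto an outgoing request, if any."""
    headers = dict(outgoing_headers)
    sc = _current_sc.get()
    if sc is not None:
        headers[SC_HEADER] = sc
    return headers


def handle_request(incoming_headers):
    """Simulate one service: receive a request, then build a downstream call."""
    extract_sc(incoming_headers)
    return inject_sc({"Accept": "application/json"})
```

Because extraction and injection live in shared helpers rather than in business code, a sketch like this matches the stated goal of keeping application intrusion low and leaving the work to the framework and middleware.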
2.2 Service Chain Routing
SC identifiers must be passed through all entry points, regardless of protocol or framework. Two main protocols are covered:
RPC calls: Framework modifications enable RPC SC routing. Service information with SC identifiers is written to etcd via a web publishing platform, allowing RPC routers to forward the identifier.
REST calls: A unified domain naming convention (e.g., http://A.qa.xxx.com:port/) is enforced. SC information is stored in etcd, and a dedicated sc‑nginx0 machine performs fuzzy matching of service names to route requests. If routing fails, a fallback sc‑nginx1 handles the request.
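The routing decision for both protocols can be sketched as a lookup that prefers an SC instance and otherwise falls back to the basic environment. Everything here is an assumption made for illustration: the `REGISTRY` dictionary stands in for data read from etcd, the service names and addresses are invented, and a simple first-label match on the host name replaces the fuzzy matching and the sc‑nginx0/sc‑nginx1 pair mentioned above.

```python
# Hypothetical registry snapshot, standing in for what would be read from etcd.
# Each instance is tagged with an SC identifier, or None for the basic environment.
REGISTRY = {
    "trade": [
        {"addr": "10.0.0.1:8080", "sc": None},
        {"addr": "10.0.0.2:8080", "sc": "sc-project-a"},
    ],
    "user": [
        {"addr": "10.0.1.1:8080", "sc": None},
    ],
}


def service_from_host(host):
    """Take the service name from a host such as 'trade.qa.xxx.com:8080'."""
    return host.split(":")[0].split(".")[0].lower()


def pick_instance(service, sc, registry=REGISTRY):
    """Prefer an instance tagged with sc; otherwise use the basic environment."""
    instances = registry.get(service, [])
    if sc is not None:
        for inst in instances:
            if inst["sc"] == sc:
                return inst["addr"]
    for inst in instances:
        if inst["sc"] is None:
            return inst["addr"]
    return None  # service unknown to the registry


def route_rest(host, sc, registry=REGISTRY):
    """Route a REST request by host name, with the same fallback rule."""
    return pick_instance(service_from_host(host), sc, registry)
```

In this sketch only `trade` has an instance in the `sc-project-a` environment, so a request carrying that identifier reaches the isolated `trade` instance but shares the basic `user` instance. That is the point of weak isolation: only the applications that changed need their own machines.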
2.3 Service Chain Entry Points
Two entry implementations are described:
Iron entry (PHP legacy): Host binding, unified gateway, and web proxy add SC identifiers to HTTP headers. Iron services then forward SC identifiers through REST and, where needed, RPC calls.
Node entry: Similar to REST routing, external domains are registered in etcd, and sc‑nginx routes requests to the appropriate Node service.
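A gateway-style entry point that attaches the identifier might look like the sketch below. The header name, the `ENTRY_DOMAINS` table, and the example domain are hypothetical; the article does not describe the actual data layout in etcd or how the gateway is implemented.

```python
SC_HEADER = "X-Service-Chain"  # hypothetical header name

# Hypothetical table of external domains registered for SC environments.
ENTRY_DOMAINS = {
    "shop-project-a.qa.example.com": "sc-project-a",
}


def tag_at_entry(host, headers, entry_domains=ENTRY_DOMAINS):
    """Return a copy of headers, tagged if the entry domain is registered."""
    tagged = dict(headers)
    sc = entry_domains.get(host.split(":")[0].lower())
    if sc is not None:
        tagged[SC_HEADER] = sc
    # Unregistered domains are left untouched and reach the basic environment.
    return tagged
```

Tagging once at the entry point means downstream services only ever need to forward the header they received.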
2.4 Service Chain Exit
External third‑party calls can carry SC identifiers synchronously or asynchronously (via callback URLs), enabling SC propagation beyond internal services.
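For the asynchronous case, one way to carry the identifier is to embed it in the callback URL handed to the third party and read it back when the callback arrives. The sketch below assumes a query parameter named `sc` and an invented callback URL; neither comes from the article.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode, urlparse, urlunparse

SC_PARAM = "sc"  # hypothetical query parameter name


def callback_with_sc(callback_url, sc):
    """Return callback_url with the SC identifier set as a query parameter."""
    parts = urlparse(callback_url)
    # Drop any existing SC parameter so the identifier is never duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != SC_PARAM
    ]
    if sc is not None:
        query.append((SC_PARAM, sc))
    return urlunparse(parts._replace(query=urlencode(query)))


def sc_from_callback(url):
    """Recover the SC identifier from an incoming callback URL, if present."""
    values = parse_qs(urlparse(url).query).get(SC_PARAM)
    return values[0] if values else None
```

The recovered identifier can then be placed back into the request context so that the callback is routed to the same SC environment that issued the original call.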
3. Driving Adoption of the Environment
After finalizing the technical solution, Youzan promoted the adoption through several measures:
Upgrading application frameworks and NSQ clients to support SC.
Building a configuration platform to centralize environment settings.
Consolidating and stabilizing the basic environment services.
Creating a web‑based SC environment publishing platform.
Running pilot projects before full rollout.
Establishing common‑issue resolution groups and documentation.
Implementing monitoring, mock data platforms, and alerting for the basic environment.
Defining usage guidelines, conducting extensive training, and setting fault‑severity classifications.
Maintaining detailed issue records for rapid troubleshooting.
4. Integration with Continuous Delivery
Youzan’s DevOps initiative links stable environments to CI/CD pipelines. A typical project flow moves from daily (development) → QA (testing) → pre (pre‑release) → production, with isolated environments preventing interference between development and testing.
Automated provisioning of an environment now takes about half an hour. Further improvements are planned around containerization and tighter CI/CD integration.
5. Reflections
The article concludes with lessons learned: managing environment complexity takes patience, cross‑team collaboration, and continuous improvement. The author also notes that Youzan is recruiting.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.