
Full-Link Pressure Testing Automation Practice for Bilibili's Live Streaming Gifting Business

Bilibili automated full‑link pressure testing for its high‑traffic live‑stream gifting service by adopting traffic co‑location with storage isolation, creating shadow tables, keys and topics, and building a three‑phase, three‑layer framework that analyses links, confirms configurations, and verifies end‑to‑end behavior across all services.

Bilibili Tech

This article details how Bilibili implemented full-link pressure testing for its live streaming gifting business, a write-heavy workload with traffic spikes during major events and strict real-time data requirements. Traditional pressure testing approaches could not accurately simulate production conditions, because write scenarios were shielded or handled through blacklists.

The article first compares three industry-standard approaches to full-link pressure testing: traffic co-location with storage isolation and online stress testing; data marking with logical isolation and online stress testing; and a mirror environment or offline testing. Bilibili chose the first approach because of its unified language stack, consistent infrastructure components, and mature service governance.

Bilibili's full-link pressure testing solution consists of three main components: traffic co-location (sharing resources with online clusters during low-traffic periods, using traffic marking to distinguish test traffic), online stress testing (through their pressure testing platform), and storage isolation (creating shadow tables for databases, shadow keys for Redis, and shadow topics for message queues).
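The storage isolation idea can be illustrated with a minimal sketch. It assumes that marked test traffic is routed to a shadow resource whose name is derived from the real one; the `shadow_` prefix, `RequestContext`, and `storage_name` are hypothetical names chosen for illustration and are not taken from Bilibili's SDK.

```python
from dataclasses import dataclass

# Hypothetical naming convention; the article does not state the real one.
SHADOW_PREFIX = "shadow_"


@dataclass(frozen=True)
class RequestContext:
    """Per-request state; `mirror` is True when the traffic mark is present."""
    mirror: bool = False


def storage_name(ctx: RequestContext, name: str) -> str:
    """Return the shadow name for marked traffic, the real name otherwise.

    The same rule is applied to database tables, Redis keys, and
    message-queue topics in this sketch.
    """
    return f"{SHADOW_PREFIX}{name}" if ctx.mirror else name


online = RequestContext()
stress = RequestContext(mirror=True)

table_for_online = storage_name(online, "gift_order")   # real table
table_for_stress = storage_name(stress, "gift_order")   # shadow table
topic_for_stress = storage_name(stress, "gift_sent")    # shadow topic
```

Deriving the name in a single place keeps business code unaware of whether a request is test traffic; only the storage client needs to consult the mark.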

The core challenge was testing the many modifications made across revenue core services, underlying middleware, the pressure testing SDK, the console, and the stress platforms. The authors designed an automated testing solution in three phases: first, ensuring basic capabilities by testing newly introduced nodes such as the mirror SDK and the pressure testing console; second, implementing full-link automation for business onboarding and full-process verification; third, building platformization and visualization to support future scaling.

The automated testing solution includes three main parts: link analysis (using trace tracking and static code scanning tools like biliconfigcheck lint to ensure context propagation), configuration confirmation (configuring pass-through, mirroring, write-discard, and mock rules for interfaces, databases, caches, and message queues), and automated verification (validating interface responses, storage operations, async business flows, and link completeness).
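The configuration confirmation step can be pictured as a rule table that is consulted for every downstream dependency. The sketch below is an illustration under assumptions: the four action names come from the article, but their exact semantics, the resource names, and the choice to fail when a resource has no rule are hypothetical.

```python
from enum import Enum


class Action(Enum):
    PASS_THROUGH = "pass_through"    # assumed: call the real dependency unchanged
    MIRROR = "mirror"                # assumed: redirect to the shadow resource
    WRITE_DISCARD = "write_discard"  # assumed: drop the write, report success
    MOCK = "mock"                    # assumed: return a canned response


# Hypothetical rules keyed by (dependency kind, resource name).
RULES = {
    ("db", "gift_order"): Action.MIRROR,
    ("cache", "gift:balance"): Action.MIRROR,
    ("mq", "gift_sent"): Action.MIRROR,
    ("db", "audit_log"): Action.WRITE_DISCARD,
    ("http", "/pay/charge"): Action.MOCK,
}


def resolve(kind: str, name: str, is_test_traffic: bool) -> Action:
    """Pick the action for one dependency call.

    Online traffic always passes through. Test traffic must match a rule;
    an unconfigured resource raises, so a gap in the configuration is
    caught before a test write can reach production storage.
    """
    if not is_test_traffic:
        return Action.PASS_THROUGH
    try:
        return RULES[(kind, name)]
    except KeyError:
        raise LookupError(f"no pressure-test rule for {kind}:{name}") from None
```

Failing on a missing rule is a design choice of this sketch rather than something the article states; it turns "configuration confirmation" into a check that can run automatically for every resource found during link analysis.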

The automation framework was redesigned with three layers: case layer for single-interface and scenario test orchestration, invoker layer for request encapsulation and assertion management, and coverage layer for test coverage statistics. Key modifications included adding a "mirror" identifier controlled by a global variable, implementing trace_toolset for link completeness checking, and adding pressure testing markers to HTTP/gRPC request headers.
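Two of these modifications, the globally controlled "mirror" identifier and the link completeness check, can be sketched as follows. This is an assumption-laden illustration: the header name `x-mirror`, the span dictionary layout, and the function names are hypothetical and do not reproduce the real trace_toolset API.

```python
MIRROR = False              # global switch for the whole test run
MIRROR_HEADER = "x-mirror"  # hypothetical header name


def build_headers(base=None, mirror=None):
    """Return request headers, adding the pressure-test mark when enabled.

    `mirror` overrides the global switch for a single call; when it is
    None the global MIRROR value is used.
    """
    if mirror is None:
        mirror = MIRROR
    headers = dict(base or {})
    if mirror:
        headers[MIRROR_HEADER] = "1"
    return headers


def unmarked_services(spans):
    """Return the services whose spans lost the mark, in trace order.

    Each span is assumed to be a dict such as
    {"service": "gift", "tags": {"x-mirror": "1"}}.
    An empty result means the mark was propagated along the whole link.
    """
    return [
        span["service"]
        for span in spans
        if span.get("tags", {}).get(MIRROR_HEADER) != "1"
    ]


trace = [
    {"service": "gateway", "tags": {MIRROR_HEADER: "1"}},
    {"service": "gift", "tags": {MIRROR_HEADER: "1"}},
    {"service": "wallet", "tags": {}},  # the mark was dropped here
]
broken = unmarked_services(trace)
```

With one switch controlling the header, the same case layer can run a scenario once as ordinary traffic and once as marked traffic, and a non-empty `broken` list points at the service where context propagation needs fixing.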

Live Streaming, System Stability, Automated Testing, performance testing, traffic isolation, Bilibili, full-link pressure testing, shadow storage