Concurrency Issues and Race Condition Mitigation in Bilibili's Content Production System

Bilibili addressed race conditions in its high-volume video content pipeline by replacing simple timestamp checks with optimistic locking (CAS) and rate-limiting locks. Together with version verification and observation tools, these changes have reduced missed reviews to zero and improved security, scalability, and the reliability of real-time editing.

Bilibili Tech

Background: Bilibili's video business is the heart of its content ecosystem, handling massive video content and user interaction. The platform aims to balance user experience, efficient content distribution, stability, and scalability.

Problem: Rapid growth has brought black‑market activity, spam, and other abusive behaviors that threaten community health. In the content‑production pipeline, multiple concurrent steps create race‑condition risks, leading to missed reviews and security challenges.

Software concurrency basics: A race condition occurs when the system's outcome depends on the order of uncontrolled events. It can arise in logic circuits, multi‑threaded programs, distributed systems, etc. The three necessary conditions are (1) concurrency, (2) a shared object, and (3) mutable state of that object. Typical mitigation techniques include mutexes, atomic operations, and serialization.
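
As an illustration of the mutex technique mentioned above, here is a minimal sketch in Python. The `Counter` class and `run` helper are hypothetical names chosen for this example and are not taken from Bilibili's system. The counter is a shared, mutable object updated by concurrent threads, so all three conditions for a race are present; the lock turns the read-modify-write into a critical section.

```python
import threading


class Counter:
    """A shared, mutable object touched by several threads at once."""

    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Read-modify-write is the critical section. Holding the lock
        # prevents another thread from interleaving between the read
        # and the write, which is how updates would otherwise be lost.
        with self._lock:
            current = self.value
            self.value = current + 1


def run(workers: int = 8, per_worker: int = 10_000) -> int:
    """Increment one counter from several threads and return the total."""
    counter = Counter()

    def work() -> None:
        for _ in range(per_worker):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run())
```

With the lock in place, the final value always equals `workers * per_worker`. Removing the `with self._lock:` line reintroduces the possibility of lost updates, although whether they actually show up in a given run depends on thread scheduling.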

Safety in the content‑production workflow: The critical race occurs between user edits and review submission. An early mitigation compared file modification times (mtime) before submission, which is no longer sufficient for current security requirements.
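
The source does not say exactly why the mtime comparison fell short. One well-known weakness of timestamp checks in general is coarse granularity: two changes that land in the same clock tick look identical. The sketch below shows that failure mode under the assumption of whole-second timestamps; `Draft`, `edit`, and `unchanged_since` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Draft:
    content: str
    mtime: int  # whole seconds; this granularity is an assumption


def edit(draft: Draft, new_content: str, now: float) -> None:
    """Apply a user edit and stamp it with the truncated clock value."""
    draft.content = new_content
    draft.mtime = int(now)


def unchanged_since(draft: Draft, reviewed_mtime: int) -> bool:
    """The timestamp check: treat equal mtimes as 'nothing changed'."""
    return draft.mtime == reviewed_mtime


draft = Draft(content="reviewed text", mtime=100)
reviewed_mtime = draft.mtime               # reviewer takes a snapshot
edit(draft, "unreviewed text", now=100.7)  # edit lands in the same second
check_passed = unchanged_since(draft, reviewed_mtime)
# check_passed is True even though the content is no longer what was reviewed.
```

A monotonically increasing version number avoids this particular problem because every edit changes it, regardless of how close together the edits are in time.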

Current solution: Introduce optimistic locking (CAS) and rate-limiting locks. Each edit updates a version field on the record, and a later write must present the version it originally read. If the submitted version is stale, meaning the record changed in the meantime, the update fails and triggers a re-review flow. This approach, combined with observation tools, has reduced weekly missed-review incidents to zero.
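
The version check described above can be sketched as a compare-and-set on a version counter. This is a minimal in-memory illustration, not Bilibili's implementation: `DraftStore`, `snapshot`, `edit`, and `approve` are hypothetical names, and the `threading.Lock` stands in for whatever atomic conditional update the real datastore provides.

```python
import threading


class DraftStore:
    """One draft guarded by a version number (optimistic locking)."""

    def __init__(self, content: str) -> None:
        self._lock = threading.Lock()  # stand-in for an atomic conditional write
        self.content = content
        self.version = 1
        self.status = "pending_review"

    def snapshot(self):
        """Return the content and the version a reviewer is looking at."""
        with self._lock:
            return self.content, self.version

    def edit(self, new_content: str) -> int:
        """A user edit always bumps the version and resets the status."""
        with self._lock:
            self.content = new_content
            self.version += 1
            self.status = "pending_review"
            return self.version

    def approve(self, reviewed_version: int) -> bool:
        """Compare-and-set: approve only if nothing changed since review."""
        with self._lock:
            if self.version != reviewed_version:
                # Stale version: reject and leave the draft queued for re-review.
                self.status = "pending_review"
                return False
            self.status = "approved"
            return True


store = DraftStore("first draft")
_, seen = store.snapshot()            # reviewer reads version 1
store.edit("edited during review")    # concurrent edit bumps to version 2
stale_result = store.approve(seen)    # rejected, version 1 is stale
_, seen = store.snapshot()            # re-review reads version 2
fresh_result = store.approve(seen)    # accepted
```

The design choice here is that no writer is ever blocked. A conflict is detected at write time, and the losing operation is rejected and retried through re-review rather than overwriting newer content.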

Future outlook:
• Real-time vs. near-real-time pipelines: use optimistic version control for real-time paths and message-queue serialization for near-real-time paths.
• Enhance version tracking and merge capabilities to allow safe rollback and conflict resolution.
• Implement fine-grained, field-level permission controls to limit which services can modify specific data fields, reducing unnecessary data conflicts.
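
To illustrate the message-queue serialization idea for near-real-time pipelines, here is a small sketch under stated assumptions. Python's in-process `queue.Queue` stands in for a real message queue, and a single worker thread stands in for a consumer that owns all writes for one piece of content. The `run_pipeline` function and the `"edit"` / `"submit"` operation names are hypothetical.

```python
import queue
import threading


def run_pipeline(operations):
    """Apply operations one at a time, in arrival order, on a single worker."""
    ops = queue.Queue()
    state = {"content": "", "reviewed": []}

    def worker() -> None:
        while True:
            op = ops.get()
            if op is None:  # sentinel: no more work
                break
            kind, payload = op
            if kind == "edit":
                state["content"] = payload
            elif kind == "submit":
                # The review sees exactly the content written by the
                # edits queued before it, never a half-applied state.
                state["reviewed"].append(state["content"])

    consumer = threading.Thread(target=worker)
    consumer.start()
    for op in operations:
        ops.put(op)
    ops.put(None)
    consumer.join()
    return state


result = run_pipeline([
    ("edit", "draft A"),
    ("submit", None),
    ("edit", "draft B"),
    ("submit", None),
])
```

Because only the worker mutates `state`, edits and submissions cannot interleave, at the cost of added latency from queueing. That trade-off is why this style suits near-real-time paths rather than real-time ones.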

Overall, by applying CAS, rate‑limiting locks, and systematic observation, the platform has effectively mitigated concurrency‑related security risks in its content production system.
