
Design and Implementation of Bilibili's Self‑Developed Video Editing Engine

Bilibili replaced a restrictive third‑party video editor with a self‑developed engine. The team redesigned the architecture for extensibility, manageability, and controllable rollout; refactored hundreds of API calls; enabled draft migration; and built in observability. The result was a lower crash rate, faster timeline initialization, and stable conversion gains, with AI‑assisted feature expansion continuing on top of the new engine.

Bilibili Tech

Bilibili's creation platform (创作端) is responsible for producing video manuscripts, and the editing tools constitute the first step in the production workflow. The platform provides basic trimming, intelligent editing, AI features, and template‑based tools to serve a wide range of creators.

Historically, the editing engine was purchased from a third‑party vendor. Over time, several limitations became evident:

Limited customisation and a fixed timeline model, restricting business expansion.

Insufficient technical support and slow response, causing difficulty in reproducing and resolving issues.

Annual licensing costs and additional fees for optional features such as HDR.

To overcome these problems, Bilibili decided to develop its own editing engine, aiming for greater flexibility, higher development efficiency, and lower long‑term marginal cost.

The migration presented several challenges:

Heavy business intrusion with hundreds of direct API calls scattered across code.

Lack of systematic encapsulation, leading to a proliferation of utility and manager classes.

Absence of unified engine lifecycle management, causing conflicts when multiple business modules share a singleton third‑party engine.

Need for seamless opening of drafts created with the old engine in the new engine.

Scale of the change: 12 business categories, 150 atomic capabilities, >3000 low‑level APIs.

To address these issues, a phased rollout was planned:

Phase 1 – Refactor the “粉创” business engine calls.

Phase 2 – Integrate the self‑developed engine into the main editor and support draft migration.

Phase 3 – Extend to cover‑image editing, shooting, AI storytelling, and game‑style reports.

Phase 4 – Include dynamic content, membership purchases, mini‑games, and one‑click submission.

The architectural redesign focused on three principles:

Extensibility: Introduce an interface layer that mirrors the original engine API, allowing business modules to depend on interfaces rather than concrete engines.

Manageability: Consolidate fragmented utilities/managers into modular tracks (e.g., CaptionTrack) for unified access.

Controllability: Deploy gray‑scale releases, monitor key technical and business metrics, and only expand rollout when thresholds are met.
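The extensibility principle above can be sketched as an interface layer that business modules depend on, with concrete engines plugged in behind it. This is an illustrative sketch, not Bilibili's actual code: the names `VideoEditEngine`, `ThirdPartyEngine`, `SelfDevelopedEngine`, and `make_engine` are all hypothetical.

```python
from abc import ABC, abstractmethod

class VideoEditEngine(ABC):
    """Interface layer mirroring the original engine API.
    Business code depends only on this abstraction."""

    @abstractmethod
    def add_clip(self, path: str) -> None: ...

    @abstractmethod
    def export(self) -> str: ...

class ThirdPartyEngine(VideoEditEngine):
    def __init__(self):
        self.clips = []
    def add_clip(self, path: str) -> None:
        self.clips.append(path)
    def export(self) -> str:
        return f"third-party export of {len(self.clips)} clip(s)"

class SelfDevelopedEngine(VideoEditEngine):
    def __init__(self):
        self.clips = []
    def add_clip(self, path: str) -> None:
        self.clips.append(path)
    def export(self) -> str:
        return f"self-developed export of {len(self.clips)} clip(s)"

def make_engine(use_self_developed: bool) -> VideoEditEngine:
    """A gray-scale flag decides which concrete engine backs the interface."""
    return SelfDevelopedEngine() if use_self_developed else ThirdPartyEngine()
```

Because business modules call only `VideoEditEngine` methods, flipping the gray‑scale flag swaps engines without touching business code, which is what makes a controlled, reversible rollout possible.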

Implementation details for the main editor illustrate the transition:

Before the refactor, business code directly invoked the third‑party SDK, resulting in scattered responsibilities and multiple thin Engine wrappers.

After the refactor, business logic interacts with an abstract UpperVideoEditEngine that creates the appropriate engine instance. Specific API calls are consolidated into dedicated xxxTrack classes, and the redundant Engine layer is removed in favour of UpperStreamingVideo management.
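The consolidation described above might look like the following sketch: scattered caption‑related SDK calls move into one `CaptionTrack`‑style class owned by the engine facade. The class and method names below are hypothetical stand‑ins for the actual `xxxTrack` classes.

```python
class CaptionTrack:
    """Consolidates caption-related engine calls behind one modular track,
    replacing scattered utility and manager classes."""
    def __init__(self):
        self._captions = []

    def add_caption(self, text: str, start_ms: int, end_ms: int) -> None:
        self._captions.append({"text": text, "start": start_ms, "end": end_ms})

    def count(self) -> int:
        return len(self._captions)

class UpperVideoEditEngine:
    """Facade sketch: business logic talks to tracks,
    never to raw SDK calls."""
    def __init__(self):
        self.caption_track = CaptionTrack()
```

In this shape, each capability lives in exactly one track, so the hundreds of direct SDK call sites collapse into a small, discoverable surface.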

Engine switching is performed through a five‑step process:

Refresh gray‑scale status and update each scene’s engine type.

Instantiate the corresponding editing manager for the scene.

Retrieve the scene object.

Perform engine check, switch, destroy, and create as needed.

Obtain the final engine manager instance (UpperStreamingVideo).
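The five steps above can be sketched as a single switch routine. The scene names, the `GRAY_SCALE` config, and the `EngineManager`/`Scene` classes are illustrative assumptions, not the real implementation.

```python
# Step 1: refreshed gray-scale status mapping each scene to an engine type.
GRAY_SCALE = {"main_editor": "self_developed"}

class EngineManager:
    def __init__(self, engine_type: str):
        self.engine_type = engine_type
        self.destroyed = False
    def destroy(self) -> None:
        self.destroyed = True

class Scene:
    def __init__(self, name: str, manager: "EngineManager | None" = None):
        self.name = name
        self.manager = manager

def switch_engine(scene: Scene) -> EngineManager:
    target = GRAY_SCALE.get(scene.name, "third_party")  # 1. update engine type
    current = scene.manager                             # 2-3. scene's current manager
    if current is not None and current.engine_type != target:
        current.destroy()                               # 4. destroy mismatched engine
        current = None
    if current is None:
        current = EngineManager(target)                 # 4. create as needed
    scene.manager = current
    return scene.manager                                # 5. final manager instance
```

The check‑destroy‑create step is what prevents two engines from coexisting in one scene, which was the conflict the old singleton third‑party engine could not avoid.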

Draft migration from the third‑party engine involves:

Extract all material types and IDs from the draft.

Request updated material information (including download URLs) from the material middle‑platform based on material type, ID, and target engine.

Download the new assets.

Replace the draft’s asset URLs with the newly downloaded ones.

Adapt engine‑specific parameters (e.g., subtitle coordinate systems, rotation angles, template structures).
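The migration steps above can be sketched as one pipeline. Here `fetch_info` and `download` are injected stubs standing in for the material middle‑platform request and the asset downloader, and the coordinate flip is only an illustration of the kind of engine‑specific adaptation involved.

```python
def migrate_draft(draft: dict, target_engine: str, fetch_info, download) -> dict:
    """Sketch of draft migration from the old engine to the new one."""
    migrated = dict(draft)
    new_paths = {}
    for asset in draft["assets"]:                                    # 1. extract type + ID
        info = fetch_info(asset["type"], asset["id"], target_engine) # 2. query middle-platform
        new_paths[asset["id"]] = download(info["url"])               # 3. download new asset
    migrated["assets"] = [
        {**a, "url": new_paths[a["id"]]} for a in draft["assets"]    # 4. replace asset URLs
    ]
    # 5. adapt engine-specific parameters; illustrative example:
    #    flip a normalized subtitle y-coordinate for a different origin convention.
    migrated["subtitles"] = [dict(s) for s in draft.get("subtitles", [])]
    for sub in migrated["subtitles"]:
        sub["y"] = 1.0 - sub["y"]
    return migrated
```

Keeping the middle‑platform lookup keyed on (material type, ID, target engine) is what lets one draft format resolve to engine‑appropriate assets on either side of the migration.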

Observability was built into the design. Quantitative metrics such as jitter rate, export success rate, export speed, and APM indicators were collected, alongside business conversion rates for the main editor and publishing page. Visual dashboards were created to monitor these metrics.
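A minimal sketch of that metric collection, assuming a simple in‑process recorder: the metric name mirrors the export success rate listed above, but the `MetricRecorder` class and its aggregation are illustrative.

```python
from collections import defaultdict

class MetricRecorder:
    """Counts outcomes per metric and derives rates for dashboards."""
    def __init__(self):
        self._ok = defaultdict(int)
        self._total = defaultdict(int)

    def record(self, metric: str, success: bool) -> None:
        self._total[metric] += 1
        if success:
            self._ok[metric] += 1

    def rate(self, metric: str) -> float:
        total = self._total[metric]
        return self._ok[metric] / total if total else 0.0

recorder = MetricRecorder()
for ok in (True, True, True, False):
    recorder.record("export_success", ok)
```

Rates like this, tracked per engine during gray‑scale, are what allow the rollout to be gated on thresholds rather than expanded blindly.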

Initial results showed the self‑developed engine outperforming the third‑party solution in crash rate, timeline initialization time, and first‑frame preview latency. Conversion rates initially rose, but when rollout reached 20%, a slight dip (≈2 pp) appeared, traced to lower conversion in the "smart‑clip" mode and to certain large transition assets. Targeted optimisation of those assets restored the advantage, yielding a stable 0.1 pp gain and allowing rollout to expand to 50%.

The project, spanning nearly a year and involving multiple teams (creation, multimedia, testing), required extensive API alignment with the opaque third‑party engine, iterative testing of atomic capabilities, and compromises such as converting asynchronous calls to synchronous ones. The collaborative effort ensured a stable launch and continuous performance improvements.

Going forward, Bilibili will keep enriching its material library, advancing AI‑assisted editing, and lowering creation barriers to empower more creators.
