
Design and Architecture of a Self‑Developed Video Transcoding Core

The team built a custom video‑transcoding core atop the FFmpeg libraries, replacing the command‑line tool with modular controllers, pipelines, and parallel tasks. The core dynamically adapts resolution, frame rate, and SEI handling for both low‑latency live streams and high‑throughput VOD, improving scalability and maintainability.

Bilibili Tech

Video transcoding converts an uploaded video file into multiple resolution variants by demuxing, decoding, filtering, encoding and remuxing. Bilibili processes massive daily uploads, producing lower‑bitrate streams that improve playback smoothness, reduce bandwidth consumption and standardize codec specifications.
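The five stages above can be sketched as a chain of generators. This is purely illustrative — the function and field names are hypothetical, not FFmpeg's or the Bilibili core's actual API — but it shows how each stage consumes the previous stage's output, and how one decoded stream can fan out into multiple resolution variants:

```python
# Illustrative sketch of the transcoding stages (hypothetical names, not a real API).

def demux(container):          # container -> packets
    for packet in container:
        yield packet

def decode(packets):           # packets -> raw frames
    for packet in packets:
        yield {"pts": packet["pts"], "pixels": packet["payload"]}

def scale(frames, width):      # filter stage: tag each frame with a target width
    for frame in frames:
        yield {**frame, "width": width}

def encode(frames, codec):     # raw frames -> compressed packets
    for frame in frames:
        yield {"pts": frame["pts"], "codec": codec, "width": frame["width"]}

def mux(packets):              # packets -> one output variant
    return list(packets)

container = [{"pts": i, "payload": b"raw"} for i in range(3)]
out = mux(encode(scale(decode(demux(container)), width=1280), codec="h264"))
```

In the real core each stage is a separate component rather than a generator, which is what allows the scheduling described below.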

The most widely used server‑side transcoding framework is FFmpeg, which provides low‑level libraries for demux/mux, codec, filters and a command‑line tool. However, the native ffmpeg command line shows several limitations for large‑scale, real‑time scenarios:

Before FFmpeg 7.0, pipelines ran serially and could not fully exploit multi‑core CPUs, causing bottlenecks in multi‑resolution VOD and live transcoding.

Live streaming requires dynamic parameter updates, which the static command line cannot handle.

The control logic is scattered in a few .c files, making maintenance and module separation difficult.

Upgrading FFmpeg versions often leads to painful code migrations.

To overcome these issues, a self‑developed transcoding core was built on top of the FFmpeg libraries, replacing the command‑line tool.

Core Architecture

The core abstracts the FFmpeg primitives into modular components. A Controller module orchestrates frame scheduling for both VOD and live streams. For VOD, the controller simply maps input streams to output streams. For live streaming, it handles timed frame pulling, message interaction, and dynamic changes of inputs, outputs or filters without restarting containers.

Each transcoding pipeline (Pipeline) corresponds to one output variant. Inside a pipeline, a Flow processes a single audio or video stream, and each processing step (filter, encoder, sampler, muxer) is represented as a Task. Tasks inherit from a common PipelineWorker base class and can run either serially or in parallel; parallel workers spawn dedicated threads for frame handling.
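A minimal sketch of the worker abstraction might look like the following. All class names here are hypothetical stand-ins for the core's C/C++ types; the point is only the shape: one `process()` interface, with a serial subclass that runs inline and a parallel subclass that feeds a dedicated thread through an internal queue.

```python
import queue
import threading

class PipelineWorker:
    """Base class for a pipeline Task (hypothetical sketch, not the real API)."""
    def process(self, frame):
        raise NotImplementedError

class SerialWorker(PipelineWorker):
    """Serial mode: frames are processed inline on the caller's thread."""
    def run(self, frames):
        return [self.process(f) for f in frames]

class ParallelWorker(PipelineWorker):
    """Parallel mode: a dedicated thread drains an internal frame queue."""
    def __init__(self):
        self.inbox = queue.Queue()
        self.results = []
        self.thread = threading.Thread(target=self._loop, daemon=True)
        self.thread.start()

    def _loop(self):
        while True:
            frame = self.inbox.get()
            if frame is None:          # sentinel: end of stream
                break
            self.results.append(self.process(frame))

    def run(self, frames):
        for f in frames:
            self.inbox.put(f)
        self.inbox.put(None)
        self.thread.join()
        return self.results

class Doubler(SerialWorker):
    def process(self, frame):
        return frame * 2

class ParallelDoubler(ParallelWorker):
    def process(self, frame):
        return frame * 2
```

Either subclass yields the same result for the same input; the difference is purely where the work runs, which is what lets the core pick a mode per task.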

Live vs. VOD Transcoding

Live transcoding demands low latency; serial pipelines cause blocking when multiple pipelines compete for CPU resources. In parallel mode, tasks instead drop frames when their internal queues exceed a threshold, preserving stream stability under load.

VOD transcoding focuses on throughput. Depending on a task’s internal parallelism, serial or parallel execution is chosen to maximize CPU utilization while avoiding unnecessary thread‑switch overhead.

Dynamic Adaptive Transcoding

Live streams using FLV/RTMP may change resolution or frame rate on the fly. The core supports dynamic adaptation:

Resolution adaptation: When a resolution change is detected, the scale filter parameters are recomputed using FFmpeg expression syntax, preserving the aspect ratio via zoom‑style scaling.

Frame‑rate adaptation: Instead of constant‑frame‑rate (CFR) sampling, the core employs variable‑frame‑rate (VFR) sampling and a VFR‑HALF mode that halves the output frame rate when the source exceeds the target, ensuring uniform sampling and reducing jitter.
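The two adaptation rules reduce to a little arithmetic. The helpers below are a sketch under stated assumptions (hypothetical names; the aspect-ratio fit mirrors what an FFmpeg `scale` expression computes, and the even-dimension rounding reflects the common encoder requirement, not a documented detail of the core):

```python
def fit_resolution(src_w, src_h, max_w, max_h):
    # Largest scale that fits inside max_w x max_h without distorting the
    # aspect ratio; never upscale past the source (ratio capped at 1.0).
    ratio = min(max_w / src_w, max_h / src_h, 1.0)
    # Most encoders require even dimensions, so round down to even.
    return (int(round(src_w * ratio)) // 2 * 2,
            int(round(src_h * ratio)) // 2 * 2)

def output_fps(src_fps, target_fps):
    # VFR-HALF: when the source rate exceeds the target, keep every other
    # frame (halving the rate) rather than resampling onto a fixed CFR grid.
    return src_fps / 2 if src_fps > target_fps else src_fps
```

For example, a mid-stream switch from 1920×1080 to a 720×1280 portrait source recomputes to 404×720 for a 1280×720 output slot, and a 60 fps source targeted at 30 fps is halved by simply keeping alternate frames.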

SEI (Supplemental Enhancement Information) Management

SEI carries auxiliary data such as subtitles, game scores or HDR color information. Older ffmpeg versions discarded SEI during decoding; newer versions store it in the frame structure but rely on the encoder to write it out. The self‑developed core instead inserts a bitstream filter (BSF) after the encoder to uniformly write SEI for AVC, HEVC and AV1 streams, and can optionally drop or merge SEI from discarded frames based on configuration.
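To make the bitstream-filter step concrete, here is a simplified sketch of hand-assembling an H.264/AVC SEI NAL unit — roughly the kind of unit such a filter appends after the encoder. This is not the core's actual code: real user-data SEI carries a 16-byte UUID, and HEVC/AV1 use different NAL/OBU layouts; only the AVC payload-type/size coding and emulation prevention are shown.

```python
def build_avc_sei_nal(payload_type, payload):
    """Assemble one AVC SEI NAL unit (simplified, illustrative only)."""
    rbsp = bytearray()
    # SEI payload type and size are each coded as runs of 0xFF plus a final byte.
    t = payload_type
    while t >= 255:
        rbsp.append(255)
        t -= 255
    rbsp.append(t)
    n = len(payload)
    while n >= 255:
        rbsp.append(255)
        n -= 255
    rbsp.append(n)
    rbsp += payload
    rbsp.append(0x80)  # rbsp_trailing_bits: stop bit + alignment
    # Emulation prevention: any 0x0000 followed by 0x00..0x03 in the RBSP
    # gets a 0x03 byte inserted so it cannot mimic a start code.
    ebsp = bytearray()
    zeros = 0
    for b in rbsp:
        if zeros >= 2 and b <= 3:
            ebsp.append(3)
            zeros = 0
        ebsp.append(b)
        zeros = zeros + 1 if b == 0 else 0
    # nal_unit_type = 6 (SEI), nal_ref_idc = 0.
    return bytes([0x06]) + bytes(ebsp)

nal = build_avc_sei_nal(5, b"hello")  # type 5 = user_data_unregistered
```

Doing this after encoding is what makes the behavior uniform across codecs and encoder implementations: the encoder never needs to know the SEI exists.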

In live streaming, SEI is also used to trace the full lifecycle of a stream, enabling precise latency and quality analysis on the client side.

Summary and Outlook

The custom transcoding core was initially deployed for live director‑board workflows in 2020 and has since expanded to general live and VOD transcoding. Future work includes deeper AI integration (e.g., live subtitles, game scoreboard overlays) and finer‑grained pipeline parallelism to push resource utilization to its limits.
