Design and Implementation of a WASM Demuxer for WebCodecs Video Frame Extraction

The project extracts FFmpeg’s demuxing logic into a lightweight WebAssembly module that feeds container‑agnostic video packets to WebCodecs, enabling fast, low‑cost frame extraction across many formats and cutting cover‑generation latency by ~40% while reducing container‑related failures by ~72%.

Bilibili Tech

Dec 3, 2024

Background: Bilibili’s web upload page extracts video frames to drive cover, category, and tag recommendations. Historically this was done with WebAssembly + FFmpeg. WebCodecs was introduced last year to improve performance, but it has no demuxing capability, which limits support for formats such as FLV and AVI.

Problem: Existing JavaScript demuxers (mp4box.js, custom mkv‑demuxer) cover only MP4 and MKV. Adding support for each additional format incurs high development cost and low ROI, while high‑quality JS demux libraries are scarce.

Goal: Provide a low‑cost, generic demuxing solution for WebCodecs that supports as many video formats as possible.

Proposed Approach: Reuse FFmpeg’s extensive demuxing support via WebAssembly and combine it with the native decoding performance of WebCodecs. The short‑running demux step is handled by the WASM FFmpeg component, while the long‑running decode step is delegated to WebCodecs.

Core Idea: Extract the demux part from WebAssembly + FFmpeg into an independent WASM demuxer. The implementation steps are:

1. Add C functions to obtain the data required by WebCodecs decoders.

2. Write JS glue code (using Emscripten’s cwrap) for bidirectional communication between JS and C, passing demuxed data.

3. Adapt the frame‑extraction SDK to consume the raw data and feed it to WebCodecs.

Key Data Structures: Two trimmed FFmpeg structures are defined – WebAVStream (contains codec parameters, start time, duration, etc.) and WebAVPacket (contains key‑frame flag, timestamp, size, and data). Functions get_av_stream and get_av_packet locate the appropriate video stream and packet, convert them to the new structures, and return them to JavaScript.
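To make the shape of WebAVPacket concrete, here is a minimal sketch of how such a struct could be read out of the WASM heap on the JavaScript side. The field offsets, the use of seconds for the timestamp, and the helper name readWebAVPacket are assumptions made for illustration; the real layout is whatever the compiler produces for the project's C struct.

```typescript
// JS-side view of the trimmed packet described above.
interface WebAVPacket {
  keyframe: boolean; // key-frame flag
  timestamp: number; // presentation time, assumed here to be in seconds
  size: number;      // payload length in bytes
  data: Uint8Array;  // copy of the encoded payload
}

// Hypothetical layout (little-endian, wasm32), for illustration only:
//   offset 0  : int32   keyframe
//   offset 8  : float64 timestamp
//   offset 16 : int32   size
//   offset 20 : uint32  pointer to the payload inside the same heap
function readWebAVPacket(heap: ArrayBuffer, ptr: number): WebAVPacket {
  const view = new DataView(heap);
  const keyframe = view.getInt32(ptr, true) !== 0;
  const timestamp = view.getFloat64(ptr + 8, true);
  const size = view.getInt32(ptr + 16, true);
  const dataPtr = view.getUint32(ptr + 20, true);
  // Copy the bytes out: the heap region may be freed or reused after the call.
  const data = new Uint8Array(heap.slice(dataPtr, dataPtr + size));
  return { keyframe, timestamp, size, data };
}
```

Copying the payload rather than returning a view into the heap costs one allocation per packet, but it keeps the result valid if the C side later frees the packet or the WASM memory grows.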

Codec String Generation: VideoDecoder.configure in WebCodecs requires a valid codec string (the codec field of its config). The solution extracts codec configuration from AVStream/AVPacket, reuses FFmpeg’s internal logic (e.g., ff_isom_write_vpcc for VP9) to build the string, and verifies it against Chromium’s video_codec_string_parsers.
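As an illustration of what the generated strings look like, the sketch below builds two common forms in TypeScript. It is not the project's C implementation. It assumes the VP9 profile, level, and bit depth have already been extracted, and that H.264 extradata is an avcC record (as in MP4 and MKV). The function names are hypothetical.

```typescript
// Two-digit, zero-padded decimal field, as used in "vp09" codec strings.
const pad2 = (n: number): string => n.toString().padStart(2, "0");

// VP9: "vp09.<profile>.<level>.<bitDepth>". The trailing colour fields
// (chroma subsampling, primaries, ...) are optional and omitted here.
function vp9CodecString(profile: number, level: number, bitDepth: number): string {
  return `vp09.${pad2(profile)}.${pad2(level)}.${pad2(bitDepth)}`;
}

// H.264: "avc1.PPCCLL", the hex of bytes 1..3 of the avcC record
// (profile, constraint flags, level). Byte 0 is the record version.
function avcCodecString(avcC: Uint8Array): string {
  if (avcC.length < 4 || avcC[0] !== 1) {
    throw new Error("extradata is not an avcC record");
  }
  const hex = (b: number): string => b.toString(16).padStart(2, "0");
  return `avc1.${hex(avcC[1])}${hex(avcC[2])}${hex(avcC[3])}`;
}
```

For example, VP9 profile 0 at level 1.0 (stored as 10) with 8-bit samples gives "vp09.00.10.08", and an avcC record beginning 01 64 00 1f gives "avc1.64001f".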

JS‑C Communication: C functions are wrapped with Module.cwrap to be callable from JS. After execution, the returned pointer is read via Module.getValue, assembled into a JavaScript object, and sent back through postMessage. The reverse direction (C invoking JS) follows the same pattern.
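The JS-to-C direction might look roughly like the sketch below. cwrap and getValue are Emscripten runtime methods, but the argument list of get_av_stream, the three-int32 struct layout, and the names EmscriptenLike, StreamInfo, and readStreamInfo are assumptions made for illustration.

```typescript
// Minimal slice of the Emscripten Module used here. Both methods must be
// exported from the build for this to work against a real module.
interface EmscriptenLike {
  cwrap(
    name: string,
    returnType: string | null,
    argTypes: string[],
  ): (...args: unknown[]) => number;
  getValue(ptr: number, type: string): number;
}

interface StreamInfo {
  codecId: number;
  width: number;
  height: number;
}

// Assumption: get_av_stream takes a file path and returns a pointer to a
// struct whose first three fields are consecutive int32 values.
function readStreamInfo(mod: EmscriptenLike, path: string): StreamInfo {
  const getAvStream = mod.cwrap("get_av_stream", "number", ["string"]);
  const ptr = getAvStream(path);
  if (ptr === 0) throw new Error("get_av_stream returned NULL");
  // Read each field at its byte offset and assemble a plain JS object,
  // which can then be sent out of the worker with postMessage.
  return {
    codecId: mod.getValue(ptr, "i32"),
    width: mod.getValue(ptr + 4, "i32"),
    height: mod.getValue(ptr + 8, "i32"),
  };
}
```

Typing the module as a small interface keeps the reader function independent of the generated glue file, so it can be exercised against a stub in unit tests.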

Integration into the Frame‑Extraction SDK: The WASM demuxer runs inside a Web Worker. Its postMessage interface is promisified, and the output is adapted to WebCodecs’ VideoDecoderConfig and EncodedVideoChunk formats.
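A minimal sketch of this integration layer follows, assuming a simple id-tagged message protocol and packet timestamps in seconds. PortLike, promisifyPort, DemuxedStream, DemuxedPacket, and the message shape are hypothetical. The returned objects match the fields that VideoDecoder.configure and the EncodedVideoChunk constructor accept.

```typescript
// Structural stand-in for the Worker surface used below, so the sketch
// does not depend on DOM typings.
interface PortLike {
  postMessage(message: unknown): void;
  onmessage: ((event: { data: any }) => void) | null;
}

// Wrap request/response messaging in Promises, matching replies by id.
function promisifyPort(port: PortLike) {
  let nextId = 0;
  const pending = new Map<
    number,
    { resolve: (v: unknown) => void; reject: (e: Error) => void }
  >();
  port.onmessage = ({ data }) => {
    const entry = pending.get(data.id);
    if (!entry) return; // reply to an unknown or already settled request
    pending.delete(data.id);
    if (data.error) entry.reject(new Error(data.error));
    else entry.resolve(data.result);
  };
  return (type: string, payload?: unknown): Promise<unknown> =>
    new Promise((resolve, reject) => {
      const id = nextId++;
      pending.set(id, { resolve, reject });
      port.postMessage({ id, type, payload });
    });
}

interface DemuxedStream { codecString: string; width: number; height: number; extradata: Uint8Array; }
interface DemuxedPacket { keyframe: boolean; timestamp: number; data: Uint8Array; }

// Fields of VideoDecoderConfig, suitable for VideoDecoder.configure().
function toDecoderConfig(s: DemuxedStream) {
  return { codec: s.codecString, codedWidth: s.width, codedHeight: s.height, description: s.extradata };
}

// Init object for `new EncodedVideoChunk(init)`; WebCodecs timestamps are
// in microseconds, so seconds are scaled here.
function toChunkInit(pkt: DemuxedPacket) {
  return {
    type: pkt.keyframe ? ("key" as const) : ("delta" as const),
    timestamp: Math.round(pkt.timestamp * 1_000_000),
    data: pkt.data,
  };
}
```

In a browser, the function returned by promisifyPort(new Worker(...)) could be awaited for a packet, and the result passed as new EncodedVideoChunk(toChunkInit(packet)) to VideoDecoder.decode.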

Results: Deploying the WASM demuxer together with WebCodecs reduced the 90th‑percentile cover‑generation latency by ~40% and decreased the failure rate caused by unsupported containers by ~72%.

Additional Offering: The demuxer portion of the WebAssembly + FFmpeg build is published as an npm package named web-demuxer. Its minimal build, which supports MP4 and MKV, is 115 KB gzipped. It enables video frame extraction with just a few lines of code and also provides a ReadableStream interface for more complex scenarios such as playback.

Conclusion: By modularizing FFmpeg’s demuxing capabilities and coupling them with native WebCodecs decoding, developers can achieve high‑performance, format‑agnostic video processing on the web. Future work includes advocating for native container support in WebCodecs and further expanding the format coverage.
