ReolAudio: A Frontend‑Focused Audio Processing Library for Efficient Long‑Audio Editing
ReolAudio is a lightweight, JavaScript‑based library that replaces memory‑heavy AudioBuffer editing with streaming and random‑access decoding, frame‑based data structures, and a high‑performance AudioWorklet player, dramatically improving memory usage, start‑up time, and waveform rendering for long audio projects.
Background – The internal hackathon project “音浮” needed a visual, text‑based audio editor that could handle long recordings, but the original AudioBuffer‑based approach suffered from high memory consumption, slow start‑up, poor edit performance, and inaccurate seeking.
Technical challenges – Full‑decoding of long audio files leads to >1 GB memory usage for a one‑hour 44.1 kHz stereo track, duplicate buffers during edits, long initial waveform drawing, and difficulty storing large sessions in the cloud.
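The >1 GB figure can be checked with simple arithmetic. The sketch below assumes decoded PCM is held as 32-bit floats, which is how AudioBuffer exposes channel data; the function name is illustrative and not part of ReolAudio.

```typescript
// Estimate the memory held by fully decoded PCM.
// Assumption: each sample is a 32-bit float (4 bytes), as in the
// Float32Array channel data exposed by AudioBuffer.
function decodedPcmBytes(
  durationSeconds: number,
  sampleRate: number,
  channelCount: number,
  bytesPerSample: number = 4,
): number {
  return durationSeconds * sampleRate * channelCount * bytesPerSample;
}

// One hour of 44.1 kHz stereo audio.
const oneHourStereo = decodedPcmBytes(3600, 44100, 2);
console.log((oneHourStereo / 1e9).toFixed(2), "GB"); // 1.27 GB
```

Any edit that copies the buffer holds a second allocation of similar size until the old one is released, which is where the duplicate-buffer cost comes from.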
Core idea – Switch to streaming and random‑access decoding by operating on audio frames rather than decoded PCM, keeping the original file as the primary data source.
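One way to picture random access over frames is an index that maps sample positions to byte ranges in the original file, so that a seek decodes only the frames it needs. The following is a minimal sketch under assumptions: the FrameInfo shape and findFrame are hypothetical, with field names borrowed from the serialization example and their meanings guessed, and the frame sizes are made up.

```typescript
// Hypothetical frame descriptor: where a compressed frame lives in the
// original file and which decoded samples it covers.
interface FrameInfo {
  offset: number;      // byte offset of the frame in the source file
  size: number;        // byte length of the compressed frame
  sampleIndex: number; // index of the first decoded sample in this frame
  sampleSize: number;  // decoded samples per channel in this frame
}

// Binary search for the frame that contains `targetSample`.
// Assumes frames are sorted by sampleIndex and cover contiguous ranges.
function findFrame(frames: FrameInfo[], targetSample: number): number {
  let lo = 0;
  let hi = frames.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const f = frames[mid];
    if (targetSample < f.sampleIndex) hi = mid - 1;
    else if (targetSample >= f.sampleIndex + f.sampleSize) lo = mid + 1;
    else return mid;
  }
  return -1; // target lies outside the indexed audio
}

// Illustrative index: 1000 frames of 1152 samples each (the MPEG-1
// Layer III frame length) with a made-up fixed size of 418 bytes.
const frames: FrameInfo[] = Array.from({ length: 1000 }, (_, i) => ({
  offset: i * 418,
  size: 418,
  sampleIndex: i * 1152,
  sampleSize: 1152,
}));

// Seek to 10 s at 44.1 kHz: sample 441000 falls inside frame 382.
const target = Math.floor(10 * 44100);
const frameIdx = findFrame(frames, target);
console.log(frameIdx, frames[frameIdx].offset);
```

Only the index stays in memory; the compressed bytes for a frame are read from the original file when playback or rendering asks for them.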
ReolAudio capabilities
Parsing and framing of popular formats (mp3, mp4/m4a, aac, wav, flac, ogg, flv, mid).
Full, streaming, and seek‑based decoding.
High‑performance sample player built on AudioWorkletNode.
Frame serialization/deserialization with compact binary encoding.
Lightweight format detection (getType) that leverages existing frame parsers.
Waveform summarization using per‑frame amplitude sketches.
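As an illustration of the last item, a per-frame amplitude sketch can be as small as one min/max pair per frame, which is enough to draw an overview waveform without keeping PCM around. The source does not specify ReolAudio's actual sketch format, so the WaveSketch shape and both function names below are assumptions.

```typescript
// Hypothetical per-frame summary: the lowest and highest sample value.
interface WaveSketch {
  min: number;
  max: number;
}

// Summarize one frame of decoded mono samples.
function sketchFrame(samples: Float32Array): WaveSketch {
  if (samples.length === 0) return { min: 0, max: 0 };
  let min = samples[0];
  let max = samples[0];
  for (let i = 1; i < samples.length; i++) {
    const s = samples[i];
    if (s < min) min = s;
    if (s > max) max = s;
  }
  return { min, max };
}

// Summarize a run of PCM frame by frame; the PCM can be discarded afterwards.
function sketchPcm(pcm: Float32Array, frameSize: number): WaveSketch[] {
  const out: WaveSketch[] = [];
  for (let start = 0; start < pcm.length; start += frameSize) {
    // subarray creates a view, so no samples are copied here.
    out.push(sketchFrame(pcm.subarray(start, start + frameSize)));
  }
  return out;
}

const pcm = new Float32Array([0, 0.5, -0.25, 0.75, -1, 0.25]);
console.log(sketchPcm(pcm, 3));
// [ { min: -0.25, max: 0.5 }, { min: -1, max: 0.75 } ]
```

With 1152-sample frames, an hour of 44.1 kHz audio reduces to roughly 138,000 pairs per channel, which is small enough to store alongside the frame index.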
Architecture – A core layer provides format‑agnostic frame handling. An upper layer wraps it into editing frameworks: a frame‑based one (editing a sequence of frames) and a clip‑based one (editing a sequence of clips). The SamplePlayer class abstracts the AudioWorklet implementation and exposes a simple API for playback, seeking, and buffer management.
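Frame sequence editing can be sketched as operations on an array of frame references, so that a cut or a move rearranges small descriptors instead of copying PCM. The FrameRef shape and the helper names below are hypothetical, and the sketch edits at frame granularity only; a sample-accurate cut would additionally need trim offsets inside the boundary frames.

```typescript
// Hypothetical reference to one compressed frame in a source file.
interface FrameRef {
  uri: number;        // which source file the frame belongs to
  index: number;      // frame number within that file
  sampleSize: number; // decoded samples per channel in this frame
}

// Remove frames [from, to) and return a new sequence; the input is untouched.
function deleteFrames(seq: readonly FrameRef[], from: number, to: number): FrameRef[] {
  return [...seq.slice(0, from), ...seq.slice(to)];
}

// Insert frames at position `at` and return a new sequence.
function insertFrames(
  seq: readonly FrameRef[],
  at: number,
  added: readonly FrameRef[],
): FrameRef[] {
  return [...seq.slice(0, at), ...added, ...seq.slice(at)];
}

// Total duration of a sequence, in samples per channel.
function totalSamples(seq: readonly FrameRef[]): number {
  return seq.reduce((n, f) => n + f.sampleSize, 0);
}

// Move frames 2..4 to the end of a ten-frame sequence.
const original: FrameRef[] = Array.from({ length: 10 }, (_, i) => ({
  uri: 1,
  index: i,
  sampleSize: 1152,
}));
const cut = original.slice(2, 5);
const withoutCut = deleteFrames(original, 2, 5);
const moved = insertFrames(withoutCut, withoutCut.length, cut);
console.log(moved.map((f) => f.index).join(",")); // 0,1,5,6,7,8,9,2,3,4
```

Because each operation returns a new array of references, earlier sequences stay intact, which also makes an undo history inexpensive to keep.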
Native FFmpeg vs WebAssembly – Benchmarks on macOS 12 with Chrome 96 show that decoding a 4‑minute file takes 250‑290 ms with native FFmpeg versus 330‑380 ms with WebAssembly‑compiled FFmpeg, a ~25 % slowdown that is acceptable for a 1.4 MB wasm payload. By comparison, ReolAudio’s pure‑JS implementation (~80 KB minified) is far lighter.
Serialization example
```ts
// Numeric tag for each serialized frame field.
export enum FrameField {
  uri = 1,
  index = 2,
  offset = 3,
  size = 4,
  sampleSize = 5,
  sampleIndex = 6,
  wave = 7,
}

// Binary encoding used for each field.
const FrameType: Record<FrameField, string> = {
  [FrameField.uri]: 'u16',
  [FrameField.index]: 'u32',
  [FrameField.offset]: 'u32',
  [FrameField.size]: 'u16',
  [FrameField.sampleSize]: 'u16',
  [FrameField.sampleIndex]: 'f64',
  [FrameField.wave]: 'wave',
};
```
Format detection – The getType function checks magic numbers and, when needed, parses frames to guarantee accurate identification, avoiding the false positives that plague generic libraries like file‑type.
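The magic-number stage of such a check might look like the sketch below. It is not ReolAudio's getType: sniffType and its return values are hypothetical, and it covers only the first-pass signature test. The two-byte sync check for MP3 and AAC is exactly the kind of weak signal that can match by chance, which is why a frame-parsing pass is needed to confirm it.

```typescript
type AudioType =
  | "wav" | "flac" | "ogg" | "mid" | "flv" | "mp4" | "mp3" | "aac" | "unknown";

// True if `bytes` contains the ASCII string `text` at `offset`.
function hasAscii(bytes: Uint8Array, offset: number, text: string): boolean {
  if (offset + text.length > bytes.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

// First-pass detection from the leading bytes of a file.
function sniffType(bytes: Uint8Array): AudioType {
  if (hasAscii(bytes, 0, "RIFF") && hasAscii(bytes, 8, "WAVE")) return "wav";
  if (hasAscii(bytes, 0, "fLaC")) return "flac";
  if (hasAscii(bytes, 0, "OggS")) return "ogg";
  if (hasAscii(bytes, 0, "MThd")) return "mid";
  if (hasAscii(bytes, 0, "FLV")) return "flv";
  if (hasAscii(bytes, 4, "ftyp")) return "mp4";
  if (hasAscii(bytes, 0, "ID3")) return "mp3"; // ID3v2 tag, usually MP3
  // MPEG audio and ADTS both start with 0xFF followed by at least three set bits.
  if (bytes.length >= 2 && bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0) {
    const layer = (bytes[1] >> 1) & 0x03;
    // ADTS (AAC) has a 12-bit sync word and a layer field that is always 0.
    if (layer === 0) return (bytes[1] & 0xf0) === 0xf0 ? "aac" : "unknown";
    return "mp3"; // candidate only: confirm by parsing consecutive frames
  }
  return "unknown";
}

console.log(sniffType(new Uint8Array([0xff, 0xfb, 0x90, 0x00]))); // mp3
console.log(sniffType(new Uint8Array([0xff, 0xf1, 0x50, 0x80]))); // aac
```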
Player implementation
```ts
import { SamplePlayer } from 'xxx';

async function main() {
  const ctx = new AudioContext({ sampleRate: 44100 });

  // One-time setup on the AudioContext, awaited before any player is created.
  await SamplePlayer.init(ctx);

  const player = new SamplePlayer({
    context: ctx,
    channelCount: 2,
    bufferMaxDuration: 60,
  });

  // Route the player's output to the default audio destination.
  player.connect(ctx.destination);

  // UI bindings omitted for brevity
  player.play();
}

main();
```
Results on the “音浮” project – Memory dropped from >3 GB to <250 MB, start‑up latency fell from ~12 s to <100 ms, and initial waveform rendering went from seconds to under one second while preserving visual fidelity.
Future work – Add more formats (flac, ogg, flv, webm), publish full API documentation, implement a clip‑based editing framework, and explore integration with Web‑DAW applications.
ByteFE
Cutting‑edge tech, article sharing, and practical insights from the ByteDance frontend team.