Live Streaming Process Model: Capture, Sampling, Encoding, and Audio Channel Technologies

This article explains the live streaming workflow, detailing audio and video capture, digital sampling rates and bit depths, various sound channel configurations from mono to immersive formats, and common audio encoding methods such as PCM, AAC, MP3, and FLAC.

New Oriental Technology

Live Streaming Process Model

The live streaming workflow consists of five main stages; this article focuses on the first two: capture and encoding.

1. Capture

1.1 Audio Capture

Audio capture converts sound into digital data: a microphone turns sound waves into an analog electrical signal, and an analog‑to‑digital converter (ADC) samples that signal into numbers. Playback reverses the process with a digital‑to‑analog converter (DAC). Two key parameters are sampling rate (frequency) and sampling size (bit depth).
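To make the two parameters concrete, here is a minimal sketch of what an ADC does conceptually. It is only an illustration: the function name `sample_sine` and its parameters are hypothetical, and a pure sine wave stands in for the analog input.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, num_samples, bit_depth=16):
    """Sample a sine wave and quantize each sample to a signed integer."""
    max_amplitude = 2 ** (bit_depth - 1) - 1   # 32767 for 16-bit audio
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz                          # time of the n-th sample, in seconds
        value = math.sin(2 * math.pi * freq_hz * t)     # stand-in "analog" value in [-1.0, 1.0]
        samples.append(round(value * max_amplitude))    # quantize to the nearest integer level
    return samples

# A 1 kHz tone sampled at 8 kHz yields 8 samples per cycle.
pcm = sample_sine(freq_hz=1000, sample_rate_hz=8000, num_samples=8)
print(pcm)
```

The sampling rate decides how often `t` advances, and the bit depth decides how many integer levels `round` can land on.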

Sampling Rate

Sampling rate indicates how many samples are taken per second, measured in hertz (Hz). Higher rates improve fidelity but increase data size.

8,000 Hz   --   telephone sampling rate (sufficient for speech)
11,025 Hz  --   AM broadcast sampling rate
24,000 Hz  --   FM broadcast sampling rate
44,100 Hz  --   audio CD sampling rate
47,250 Hz  --   commercial PCM recorders
48,000 Hz  --   digital TV, DVD, professional audio

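Higher rates such as 96 kHz exist, but a sampling rate can only represent frequencies up to half its value. A 48 kHz rate therefore already covers audio up to 24 kHz, which is above the roughly 20 kHz upper limit of human hearing, so most listeners cannot hear the benefit of going higher.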
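The relationship between sampling rate and the highest frequency it can capture follows from the Nyquist-Shannon sampling theorem: the limit is half the sampling rate. A small sketch, using a hypothetical helper name, applied to some of the rates listed above:

```python
def nyquist_frequency_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2

for rate in (8000, 11025, 24000, 44100, 48000):
    limit = nyquist_frequency_hz(rate)
    print(f"{rate:>6} Hz sampling -> audio up to {limit:>8.1f} Hz")
```

This is why 8,000 Hz is enough for telephone speech (content up to 4 kHz) while music is normally sampled at 44,100 Hz or higher.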

Sampling Size (Bit Depth)

Bit depth determines how many discrete amplitude levels each sample can represent. Common depths are 8 bits (256 levels, low quality) and 16 bits (65,536 levels, CD quality). Higher bit depth yields finer detail at the cost of larger files.
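The level counts follow directly from 2 raised to the bit depth. The sketch below also prints the theoretical dynamic range, about 6 dB per bit. The function names are hypothetical, and the 24-bit row is added only for comparison.

```python
import math

def quantization_levels(bit_depth):
    """Number of distinct amplitude values a sample can take."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Ratio of the full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (8, 16, 24):
    levels = quantization_levels(bits)
    print(f"{bits:>2} bits: {levels:>10,} levels, ~{dynamic_range_db(bits):.1f} dB")
```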

1.2 Sound Channels

Channel configurations describe how many separate audio tracks are recorded and reproduced. They have evolved through the following formats:

Mono (1.0) – single speaker.

Stereo (2.0) – left and right speakers.

5.1 – five speakers plus a subwoofer.

7.1 – adds two rear speakers to 5.1.

Dolby Atmos / DTS:X – object‑based audio that can map sounds to any speaker layout, including overhead speakers.

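Object‑based mixing stores each sound as an object with position metadata instead of assigning it to a fixed channel. A renderer then maps the objects to whatever speakers are available, so a single mix can be played back on any speaker configuration, which simplifies production for immersive formats.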
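As a toy illustration of the object-based idea, the sketch below renders one mono object to a stereo layout from a position value, using constant-power panning. This is an assumption chosen for simplicity: it is not how Dolby Atmos or DTS:X renderers are implemented, and the function name and the position convention are hypothetical.

```python
import math

def render_object_to_stereo(samples, position):
    """Render a mono object to (left, right) sample pairs.

    position: -1.0 is fully left, 0.0 is center, 1.0 is fully right.
    """
    angle = (position + 1) * math.pi / 4   # map [-1, 1] onto [0, pi/2]
    left_gain = math.cos(angle)
    right_gain = math.sin(angle)
    return [(s * left_gain, s * right_gain) for s in samples]

mono_object = [0.0, 0.5, 1.0, 0.5]
print(render_object_to_stereo(mono_object, position=-1.0))  # only the left channel carries signal
print(render_object_to_stereo(mono_object, position=0.0))   # equal level in both channels
```

The same object and position could be handed to a different renderer for a 5.1 or 7.1 layout without changing the mix itself.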

2. Encoding

2.1 Audio Encoding

Raw audio is captured as PCM (Pulse‑Code Modulation) data, an uncompressed binary representation of the sampled waveform. PCM offers lossless quality but large file size, so it is usually compressed.
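The size of raw PCM can be worked out from the capture parameters: sampling rate × bit depth × number of channels gives the bitrate. A short sketch with hypothetical helper names, using CD-quality stereo as the example:

```python
def pcm_bitrate_bps(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed PCM."""
    return sample_rate_hz * bit_depth * channels

def pcm_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Bytes needed to store the given duration of uncompressed PCM."""
    return pcm_bitrate_bps(sample_rate_hz, bit_depth, channels) * seconds // 8

cd_bitrate = pcm_bitrate_bps(44100, 16, 2)
one_minute = pcm_size_bytes(44100, 16, 2, 60)
print(f"CD-quality PCM: {cd_bitrate:,} bit/s")          # 1,411,200 bit/s
print(f"One minute:     {one_minute:,} bytes")          # 10,584,000 bytes
```

At roughly 1.4 Mbit/s for stereo audio alone, it is clear why raw PCM is compressed before streaming.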

Compression Types

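Lossy compression discards data that listeners are unlikely to perceive (e.g., MP3, AAC, Ogg Vorbis), so the original samples cannot be recovered exactly. Lossless compression retains all original information (e.g., FLAC, ALAC, APE), so decoding reproduces the PCM data bit for bit.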
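The difference can be demonstrated with stand-ins from the Python standard library. Note the assumptions: `zlib` is a general-purpose lossless compressor, not an audio codec, and dropping the low 8 bits of each sample is a crude substitute for a real lossy codec, which would use a psychoacoustic model instead.

```python
import math
import struct
import zlib

# One second of a 440 Hz tone as 16-bit PCM (8 kHz sampling keeps it small).
samples = [round(32767 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
raw = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: decompressing gives back exactly the original bytes.
packed = zlib.compress(raw)
assert zlib.decompress(packed) == raw

# Lossy (crude): keep only the top 8 bits of each sample, then expand back.
reduced = [s >> 8 for s in samples]
restored = [r << 8 for r in reduced]
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print(f"raw: {len(raw)} bytes, lossless: {len(packed)} bytes")
print(f"largest sample error after lossy round trip: {max_error}")
```

The lossless path trades smaller savings for exact recovery, while the lossy path halves the data immediately but leaves a permanent error in the samples.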

Common Formats

Format  --   Characteristics
WAV     --   Uncompressed PCM, typically with a 44‑byte header; excellent quality, large size.
MP3     --   Lossy; good quality at bitrates above 128 kbps, widely supported.
AAC     --   Modern lossy codec; high quality at low bitrates, dominant in live streaming.
FLAC    --   Lossless; retains PCM quality with moderate compression.
APE     --   Lossless (Monkey's Audio); higher compression ratio than FLAC, less common.
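The 44‑byte WAV header mentioned above can be checked with Python's standard `wave` module, which writes a minimal header for PCM data. This is a sketch under that assumption: files produced by other tools may carry extra chunks and therefore a longer header. The helper name `wav_bytes` is hypothetical.

```python
import io
import wave

def wav_bytes(pcm_data, sample_rate_hz=44100, channels=2, sample_width_bytes=2):
    """Wrap raw PCM bytes in a WAV container and return the file contents."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width_bytes)
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(pcm_data)
    return buffer.getvalue()

silence = bytes(44100 * 2 * 2)          # one second of 16-bit stereo silence
data = wav_bytes(silence)
header_size = len(data) - len(silence)
print(f"header: {header_size} bytes, total: {len(data)} bytes")
```

Everything after the header is the PCM payload itself, which is why WAV files are essentially as large as the raw audio.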

Live streaming platforms typically use AAC for audio because it balances quality and bandwidth.
