Overview of Alibaba Cloud CDN Live Streaming Architecture and Key Concepts
This article explains the fundamentals of video streaming, media transcoding, and CDN technology, then details Alibaba Cloud's live streaming architecture, including terminology, workflow stages, system components, business functions, and typical application scenarios such as UGC, e‑commerce, sports, gaming, and online education.
Source: Detailed Cloud Computing
Generally, a video is a sequence of images displayed at roughly 24 or more frames per second, which, due to persistence of vision, appears as smooth motion to the human eye.
Media transcoding refers to converting audio, video, or other multimedia content from one encoding format to another. The content delivery network (CDN) provides services such as streaming servers, load balancing, routing, video transcoding, recording, anti-hotlinking, and performance optimization.
This article introduces the Alibaba Cloud CDN live streaming system in four parts: overview, architecture, business functions, and application scenarios.
1. Video‑Related Terminology
Bitrate: The amount of data transmitted per unit time, usually expressed in kbps; higher bitrate yields higher fidelity but larger file size.
Frame: The smallest unit of video data, a single still image; a series of frames creates motion.
Frame Rate: Number of frames displayed per second; 30 fps is acceptable, 60 fps feels smoother, but gains diminish beyond ~75 fps.
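As a rough illustration of the bitrate and frame rate definitions above, the following sketch estimates how much data a stream produces and the average data budget per frame. The function names and the sample numbers (1500 kbps, 30 fps, 10 minutes) are illustrative assumptions, not values from Alibaba Cloud.

```python
def stream_size_megabytes(bitrate_kbps: float, duration_seconds: float) -> float:
    """Approximate payload size: bitrate (kilobits/s) times duration, in megabytes."""
    total_kilobits = bitrate_kbps * duration_seconds
    return total_kilobits / 8 / 1000  # kilobits -> kilobytes -> megabytes (decimal units)


def average_frame_kilobytes(bitrate_kbps: float, frames_per_second: float) -> float:
    """Average data budget per frame; real key frames are much larger than P/B-frames."""
    return bitrate_kbps / 8 / frames_per_second


if __name__ == "__main__":
    # Hypothetical 1500 kbps stream watched for 10 minutes at 30 fps.
    print(stream_size_megabytes(1500, 600))   # 112.5
    print(average_frame_kilobytes(1500, 30))  # 6.25
```

The estimate ignores container and protocol overhead, so actual traffic will be somewhat higher.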
Audio frames can be decoded independently. Video frames are either key frames (I-frames), which can be decoded alone, or non-key frames, which depend on other frames: P-frames reference preceding frames, and B-frames reference both preceding and following frames.
A live streaming server caches only the most recent frames; when a new key frame arrives, older frames are discarded so that viewers receive the latest content.
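The caching behavior described above can be sketched as a small cache that restarts whenever a key frame arrives. This is a minimal sketch under stated assumptions: the `Frame` and `GopCache` names are hypothetical, and a production server would also handle audio, sequence headers, and size limits.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Frame:
    timestamp_ms: int
    is_key: bool          # True for I-frames, False for P/B-frames
    payload: bytes = b""


@dataclass
class GopCache:
    """Keeps only the frames from the most recent key frame onward."""
    frames: List[Frame] = field(default_factory=list)

    def push(self, frame: Frame) -> None:
        if frame.is_key:
            # A new key frame starts a new group of pictures; older frames are discarded.
            self.frames.clear()
        elif not self.frames:
            # Without a preceding key frame this frame cannot be decoded, so skip it.
            return
        self.frames.append(frame)

    def snapshot(self) -> List[Frame]:
        """Frames a newly joined viewer would receive first."""
        return list(self.frames)


if __name__ == "__main__":
    cache = GopCache()
    for ts, key in [(0, True), (40, False), (80, False), (120, True), (160, False)]:
        cache.push(Frame(ts, key))
    print([f.timestamp_ms for f in cache.snapshot()])  # [120, 160]
```

Starting a new viewer from the cached key frame lets the player decode immediately instead of waiting for the next key frame.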
CDN acceleration covers three services: file acceleration, video‑on‑demand (VOD), and live streaming. Alibaba Cloud started with file acceleration, added VOD later, and began supporting live streaming in late 2015.
2. Live Streaming Overview
Typical live streaming formats include mobile live (e.g., Mobile Taobao, Momo, Inke) and game live (e.g., Douyu, Quanmin TV). From the client perspective, live and VOD both fetch video data from servers, but live cannot be paused or rewound.
A live streaming pipeline consists of capture, pre-processing, encoding, push and transcoding, distribution, and client playback, as shown in the simplified architecture diagram.
1. Capture: Video is captured from various devices (iOS, Android, PC/OBS).
2. Pre‑processing: Includes beautification, watermarking, blur effects, etc.
3. Encoding: Balances hardware compatibility, bitrate, and quality; iOS often uses hardware encoding, while Android mainly uses software encoding.
4. Push & Transcode: Streams are pushed to the server, transcoded, and repackaged for delivery over protocols and formats such as RTMP, HLS, and HTTP-FLV.
5. Distribution: CDN delivers streams to millions of concurrent viewers.
6. Client Playback: Decoding and rendering on iOS/Android/HTML5 clients, addressing low‑latency and fast‑start challenges.
3. Live Streaming Architecture
The end-to-end solution includes push-stream endpoints, a live-stream center for storage and transcoding, intelligent CDN scheduling, and client playback.
Push endpoints use RTMP; playback supports RTMP, HTTP-FLV, and HLS on clients such as Flash, VLC, HTML5, and mobile apps.
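To make the protocol list concrete, the sketch below builds one push address and three playback addresses for the same stream. It assumes the common `domain/AppName/StreamName` layout, with `.flv` and `.m3u8` suffixes for HTTP-FLV and HLS; the function name, domains, app name, and stream name are all hypothetical, and the exact URL format should be checked against the provider's documentation.

```python
from typing import Dict


def build_live_urls(push_domain: str, play_domain: str, app: str, stream: str) -> Dict[str, str]:
    """Build push and playback URLs for one stream (assumed URL layout)."""
    return {
        "push_rtmp": f"rtmp://{push_domain}/{app}/{stream}",
        "play_rtmp": f"rtmp://{play_domain}/{app}/{stream}",
        "play_flv": f"http://{play_domain}/{app}/{stream}.flv",    # HTTP-FLV
        "play_hls": f"http://{play_domain}/{app}/{stream}.m3u8",   # HLS playlist
    }


if __name__ == "__main__":
    # Placeholder domains and names, for illustration only.
    urls = build_live_urls("push.example.com", "play.example.com", "live", "room1")
    for name, url in urls.items():
        print(name, url)
```

RTMP and HTTP-FLV generally give lower latency, while HLS trades latency for broad compatibility with HTML5 and mobile players.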
The live‑stream center provides stable upstream, interactive features (IM, co‑hosting), and rich services.
CDN offers over 700 domestic and 300 overseas nodes for smooth delivery.
Client optimizations include fast first-frame display (instant start) and frame skipping during playback on weak networks.
The system follows a publish-subscribe model: a streamer publishes a stream, and multiple viewers subscribe to it via the CDN.
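The publish-subscribe relationship can be sketched as an in-memory hub that fans each published chunk out to every subscriber of that stream. This is only a toy model of the idea: `StreamHub` and its methods are hypothetical names, and a real CDN distributes streams across a hierarchy of network nodes rather than through callbacks in one process.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Subscriber = Callable[[bytes], None]


class StreamHub:
    """Toy publish-subscribe hub: one publisher per stream name, many subscribers."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Subscriber]] = defaultdict(list)

    def subscribe(self, stream_name: str, callback: Subscriber) -> None:
        self._subscribers[stream_name].append(callback)

    def publish(self, stream_name: str, chunk: bytes) -> int:
        """Deliver one chunk to every subscriber of the stream; return the fan-out count."""
        subscribers = self._subscribers.get(stream_name, [])
        for callback in subscribers:
            callback(chunk)
        return len(subscribers)


if __name__ == "__main__":
    hub = StreamHub()
    viewer_a, viewer_b = [], []
    hub.subscribe("room1", viewer_a.append)
    hub.subscribe("room1", viewer_b.append)
    print(hub.publish("room1", b"frame-1"))  # 2
```

The streamer uploads each chunk once, and the cost of reaching additional viewers is carried by the distribution layer rather than by the streamer's uplink.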
4. Business Functions and Scenarios
Transcoding is the core function: it changes bitrate and quality, adds watermarks, applies dynamic templates, and supports delayed transcoding. Periodic screenshots are used for channel thumbnails. Additional features include dynamic configuration, recording, stream start/stop callbacks, stream authentication, anti-hotlinking, blacklisting, muting, various APIs, stream relaying, pull-stream triggers, co-hosting, and audio-only or video-only playback.
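Stream authentication and anti-hotlinking are commonly built on signed URLs that expire. The sketch below shows one generic way to do this with an HMAC-SHA256 signature over the stream path and an expiry time. It is not Alibaba Cloud's actual signing scheme: the function names, the `expires` and `sign` query parameters, and the message layout are all assumptions chosen for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional


def sign_stream_path(path: str, secret: bytes, expires_at: int) -> str:
    """Return the path with a hypothetical expiry and signature appended."""
    message = f"{path}:{expires_at}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{path}?expires={expires_at}&sign={digest}"


def verify_stream_path(path: str, expires_at: int, sign: str, secret: bytes,
                       now: Optional[int] = None) -> bool:
    """Reject expired links and links whose signature does not match."""
    current = int(time.time()) if now is None else now
    if current > expires_at:
        return False
    message = f"{path}:{expires_at}".encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, sign)  # constant-time comparison


if __name__ == "__main__":
    secret = b"demo-secret"  # placeholder key, never hard-code real keys
    expires = int(time.time()) + 300
    url = sign_stream_path("/live/room1", secret, expires)
    signature = url.split("sign=")[1]
    print(verify_stream_path("/live/room1", expires, signature, secret))  # True
```

Because the signature covers both the path and the expiry, a copied link stops working after the deadline and cannot be reused for a different stream.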
Monitoring provides real-time metrics such as bitrate, traffic, online viewer count, and frame rate. Sharp spikes or dips in these curves usually indicate network jitter, and the system can drop frames to keep playback smooth.
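One simple way to spot the kind of jitter described above is to compare each new sample with the average of the samples just before it. The sketch below does this for a series of frame-rate or bitrate readings; the function name, the window size, and the 30% tolerance are arbitrary assumptions, not thresholds used by Alibaba Cloud.

```python
from statistics import mean
from typing import List


def find_jitter_points(samples: List[float], window: int = 5,
                       tolerance: float = 0.3) -> List[int]:
    """Return indexes whose value deviates from the mean of the
    preceding `window` samples by more than `tolerance` (as a fraction)."""
    flagged = []
    for i in range(window, len(samples)):
        baseline = mean(samples[i - window:i])
        if baseline > 0 and abs(samples[i] - baseline) / baseline > tolerance:
            flagged.append(i)
    return flagged


if __name__ == "__main__":
    # Hypothetical per-second frame-rate readings with one dip.
    frame_rates = [30, 30, 30, 30, 30, 12, 30, 30]
    print(find_jitter_points(frame_rates))  # [5]
```

A monitoring system could use such flags to raise alerts or to trigger frame dropping on the affected stream.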
Typical application scenarios:
UGC interactive live (e.g., Inke, Yizhibo).
E‑commerce live (e.g., Taobao Live).
Sports events / large‑scale variety shows (e.g., CCTV5).
Game live streaming (e.g., Quanmin, Panda).
Online education / financial live (e.g., Yicai, Zhitu Education).
In summary, Alibaba Cloud's live streaming CDN provides a complete solution covering capture, processing, distribution, and playback, and is used by more than half of live video and VOD platforms.
Architects' Tech Alliance
Sharing project experiences, insights into cutting-edge architectures, focusing on cloud computing, microservices, big data, hyper-convergence, storage, data protection, artificial intelligence, industry practices and solutions.