
Design and Architecture of a Scalable Live‑Streaming Message Service

This article outlines the challenges of real-time messaging in live-streaming education and presents a backend architecture that evolved through three versions, built around AccessServer, MessageServer, and several specialized services. It also covers caching, clustering, and planned enhancements such as connection migration and QUIC, all aimed at high reliability, low latency, and massive concurrency.

TAL Education Technology

Jul 15, 2021


In the era of internet services, instant messaging has become essential for products like WeChat, DingTalk, and QQ, and is also critical in live‑streaming classrooms where interactive features such as quizzes, doodles, and likes demand high reliability and immediacy.

The main challenges identified are: users entering and leaving live rooms frequently; high QPS for message forwarding, because every message fans out to the whole room (e.g., 500 × 500 = 250,000); real-time latency constraints; a user-experience limit on how many messages can usefully appear on screen; storage of historical messages for replay; and preserving message order across users and rooms.
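The forwarding load can be made concrete with a small calculation. This is only a sketch: it reads the 500 × 500 example as 500 messages per second in one room, each forwarded to 500 users, and the function name `fanout_qps` is hypothetical.

```python
def fanout_qps(messages_per_second: int, room_size: int) -> int:
    """Deliveries per second when every message is forwarded to every user in the room."""
    return messages_per_second * room_size


# Assumed reading of the example: 500 messages/s in a room of 500 users.
print(fanout_qps(500, 500))  # 250000
```

Because the load is the product of send rate and room size, it grows much faster than the number of users, which is why the later design steps focus on limiting and routing this traffic.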

To address these, the system assigns priority levels to messages, stores historical data in Pika using a read-diffusion (fan-out-on-read) model, and uses consistent hashing together with Kafka-like queues to preserve message order while keeping latency low.
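One way priority levels can work is to shed low-priority traffic when a room exceeds its per-window message budget. The sketch below is an illustration under assumptions, not the service's actual policy: the three levels, the `Message` fields, and the `shed` function are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    # Hypothetical levels; lower value means more important.
    HIGH = 0    # e.g. quiz or control messages
    NORMAL = 1  # e.g. chat text
    LOW = 2     # e.g. likes


@dataclass(frozen=True)
class Message:
    seq: int            # arrival order within the room
    priority: Priority
    body: str


def shed(window: list[Message], budget: int) -> list[Message]:
    """Keep at most `budget` messages from one time window.

    Higher-priority messages are kept first, ties are broken by arrival
    order, and the survivors are returned in arrival order.
    """
    ranked = sorted(window, key=lambda m: (m.priority, m.seq))
    return sorted(ranked[:budget], key=lambda m: m.seq)


window = [
    Message(1, Priority.LOW, "like"),
    Message(2, Priority.NORMAL, "hi"),
    Message(3, Priority.HIGH, "quiz"),
    Message(4, Priority.LOW, "like"),
    Message(5, Priority.NORMAL, "ok"),
]
print([m.body for m in shed(window, 3)])  # ['hi', 'quiz', 'ok']
```

Returning the survivors in arrival order keeps the shedding step compatible with the ordering requirement: messages may be dropped, but those delivered are never reordered.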

The architecture evolves through three versions:

Architecture 1.0: Consists of AccessServer (handling TCP connections, async I/O, and user-room mapping) and MessageServer (interacting with Redis and Pika, processing login, room entry/exit, and message routing). Consistent hashing directs a room's traffic to a specific MessageServer.
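The room-to-MessageServer routing in Architecture 1.0 can be sketched with a textbook consistent hash ring. The article does not describe the actual hash function or ring layout, so the MD5 hash, the virtual-node count, the `HashRing` class, and the server names below are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, room_id: str) -> str:
        # First ring point clockwise from the room's hash, wrapping around.
        idx = bisect.bisect_right(self._keys, self._hash(room_id))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["ms-1", "ms-2", "ms-3"])  # hypothetical server names
print(ring.node_for("room-42") == ring.node_for("room-42"))  # True
```

Since every message for a room lands on the same server, that server can order the room's messages locally. When a server is removed, only the rooms it owned move to other servers; the rest keep their assignment.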

Architecture 2.0: Splits MessageServer into three services: MessageServer (room logic), BinMsgServer (doodle handling), and PeerMsgServer (one-to-one chat). Caching strategies are refined, and cache synchronization is optimized to reduce unnecessary RPC calls.

Architecture 3.0: Introduces TcpProxyServer as a layer-7 proxy supporting multiple business sessions (chat, IM, push, etc.) over a single TCP connection, enabling dynamic routing policies and reducing client resource consumption.
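Carrying several business sessions over one TCP connection requires each frame to say which business and session it belongs to. The article does not give TcpProxyServer's wire format, so the 9-byte header below (business type, session ID, payload length) and all names are hypothetical; the sketch only shows the general multiplexing idea.

```python
import struct

# Hypothetical header: 1-byte business type, 4-byte session id,
# 4-byte payload length, all in network byte order.
HEADER = struct.Struct("!BII")

BIZ_CHAT, BIZ_IM, BIZ_PUSH = 1, 2, 3  # assumed business type codes


def encode_frame(biz: int, session_id: int, payload: bytes) -> bytes:
    """Prefix the payload with the multiplexing header."""
    return HEADER.pack(biz, session_id, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split a receive buffer into complete frames.

    Returns (frames, remainder), where remainder holds the bytes of a
    trailing incomplete frame that must wait for more data.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        biz, session_id, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # payload not fully received yet
        frames.append((biz, session_id, buffer[offset + HEADER.size:end]))
        offset = end
    return frames, buffer[offset:]


stream = encode_frame(BIZ_CHAT, 7, b"hello") + encode_frame(BIZ_PUSH, 9, b"notice")
frames, rest = decode_frames(stream)
print(frames)  # [(1, 7, b'hello'), (3, 9, b'notice')]
```

With the business type in every frame, a proxy can route chat, IM, and push traffic to different backends by table lookup while the client keeps a single socket open.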

Additional components include a DispatchServer for multi‑cluster IP/port allocation, secondary caches in both AccessServer and MessageServer to alleviate Redis pressure, and a cluster management approach that isolates workloads per business line.
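A secondary cache typically means a short-lived in-process copy consulted before going to Redis. The sketch below shows that pattern under assumptions: the `LocalCache` class, the TTL value, and the `load_from_redis` stub (a plain function standing in for a real Redis lookup) are hypothetical, and the article does not say how the real caches are invalidated.

```python
import time


class LocalCache:
    """In-process TTL cache in front of a slower loader (illustrative only)."""

    def __init__(self, loader, ttl_seconds: float, clock=time.monotonic):
        self._loader = loader      # called on a miss, e.g. a Redis lookup
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]          # fresh local copy, no remote call
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value


remote_calls = []


def load_from_redis(user_id):
    # Stand-in for a real Redis query; records each call it receives.
    remote_calls.append(user_id)
    return f"room-of-{user_id}"


cache = LocalCache(load_from_redis, ttl_seconds=5.0)
cache.get("u1")
cache.get("u1")
print(len(remote_calls))  # 1
```

The trade-off is staleness: a value may be up to one TTL out of date, so this pattern fits data such as user-room mappings only if brief inconsistency is acceptable or entries are also invalidated on change.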

Future plans focus on connection migration to seamlessly recover from AccessServer overload or restart, and the adoption of QUIC (UDP‑based) to improve latency in weak‑network environments.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.


Written by

TAL Education Technology

TAL Education is a technology-driven education company committed to the mission of 'making education better through love and technology'. The TAL technology team has always been dedicated to educational technology research and innovation. This is the external platform of the TAL technology team, sharing weekly curated technical articles and recruitment information.
