How We Scaled a Live‑Stream Danmu System from PHP to Go for 50K+ Concurrent Users
Facing massive memory usage and latency in a PHP‑based live‑stream bullet‑chat (danmu) system, we iteratively re‑engineered it by splitting Redis, rate‑limiting broadcasts, and sharding rooms, then rebuilt it in Go with distributed room management, concurrent broadcasting, and extensive testing. The result runs stably with tens of thousands of concurrent connections.
Early Danmu System
In 2016, when live streaming surged, our company began optimizing the bullet‑chat (danmu) system. The initial version was built with PHP and a Gateway framework, stored all client IDs in Redis, and ran on three machines behind an LVS load balancer using multiple worker processes.
Basic Situation
Implemented in PHP + Gateway.
Client IDs stored in Redis.
Three machines behind LVS provided the service.
Multi‑process workers handled message delivery.
Problems
Huge memory consumption; a 4‑core, 8 GB machine hit its limit at roughly 500 clients.
Each message required fetching all client IDs for the room from Redis, making Redis and internal bandwidth bottlenecks under high concurrency.
Worker‑process count limited per‑machine concurrency, and excess workers wasted resources.
When a room exceeded 2,000 users, delivery latency could reach about one minute.
Temporary Fixes
Split Redis into a dual‑node, four‑instance setup to disperse load.
Limited the number of broadcast messages per time unit; excess messages were dropped.
For rooms that switched from live to on‑demand, a separate danmu system was used for load shedding.
Sharded a single room into multiple sub‑rooms for message processing.
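The last of these fixes, splitting a hot room into sub-rooms, amounts to deterministic sharding of client IDs. A minimal sketch in Go (the language the system was later rebuilt in); `shardFor` and the FNV hash are illustrative choices, not the original PHP implementation:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor deterministically maps a client ID to one of n sub-rooms,
// so a hot room's members can be split across smaller broadcast groups.
func shardFor(clientID string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(clientID))
	return h.Sum32() % n
}

func main() {
	// Split a 2,000-user room into 4 sub-rooms of roughly 500 users each.
	counts := make([]int, 4)
	for i := 0; i < 2000; i++ {
		counts[shardFor(fmt.Sprintf("client-%d", i), 4)]++
	}
	fmt.Println(counts)
}
```

Because the mapping is a pure function of the client ID, every server agrees on which sub-room a client belongs to without coordination.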
Effect After Temporary Fixes
Redis pressure reduced dramatically.
Single‑machine I/O pressure lowered.
Same hardware could support more live rooms.
However, the fundamental issues remained, prompting a full redesign.
New Danmu System
Challenges
A single room may host 50K–100K concurrent users.
Sudden traffic spikes during popular streams.
Strict real‑time delivery requirements; high latency degrades interaction.
Each message must be delivered over many long‑lived connections.
Efficient management of massive numbers of long‑lived connections.
Support for user/IP blacklists and sensitive‑word filtering.
Requirements
Choose a language with better memory handling for long‑running high‑concurrency services (Go).
Distributed architecture to horizontally scale to tens of thousands of users per room.
Easy integration of third‑party messages (gifts, system notices).
Prefer in‑memory management for client connections, minimizing database interactions.
Concurrent broadcast support to improve efficiency.
Refactor Approach
Adopt Go as the development language for its strong concurrency support.
Each server manages only the connections it receives.
Implement concurrent room‑wide broadcasting.
Room Management (Code)
```go
type RoomInfo struct {
	RoomID         string
	Lock           *sync.Mutex // room operation lock
	Rows           []*RowList  // slice of rows in the room
	Length         uint64      // total node count in the room
	LastChangeTime time.Time   // last update time
}

type RowList struct {
	Nodes []*Node // list of nodes
}
```

Each client connection is wrapped into a `Node` and placed into a `RowList` belonging to its room.
```go
type Node struct {
	RoomID       string
	ClientID     int64
	Conn         *websocket.Conn
	UpdateTime   time.Time
	LastSendTime time.Time // last message send time
	IsAlive      bool      // connection health flag
	DisabledRead bool      // whether speaking permission is disabled
}
```

Nodes are grouped into slices; each slice is processed by its own goroutine, which sends to its nodes sequentially, while the room lock protects the structure during concurrent operations.
Message Management (Code)
```go
var (
	messageChannel map[string]chan nodeMessage
	channelLock    sync.Mutex // guards messageChannel against concurrent access
)

func init() {
	messageChannel = make(map[string]chan nodeMessage)
}

func sendMessageToChannel(roomId string, nm nodeMessage) error {
	channelLock.Lock()
	c, ok := messageChannel[roomId]
	if !ok {
		// First message for this room: create its channel and room object,
		// then start the per-room daemon goroutines.
		c = make(chan nodeMessage, 1024)
		messageChannel[roomId] = c

		roomObj := &RoomInfo{
			RoomID: roomId,
			Rows:   make([]*RowList, 0, 4),
			Lock:   &sync.Mutex{},
		}
		go daemonReciver(c, roomObj) // broadcast goroutine
		go timerForClean(c)          // cleanup goroutine
		if roomId == "" {            // empty ID denotes the top-level hall
			go CleanHall(roomObj)
		}
	}
	channelLock.Unlock()
	c <- nm // send outside the lock so a full channel cannot stall other rooms
	return nil
}
```

Each room has its own message channel, stored in `messageChannel`. Incoming messages are pushed onto the channel, and dedicated goroutines handle broadcasting and cleanup.
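A minimal sketch of such a receiver goroutine, assuming a trimmed-down `nodeMessage` with just a payload (the real struct in roomManager carries more fields) and an injected `deliver` callback in place of the room broadcast:

```go
package main

import "fmt"

type nodeMessage struct{ Body string }

// daemonReceiver drains a room's channel; in the real system each
// message would be filtered and then broadcast to the room's RowLists.
func daemonReceiver(ch <-chan nodeMessage, deliver func(nodeMessage)) {
	for nm := range ch { // exits when the room's channel is closed
		deliver(nm)
	}
}

func main() {
	ch := make(chan nodeMessage, 1024)
	done := make(chan struct{})
	var got []string
	go func() {
		daemonReceiver(ch, func(nm nodeMessage) { got = append(got, nm.Body) })
		close(done)
	}()
	ch <- nodeMessage{Body: "hi"}
	ch <- nodeMessage{Body: "bye"}
	close(ch)
	<-done
	fmt.Println(len(got)) // 2
}
```

The buffered channel absorbs traffic spikes while the single receiver keeps per-room message ordering.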
Server Management
A top‑level chatroom is created; all servers connect to it, and messages received by any server are broadcast to the others through this shared room.
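One way to picture this hall-based relay, with servers simulated as in-process channels (`hall`, `relayMsg`, and `publish` are illustrative names, not the repository's API):

```go
package main

import "fmt"

// relayMsg is what a server forwards through the hall: the target
// room plus the payload to re-broadcast locally.
type relayMsg struct {
	RoomID string
	Body   string
}

// hall stands in for the top-level chatroom every server connects to;
// here each server's connection is modeled as a buffered channel.
type hall struct{ servers []chan relayMsg }

func (h *hall) publish(m relayMsg) {
	for _, s := range h.servers {
		s <- m // each server then broadcasts to its local room nodes
	}
}

func main() {
	h := &hall{}
	inboxA := make(chan relayMsg, 1)
	inboxB := make(chan relayMsg, 1)
	h.servers = []chan relayMsg{inboxA, inboxB}

	// Server A receives a danmu locally and publishes it to the hall;
	// both servers (including A) see it and deliver to room 42.
	h.publish(relayMsg{RoomID: "42", Body: "gg"})
	fmt.Println((<-inboxA).Body, (<-inboxB).Body) // gg gg
}
```

Because each server only manages its own connections, the hall is the single point where cross-server fan-out happens.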
Daemon Goroutine Management
Message‑sending goroutine: pulls messages from the channel and concurrently sends them to all `RowList` instances in the room.
Room‑cleanup goroutine: periodically removes dead nodes and reorganizes room structures to improve efficiency.
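The cleanup pass can be sketched as a filter over each row's nodes, reusing the `IsAlive` flag from the `Node` struct above; `cleanRows` is an illustrative name, and the ticker that drives it periodically is omitted:

```go
package main

import "fmt"

type Node struct{ IsAlive bool }

type RowList struct{ Nodes []*Node }

// cleanRows drops dead nodes and compacts each row in place, so the
// broadcast goroutines stop wasting time on closed connections.
func cleanRows(rows []*RowList) (kept int) {
	for _, r := range rows {
		alive := r.Nodes[:0] // filter in place, reusing the backing array
		for _, n := range r.Nodes {
			if n.IsAlive {
				alive = append(alive, n)
			}
		}
		r.Nodes = alive
		kept += len(alive)
	}
	return kept
}

func main() {
	row := &RowList{Nodes: []*Node{{IsAlive: true}, {IsAlive: false}, {IsAlive: true}}}
	fmt.Println(cleanRows([]*RowList{row})) // 2
}
```

In the real system this runs under the room lock, since the broadcast goroutines iterate the same rows concurrently.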
Testing
Environment: Cloud VM, 8 CPU / 16 GB.
OS: CentOS 7 (no special tuning).
Test: 15 000 WebSocket connections in a single room, each sending a message that passes blacklist and sensitive‑word filters, then broadcast.
CPU usage: < 5%.
Memory usage: 2 GB (including OS).
Network: peak ~10 Mb/s.
Broadcast latency for 15 000 nodes: 100‑110 ms.
Result: an 8‑core / 16 GB machine can comfortably handle 50K concurrent connections, with peak capacity near 60K–70K.
More Sharing
The core implementation has been open‑sourced at https://github.com/logan-go/roomManager for interested readers.
UCloud Tech
UCloud is a leading neutral cloud provider in China, developing its own IaaS, PaaS, AI service platform, and big data exchange platform, and delivering comprehensive industry solutions for public, private, hybrid, and dedicated clouds.