Backend Development · 21 min read

How Baidu’s Unified Long‑Connection Service Scales Millions of Real‑Time Connections

This article details Baidu’s internally built unified long‑connection service in Go, covering its motivation, architecture, functional implementation, performance optimizations, multi‑business support, deployment strategy, and lessons learned for delivering secure, high‑concurrency, low‑latency real‑time connectivity across mobile applications.


Introduction

In the mobile‑Internet era, real‑time and interactive services require long‑connection capabilities. This article introduces Baidu’s internally built unified long‑connection service implemented in Go, describing its design, functional implementation and performance optimizations.

Abstract

Long‑connection services keep a persistent bi‑directional channel between client and server, enabling server‑initiated push. Maintaining low latency, high concurrency and high stability is challenging, and the cost multiplies when each business line builds and operates its own long‑connection service.

Unified Long‑Connection Service Goals

Provide a secure, high‑concurrency, low‑latency, easy‑to‑integrate, low‑cost long‑connection capability for Baidu’s internal apps (live streaming, messaging, push, cloud control, etc.).

Support multi‑business reuse of a single connection.

Offer clear access procedures and external interfaces.

Functional Implementation

Boundary and Requirements

The service must separate its responsibilities from business logic while satisfying diverse business scenarios such as messaging, live‑streaming and push.

Key requirements include connection establishment/maintenance, upstream request forwarding, and downstream data push.

Supported Scenarios

Messaging: unicast and batch‑unicast for private messages and limited‑size groups.

Live streaming: multicast to millions of viewers.

Push: batch‑unicast to a fixed audience.

Architecture Overview

The system consists of four layers: Unified Long‑Connection SDK (client side), Control Layer, Access Layer, and Routing Layer.

SDK

Obtain token, access point and protocol from the control layer.

Establish and maintain the connection, trigger reconnection on failure.

Forward business SDK requests to the service.

Receive data from the service and deliver it to the business SDK.

Control Layer

Generate and verify device tokens.

Distribute appropriate access points based on client attributes.

Apply small‑flow control policies.
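Token generation and verification could follow an HMAC scheme like the sketch below. The scheme and all names are assumptions for illustration (a production token would also embed an app ID and expiry):

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// issueToken derives a device token from a device ID and a server-side
// secret; only the control layer knows the secret.
func issueToken(deviceID string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(deviceID))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyToken recomputes the MAC and compares in constant time.
func verifyToken(deviceID, token string, secret []byte) bool {
	return hmac.Equal([]byte(issueToken(deviceID, secret)), []byte(token))
}

func main() {
	secret := []byte("server-secret")
	t := issueToken("device-42", secret)
	fmt.Println(verifyToken("device-42", t, secret)) // true
	fmt.Println(verifyToken("device-43", t, secret)) // false
}
```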

Access Layer

Manage connections, connection IDs, and groups.

Forward upstream requests to business back‑ends and write back responses.

Handle downstream push to the appropriate SDK.
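A minimal sketch of the access layer's bookkeeping, assuming an in-memory registry of connection IDs and group memberships (the live-streaming multicast case). All type and method names here are illustrative, not Baidu's actual types:

```go
package main

import (
	"fmt"
	"sync"
)

// Registry tracks live connections and their group memberships.
type Registry struct {
	mu     sync.RWMutex
	conns  map[string]chan []byte         // connection ID -> outbound queue
	groups map[string]map[string]struct{} // group ID -> set of connection IDs
}

func NewRegistry() *Registry {
	return &Registry{
		conns:  make(map[string]chan []byte),
		groups: make(map[string]map[string]struct{}),
	}
}

// Add registers a connection and returns its outbound queue.
func (r *Registry) Add(connID string) chan []byte {
	r.mu.Lock()
	defer r.mu.Unlock()
	ch := make(chan []byte, 16)
	r.conns[connID] = ch
	return ch
}

// Join adds a connection to a group (e.g. a live-stream room).
func (r *Registry) Join(group, connID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.groups[group] == nil {
		r.groups[group] = make(map[string]struct{})
	}
	r.groups[group][connID] = struct{}{}
}

// Multicast pushes one payload to every member of a group; full queues
// are skipped rather than blocked on, so one slow viewer cannot stall
// the whole room.
func (r *Registry) Multicast(group string, payload []byte) int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	n := 0
	for connID := range r.groups[group] {
		select {
		case r.conns[connID] <- payload:
			n++
		default: // slow consumer: drop instead of stalling
		}
	}
	return n
}

func main() {
	r := NewRegistry()
	r.Add("c1")
	r.Add("c2")
	r.Join("room-1", "c1")
	r.Join("room-1", "c2")
	fmt.Println(r.Multicast("room-1", []byte("hi"))) // 2
}
```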

Routing Layer

Maintains mapping between device identifiers and connection identifiers for push routing.

Core Process

Connection establishment: SDK obtains token and protocol, then connects to the access layer.

Connection maintenance: periodic heartbeat.

Upstream request: business SDK sends request, access layer forwards to business server.

Downstream push: server pushes via routing layer, access layer writes to the connection, SDK delivers to business SDK.

Performance Optimizations

Multi‑Protocol Support

The connection layer abstracts the underlying transports (TCP/TLS, QUIC, WebSocket, etc.), while the session layer handles business logic; because business code depends only on the session layer, protocols can be added or upgraded seamlessly.
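The split can be expressed as a transport-neutral interface that each concrete protocol implements. This `Conn` interface is a sketch of the idea, not Baidu's actual API; the loopback transport exists only to show that session code never sees the protocol:

```go
package main

import "fmt"

// Conn is the transport-neutral surface the session layer talks to;
// TCP/TLS, QUIC and WebSocket transports would each implement it.
type Conn interface {
	Send(payload []byte) error
	Recv() ([]byte, error)
	Close() error
}

// loopbackConn is a toy in-memory transport for demonstration.
type loopbackConn struct{ buf chan []byte }

func newLoopback() *loopbackConn { return &loopbackConn{buf: make(chan []byte, 1)} }

func (c *loopbackConn) Send(p []byte) error    { c.buf <- p; return nil }
func (c *loopbackConn) Recv() ([]byte, error)  { return <-c.buf, nil }
func (c *loopbackConn) Close() error           { close(c.buf); return nil }

// echo is "session layer" logic: it never learns which protocol backs
// c, so transports can be swapped without touching it.
func echo(c Conn) ([]byte, error) {
	if err := c.Send([]byte("hello")); err != nil {
		return nil, err
	}
	return c.Recv()
}

func main() {
	var c Conn = newLoopback()
	msg, _ := echo(c)
	fmt.Println(string(msg)) // hello
}
```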

Request‑Forwarding and Downstream Task Groups

Separate goroutine pools sized to each business's QPS prevent one business from head-of-line blocking the others; reusing a fixed set of workers also reduces GC pressure.

Deployment

Access points deployed in East, North and South China, plus Hong Kong for overseas.

Clustered deployment with domain‑based traffic splitting.

Per-instance connection caps (100k–200k) bound the blast radius of a single instance failure and improve stability.
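Such a cap can be enforced with a simple semaphore at accept time. This guard is a sketch with a tiny capacity for illustration (a real limit would be the 100k–200k above):

```go
package main

import "fmt"

// capGuard enforces a per-instance connection ceiling.
type capGuard struct {
	sem chan struct{}
}

func newCapGuard(max int) *capGuard {
	return &capGuard{sem: make(chan struct{}, max)}
}

// tryAcquire admits a new connection only while capacity remains;
// the accept loop would reject or redirect when it returns false.
func (g *capGuard) tryAcquire() bool {
	select {
	case g.sem <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot when a connection closes.
func (g *capGuard) release() { <-g.sem }

func main() {
	g := newCapGuard(2)
	fmt.Println(g.tryAcquire(), g.tryAcquire(), g.tryAcquire()) // true true false
	g.release()
	fmt.Println(g.tryAcquire()) // true
}
```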

Business Integration

Typical steps: evaluate required capabilities, estimate user scale, integrate SDK on client, adapt server interfaces, and request resources.

Summary and Future Plans

The service now supports tens of millions of concurrent connections, million‑level upstream QPS and high‑throughput downstream pushes. Lessons learned emphasize clear requirement boundaries, simple yet robust design, and balanced performance‑vs‑operability trade‑offs. Future work focuses on finer‑grained network metrics, intelligent client‑side adaptation, and broader scenario coverage.

Tags: real-time messaging, backend architecture, golang, high concurrency, long connection
Written by Architecture & Thinking

🍭 Frontline tech director and chief architect at top-tier companies 🥝 Years of deep experience in internet, e‑commerce, social, and finance sectors 🌾 Committed to publishing high‑quality articles covering core technologies of leading internet firms, application architecture, and AI breakthroughs.
