Xiaomi Push Service: Architecture, Features, and Performance Insights
The article details Xiaomi's push notification service architecture, covering its protocol stack, server and client components, scalability strategies, security measures, performance metrics, and lessons learned from major refactorings, illustrating how the system handled massive traffic during the 11.11 promotion.
During the 11.11 promotion, Xiaomi's push service sent 965 million messages, averaging 670,000 per minute, and the backend operated smoothly without any congestion.
We interviewed Wang Xuanran, Xiaomi's Project Director, about the service's architecture, characteristics, and performance.
Basic Technical Architecture
The core of the push service is its protocol. Xiaomi Push uses a protocol evolved from the earlier MiTalk system, which originally adopted XMPP. After several rounds of simplification and refactoring, XMPP now serves only as a data transport layer, while various independent business channels run on top. The push channel transmits data using a binary protocol based on Thrift.
The server-side architecture consists of several layers:
XMPP Front‑end: Maintains long-lived connections with clients, handling XMPP requests through the open-source ejabberd server and processing Xiaomi‑specific XMPP messages through the XMQ module.
Middle Layer: The business logic layer. It queues incoming message requests asynchronously, creates and maintains the message queues, and handles client commands such as registration, alias setting, and topic subscription.
HTTP Front‑end: Exposes HTTPS APIs for third‑party apps to send messages and for clients to create accounts.
Data storage uses Xiaomi’s unified HBase store, with MySQL for smaller datasets requiring complex filtering (e.g., topics). A Redis cache sits between the services and HBase to reduce pressure on the latter.
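The role of the cache in front of the primary store can be illustrated with a minimal read-through sketch. The class name `CachedStore`, its methods, and the use of plain dictionaries in place of Redis and HBase are assumptions made for illustration only; the article does not describe the actual cache policy.

```python
class CachedStore:
    """Read-through cache in front of a slower primary store (illustrative only)."""

    def __init__(self, primary):
        self.primary = primary   # dict standing in for HBase
        self.cache = {}          # dict standing in for Redis
        self.primary_reads = 0   # counts how often the primary store is hit

    def get(self, key):
        # Serve from the cache when possible.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: read the primary store and remember the result.
        self.primary_reads += 1
        value = self.primary.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def put(self, key, value):
        # Write to the primary store, then drop the stale cache entry.
        self.primary[key] = value
        self.cache.pop(key, None)


store = CachedStore(primary={"reg-1": {"alias": "user-42"}})
first = store.get("reg-1")    # miss: goes to the primary store
second = store.get("reg-1")   # hit: served from the cache
```

Only the first read reaches the primary store; repeated reads of the same key are absorbed by the cache, which is the pressure reduction the paragraph above describes.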
The client SDK has two layers: the SDK layer, which provides APIs, callbacks, and Thrift deserialization logic for app integration, and the PushService layer, which maintains the XMPP long connection and handles message transmission. The two layers communicate via Android Intents. On MIUI, the PushService layer is shared system‑wide, eliminating the need for each app to start its own service.
Feature Implementation
Xiaomi Push supports both single‑send and group‑send modes. Single messages can target a device by its regID or by an alias set by the app. The regID is generated from the device’s IMEI, Android ID, and build serial number to minimize collisions. An alias allows the app to associate its own user identifier with the regID, avoiding the need for the backend to maintain a mapping. Group messages use tags; both client and server can assign tags to devices, and sending to a tag broadcasts to all devices associated with it, with no limit on the number of devices per tag.
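The addressing model above (regID, alias, tag) can be sketched as follows. This is a hypothetical illustration: the article does not say how the three device identifiers are combined, so the SHA-256 hash, the `make_reg_id` function, and the `Registry` class are all assumptions, and the sketch maps each alias to a single device for simplicity.

```python
import hashlib
from collections import defaultdict


def make_reg_id(imei, android_id, serial):
    """Derive a stable device ID from three identifiers (assumed scheme: SHA-256)."""
    material = "\x1f".join([imei, android_id, serial]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()


class Registry:
    """Resolves aliases and tags to regIDs (hypothetical helper)."""

    def __init__(self):
        self.alias_to_reg = {}
        self.tag_to_regs = defaultdict(set)

    def set_alias(self, reg_id, alias):
        self.alias_to_reg[alias] = reg_id

    def subscribe(self, reg_id, tag):
        self.tag_to_regs[tag].add(reg_id)

    def targets_for_alias(self, alias):
        # Single-send: at most one device per alias in this sketch.
        reg_id = self.alias_to_reg.get(alias)
        return [reg_id] if reg_id else []

    def targets_for_tag(self, tag):
        # Group-send: every device associated with the tag.
        return sorted(self.tag_to_regs.get(tag, ()))


reg_a = make_reg_id("imei-a", "android-a", "serial-a")
reg_b = make_reg_id("imei-b", "android-b", "serial-b")

registry = Registry()
registry.set_alias(reg_a, "user-42")
registry.subscribe(reg_a, "sports")
registry.subscribe(reg_b, "sports")
```

Combining several identifiers means two devices collide only if all three values match, which is the collision-reduction idea the text describes.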
Stability is ensured through a multi‑datacenter deployment. Traffic is normally load‑balanced across data centers, and if one center fails, traffic seamlessly switches to the others. Currently two data centers are in operation, with a third planned.
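One simple way to picture this behavior is a selector that spreads clients across healthy data centers and skips any that are down. The function name, the CRC32-based spreading, and the data center labels below are assumptions for illustration; the article does not describe the actual routing mechanism.

```python
import zlib


def pick_datacenter(client_id, datacenters, healthy):
    """Choose a data center for a client, considering only healthy ones."""
    candidates = [dc for dc in datacenters if dc in healthy]
    if not candidates:
        raise RuntimeError("no healthy data center available")
    # Deterministic spread: the same client maps to the same healthy center.
    index = zlib.crc32(client_id.encode("utf-8")) % len(candidates)
    return candidates[index]


datacenters = ["dc-a", "dc-b"]

# Normal operation: both centers share the load.
normal = pick_datacenter("device-123", datacenters, healthy={"dc-a", "dc-b"})

# dc-a has failed: all traffic goes to dc-b.
failover = pick_datacenter("device-123", datacenters, healthy={"dc-b"})
```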
Security is a key concern. The service employs a double‑layer encryption scheme: the XMPP transport layer encrypts data in transit, preventing tampering or eavesdropping, while the Thrift binary layer encrypts the payload before it is broadcast to the app process, protecting against interception and forgery.
Performance Metrics
During the 11.11 promotion, request volume stayed within the system’s design capacity. The architecture can handle on the order of 10 million messages per minute; everyday traffic runs at 400,000 messages per minute or more, with peaks reaching 6 million per minute.
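As a quick sanity check on these figures, the arithmetic can be worked through directly. The sketch assumes the 965 million messages quoted earlier were sent over a single 24-hour day; the variable names are illustrative.

```python
MINUTES_PER_DAY = 24 * 60                  # 1,440 minutes

total_messages = 965_000_000               # messages sent during 11.11
design_capacity = 10_000_000               # messages per minute
observed_peak = 6_000_000                  # messages per minute

# Average rate over the day: about 670,139 messages per minute,
# which matches the quoted figure of roughly 670,000.
average_per_minute = total_messages / MINUTES_PER_DAY

# The daily average uses under 7% of the design capacity.
average_utilization = average_per_minute / design_capacity

# Even the observed peak leaves the capacity about 1.67x above demand.
peak_headroom = design_capacity / observed_peak
```

The gap between the average and the peak is why the design target is set by peak load rather than by the daily total.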
To cope with sudden traffic spikes (up to a 200% increase), the team employs several measures:
Asynchronous queuing, which may increase delivery latency slightly but does not destabilize the system.
Message prioritization, where broadcast messages are processed with lower priority.
Rate limiting to control the frequency of developer‑initiated messages.
Rapid scaling: additional machines can be provisioned and services deployed quickly when load becomes excessive.
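Two of the measures above, message prioritization and rate limiting, can be sketched in a few lines. This is an illustrative sketch, not Xiaomi's implementation: the class names, the two priority levels, and the choice of a token bucket as the rate-limiting algorithm are all assumptions.

```python
import heapq
import itertools

DIRECT, BROADCAST = 0, 1   # lower number is served first


class PriorityMessageQueue:
    """Queue that serves direct messages before broadcast messages."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # keeps FIFO order within a priority

    def enqueue(self, priority, message):
        heapq.heappush(self._heap, (priority, next(self._seq), message))

    def dequeue(self):
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


class TokenBucket:
    """Token-bucket limiter: `rate` tokens per second, up to `burst` stored."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


queue = PriorityMessageQueue()
queue.enqueue(BROADCAST, "promo")
queue.enqueue(DIRECT, "order shipped")
queue.enqueue(DIRECT, "login code")
served = [queue.dequeue() for _ in range(3)]

bucket = TokenBucket(rate=1, burst=2)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

The broadcast message is served last even though it was queued first, and the limiter rejects the third request in the same instant but accepts one again after a second has passed. Timestamps are passed in explicitly so the behavior is deterministic.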
Major Refactorings of Xiaomi Push
The service has undergone two significant refactorings.
First, the implementation language was switched from Erlang to Java. The original system was built in Erlang, but the ecosystem was limited, talent scarce, and tooling immature. Migrating to Java lowered the skill barrier, provided richer libraries, and greatly improved development efficiency.
Second, caching was introduced throughout the system. Apps use the service in widely varying ways, and some invoke operations such as setting an alias or subscribing to a topic very frequently. These operations now check a local cache first, and only cache misses trigger backend calls, which dramatically reduces load on the core services.
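The effect of checking a local cache before calling the backend can be shown with a small sketch. The `AliasClient` class and its callback are hypothetical names, and the article does not specify where the cache lives or how entries expire; this sketch simply remembers which alias settings have already been sent.

```python
class AliasClient:
    """Skips backend calls for alias settings that were already applied."""

    def __init__(self, backend_call):
        self._backend_call = backend_call   # function(reg_id, alias)
        self._known = set()                 # (reg_id, alias) pairs already sent

    def set_alias(self, reg_id, alias):
        key = (reg_id, alias)
        if key in self._known:
            return False                    # cache hit: nothing to send
        self._backend_call(reg_id, alias)   # cache miss: call the backend
        self._known.add(key)
        return True


backend_calls = []
client = AliasClient(lambda reg_id, alias: backend_calls.append((reg_id, alias)))

# An app that sets the same alias on every launch only reaches the backend once.
results = [client.set_alias("reg-1", "user-42") for _ in range(3)]
```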
Key Takeaways from Developing Xiaomi Push
Design services for horizontal scalability; favor statelessness or consistent hashing to simplify capacity expansion.
Prioritize monitoring to collect and analyze load, request latency, percentiles, and slow logs, enabling targeted optimizations.
Avoid premature optimization; deliver functional features quickly, then refine based on real‑world metrics.
Adopt agile practices with short daily stand‑ups to respond rapidly to change and continuously improve the system.
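The consistent hashing mentioned in the first takeaway can be sketched as a hash ring with virtual nodes. This is a generic textbook version, not Xiaomi's code: the `HashRing` class, the SHA-256 based hash, and the default of 100 virtual nodes per server are assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit point on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point in self._owner:
                continue    # extremely unlikely collision: keep existing owner
            bisect.insort(self._points, point)
            self._owner[point] = node

    def node_for(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First ring position clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"reg-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add("node-d")   # capacity expansion
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

When a node is added, only the keys that now belong to the new node change owner; every other key stays where it was, which is what makes expansion cheap compared with rehashing everything.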
Source: http://developer.51cto.com/art/201411/457169.htm