Facebook Messenger New Sync Protocol: Push‑Based Snapshot and Incremental Updates with Iris Queue
Facebook Messenger switched from a pull‑based JSON over HTTPS model to a push‑based snapshot‑incremental architecture using MQTT, Thrift, and a dual‑pointer Iris queue backed by MySQL and flash storage, cutting non‑media payload by 40% and reducing send‑failure rates by roughly 20%.
Facebook Messenger is Facebook's official instant messaging application. Much like a phone's SMS app, it lets users send free messages to Facebook friends or to contacts in the phone's address book. Since last year, Facebook has been pursuing a “mobile‑first” user experience for Messenger. The official Facebook blog recently published an article describing Messenger's new synchronization protocol and the results it achieved.
Client
In the original protocol, Messenger used a pull model to obtain data. When a new message arrived, the client first received a notification that new data was available, then sent a complex HTTPS request and received a large JSON response.
In the new protocol, Facebook switched to a push‑based snapshot‑incremental model. The client first retrieves a message snapshot and then subscribes to incremental updates. When a new message arrives, the client receives the incremental update pushed via the MQTT protocol, eliminating the need for another HTTPS request and allowing the latest message view to be displayed quickly.
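The snapshot‑plus‑increments idea can be sketched in a few lines of Python. This is only an illustration, not Messenger's actual client code: the `Inbox` and `Delta` names, the per‑user sequence number `seq`, and the "resync on gap" rule are all assumptions made for the example, and the MQTT transport is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Delta:
    seq: int          # hypothetical position of this update in the per-user sequence
    thread_id: str
    text: str


@dataclass
class Inbox:
    last_seq: int = 0                          # last sequence number reflected locally
    threads: dict = field(default_factory=dict)

    def load_snapshot(self, snapshot_seq, threads):
        """Replace local state with a full snapshot (fetched once, up front)."""
        self.last_seq = snapshot_seq
        self.threads = {tid: list(msgs) for tid, msgs in threads.items()}

    def apply_delta(self, delta):
        """Apply one pushed update; return False if a resync is needed."""
        if delta.seq <= self.last_seq:
            return True                        # duplicate: already applied
        if delta.seq != self.last_seq + 1:
            return False                       # gap: an update was missed
        self.threads.setdefault(delta.thread_id, []).append(delta.text)
        self.last_seq = delta.seq
        return True


# Usage: one snapshot, then pushed increments only.
inbox = Inbox()
inbox.load_snapshot(41, {"t1": ["hi"]})
applied = inbox.apply_delta(Delta(42, "t1", "hello"))
needs_resync = not inbox.apply_delta(Delta(45, "t1", "out of order"))
```

Once the snapshot is loaded, each new message costs one small pushed update rather than a fresh request and a large response, which is the saving the paragraph above describes.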
Furthermore, they found that transmitting messages and incremental updates in JSON was not efficient enough, so they replaced JSON with Thrift, reducing network payload by roughly 50%.
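To see why a schema‑based binary encoding tends to be smaller than JSON, the sketch below encodes the same update both ways. It does not use Thrift: it is a hand‑rolled fixed layout built with Python's standard `struct` module, and the field names (`seq`, `thread_id`, `sender_id`, `text`) are hypothetical. The point is only that a binary format does not repeat field names or spell numbers out as text.

```python
import json
import struct

# Hypothetical update; the field names are assumptions for illustration.
update = {"seq": 42, "thread_id": 1001, "sender_id": 7, "text": "see you at 6"}

# Three unsigned 64-bit integers plus a 16-bit text length, little-endian, unpadded.
HEADER = "<QQQH"


def encode_json(u):
    return json.dumps(u).encode("utf-8")


def encode_binary(u):
    body = u["text"].encode("utf-8")
    header = struct.pack(HEADER, u["seq"], u["thread_id"], u["sender_id"], len(body))
    return header + body


def decode_binary(buf):
    seq, thread_id, sender_id, n = struct.unpack_from(HEADER, buf, 0)
    start = struct.calcsize(HEADER)
    text = buf[start:start + n].decode("utf-8")
    return {"seq": seq, "thread_id": thread_id, "sender_id": sender_id, "text": text}


json_bytes = encode_json(update)
binary_bytes = encode_binary(update)
print(len(json_bytes), len(binary_bytes))
```

Thrift's real wire formats differ from this layout (they carry field identifiers and type tags, for instance), so the exact ratio here says nothing about the roughly 50% figure quoted above.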
Server
Typically, message data is stored on spinning disks. In the pull model, a message had to be written to disk before it could be read back in response to a client query, so the same storage layer served both real‑time messages and the full conversation history. That design could not support the new sync protocol, which needs the same per‑user sequence of updates delivered in real time to both the Messenger app and the storage layer.
To achieve this, they built Iris, a totally ordered queue of message updates with two separate pointers: one marks the most recent update sent to the Messenger app, and the other marks the most recent update sent to the traditional storage layer. When a message is successfully delivered to the phone or written to disk, the corresponding pointer advances. If the phone is offline, its pointer stalls while new messages continue to enter the queue and the storage pointer keeps moving; the reverse holds if the storage layer falls behind. This lets the Messenger app and the traditional storage layer synchronize at their own rates without interfering with each other.
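A minimal sketch of the two‑pointer idea follows. It is not Iris itself: the `UpdateQueue` class, the consumer names `"app"` and `"storage"`, and the in‑memory list are assumptions chosen to show how independent cursors over one ordered log behave.

```python
class UpdateQueue:
    """Ordered per-user update log with one cursor per consumer (illustrative)."""

    def __init__(self, consumers):
        self._log = []                                   # updates in arrival order
        self._cursor = {name: 0 for name in consumers}   # index of next update to send

    def append(self, update):
        """Add an update at the tail and return its position."""
        self._log.append(update)
        return len(self._log) - 1

    def pending(self, consumer):
        """Updates this consumer has not acknowledged yet, in order."""
        return self._log[self._cursor[consumer]:]

    def ack(self, consumer, count=1):
        """Advance one cursor after `count` updates were delivered or persisted."""
        self._cursor[consumer] = min(self._cursor[consumer] + count, len(self._log))

    def lag(self, consumer):
        """How many updates this consumer is behind the tail."""
        return len(self._log) - self._cursor[consumer]


# Usage: the phone is offline, so only the storage cursor moves.
queue = UpdateQueue(["app", "storage"])
for text in ["m1", "m2", "m3"]:
    queue.append(text)
queue.ack("storage", 3)
```

Because each consumer only ever moves its own cursor, a stalled phone leaves `queue.pending("app")` growing while the storage side stays caught up, and neither blocks the other.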
The queue implements a three‑tier storage model:
Latest messages are sent directly from Iris’s memory to the Messenger app and the disk storage layer.
Messages from the past week are stored in the queue’s backend storage.
Older conversation history and full inbox snapshots are provided by the traditional disk storage layer.
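The three tiers can be read as a routing rule keyed on message age. The sketch below is an assumption‑laden illustration: the tier labels and the `pick_tier` function are hypothetical, and the five‑minute in‑memory window is invented for the example, since the article only says "latest messages". The seven‑day boundary comes from the "past week" wording above.

```python
from datetime import datetime, timedelta, timezone

QUEUE_RETENTION = timedelta(days=7)        # "past week", per the list above
IN_MEMORY_WINDOW = timedelta(minutes=5)    # assumed value, not from the article


def pick_tier(sent_at, now):
    """Return which tier would serve a message sent at `sent_at` (illustrative)."""
    age = now - sent_at
    if age <= IN_MEMORY_WINDOW:
        return "iris-memory"
    if age <= QUEUE_RETENTION:
        return "iris-backing-store"
    return "disk-storage"


now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
recent = pick_tier(now - timedelta(minutes=1), now)
this_week = pick_tier(now - timedelta(days=3), now)
old = pick_tier(now - timedelta(days=30), now)
```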
After evaluating scalability, reliability, speed, and flexibility, they chose to implement Iris’s backend storage using MySQL and flash storage, leveraging MySQL’s semi‑synchronous replication feature.
Effect
The new synchronization protocol reduced non‑media data usage by 40% and, by easing network congestion, lowered the rate at which users' message sends fail by approximately 20%.
Art of Distributed System Architecture Design
Introductions to large-scale distributed system architectures; insights and knowledge sharing on large-scale internet system architecture; front-end web architecture overviews; practical tips and experiences with PHP, JavaScript, Erlang, C/C++ and other languages in large-scale internet system development.