Backend Development · 19 min read

Design and Optimization of a High‑Performance IM Instant Messaging Platform

This article details the architectural decisions, network protocol choices, message framing strategies, and server‑level optimizations—including Netty adoption, TCP handling, token management, load balancing, NIC queue configuration, and CPU affinity—that enable a scalable, low‑latency instant messaging service supporting millions of concurrent connections.


1. Introduction

The early consumer-facing (C-end) product relied on a third-party SaaS for instant messaging, which limited extensibility and security and prompted the development of a self-controlled IM platform.

2. Network Communication Framework and Protocol

We chose Netty as the network communication framework for its rich protocol support, high performance, and active community. Its advantages include a gentle learning curve, built-in codecs, high throughput, a flexible threading model, and proven stability.

TCP is a stream protocol with no inherent message boundaries, so application messages can be split across or coalesced into TCP segments (the "sticky packet" and "half packet" problem). Common framing solutions are fixed-length messages, delimiter-based framing, length-prefixed headers, or a custom protocol. A typical message format consists of a fixed-length header (message type and body length) followed by a body encoded in JSON, Protobuf, etc.
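A minimal sketch of length-prefixed framing in plain `java.nio` (illustrative only; the platform itself uses Netty's codecs, and the 1-byte type field and 4-byte length field here are assumptions, not the production wire format):

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class LengthPrefixedFraming {

    // Encode one frame: 1-byte message type + 4-byte big-endian body length + body.
    static ByteBuffer encode(byte type, byte[] body) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + body.length);
        buf.put(type).putInt(body.length).put(body);
        buf.flip();
        return buf;
    }

    // Decode: consume as many complete frames as the buffer holds;
    // a trailing partial frame (half packet) stays buffered for the next read.
    static List<byte[]> decode(ByteBuffer buf) {
        List<byte[]> bodies = new ArrayList<>();
        while (buf.remaining() >= 5) {       // header is 5 bytes
            buf.mark();
            buf.get();                       // message type (unused in this sketch)
            int len = buf.getInt();
            if (buf.remaining() < len) {     // body not fully arrived yet
                buf.reset();
                break;
            }
            byte[] body = new byte[len];
            buf.get(body);
            bodies.add(body);
        }
        buf.compact();                       // keep any partial frame
        return bodies;
    }

    public static void main(String[] args) {
        // Two messages coalesced into one TCP read ("sticky packet").
        ByteBuffer a = encode((byte) 1, "hello".getBytes(StandardCharsets.UTF_8));
        ByteBuffer b = encode((byte) 1, "world".getBytes(StandardCharsets.UTF_8));
        ByteBuffer wire = ByteBuffer.allocate(a.remaining() + b.remaining());
        wire.put(a).put(b).flip();

        for (byte[] body : decode(wire)) {
            System.out.println(new String(body, StandardCharsets.UTF_8));
        }
    }
}
```

Netty provides the same behavior out of the box via `LengthFieldBasedFrameDecoder`, which is the idiomatic choice when Netty is already the transport layer.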

3. Architecture Design

The system is divided into three layers:

Middleware layer: handles token acquisition, caching, and renewal.

Core layer: connection management, heartbeat (ping every 50 s), reconnection strategy (exponential back-off), API wrappers, logging, and a local database for sessions and messages.

Protocol layer: encoding and decoding of messages, sessions, and commands.
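The core layer's reconnection strategy can be sketched as capped exponential back-off with jitter (an illustration only; the base delay, cap, and jitter values here are assumptions, not the SDK's actual parameters):

```java
import java.util.concurrent.ThreadLocalRandom;

public class ReconnectBackoff {

    // Delay before the n-th reconnect attempt (attempt starts at 0):
    // base * 2^attempt, capped, plus random jitter so that many clients
    // dropped at once do not all reconnect in the same instant.
    static long delayMillis(int attempt, long baseMillis, long capMillis) {
        long exp = baseMillis << Math.min(attempt, 16); // clamp shift to avoid overflow
        long capped = Math.min(exp, capMillis);
        long jitter = ThreadLocalRandom.current().nextLong(capped / 10 + 1);
        return capped + jitter;
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 6; attempt++) {
            System.out.printf("attempt %d: wait ~%d ms%n",
                    attempt, delayMillis(attempt, 1_000, 60_000));
        }
    }
}
```

In practice the loop resets `attempt` to zero once a connection survives long enough to pass a heartbeat round-trip.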

Message types include text, image, emoji, voice, and video. Media files are uploaded separately and referenced by URI, with optional base64-encoded thumbnails.
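For example, an image message body referencing uploaded media might look like the following (an illustrative sketch only; these field names are assumptions, not the platform's actual schema):

```json
{
  "type": "image",
  "uri": "https://cdn.example.com/media/a1b2c3.jpg",
  "thumbnail": "<base64-encoded thumbnail bytes>",
  "width": 1280,
  "height": 720
}
```

Keeping the bulk media out of the message body keeps frames small, so the persistent TCP channel carries only lightweight envelopes.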

Message diffusion uses two models:

Read diffusion: a single copy of each message is stored; readers fetch it, saving storage but increasing read complexity.

Write diffusion: each recipient gets a copy; this simplifies reads but increases storage and write load.

A hybrid approach applies write diffusion to active users and read diffusion to inactive ones. The SDK architecture includes token caching, automatic reconnection, and heartbeat mechanisms to maintain persistent TCP connections.

4. Server Optimization

To support millions of connections and high QPS, we raised the file descriptor and process limits (in /etc/security/limits.conf):

```
* soft nproc 1500000
* hard nproc 1500000
* soft nofile 1500000
* hard nofile 1500000
```

and the corresponding kernel parameters:

```
fs.nr_open = 3000000
fs.file-max = 3000000
```

Nginx is used as a Layer-7 load balancer with TLS termination; port multiplexing alleviates local port exhaustion.

Network card tuning includes enlarging the ring buffers and configuring multiple queues:

```
ethtool -G em1 rx 4096
ethtool -G em1 tx 4096
ethtool -L em1 combined 16
```

CPU affinity is set per IRQ to spread interrupt handling across cores:

```
echo 0 > /proc/irq/107/smp_affinity_list
echo 1 > /proc/irq/108/smp_affinity_list
```

Intel Flow Director is enabled to steer packets to specific queues based on destination port:

```
ethtool --features em1 ntuple on
ethtool --config-ntuple em1 flow-type tcp4 dst-port 9500 action 0 loc 1
ethtool --config-ntuple em1 flow-type tcp4 dst-port 9501 action 1 loc 2
```

Additional optimizations include dedicated servers for latency-sensitive users, backup domains, and a robust token-cache mechanism.

5. Conclusion

The IM platform has been in production for over two years, serving millions of daily active users across single-chat, group-chat, chatroom, and public-account scenarios, demonstrating the effectiveness of the design and optimization strategies described above.

Tags: backend, network, IM, Netty, protocol, instant messaging, server optimization
Written by HomeTech (HomeTech tech sharing)