How We Cut 60% of Backend Load with Go’s singleflight and Smart Caching
This article examines how the mic‑queue list in a live‑streaming app caused heavy server load due to three‑second polling, and details a systematic performance‑optimization process—including analysis, reducing I/O, employing Go’s singleflight caching, and switching JSON libraries—to cut resource usage by up to 60% while preserving user experience.
Background
The mic‑queue list appears at the top of a live‑streaming room and is polled by the client every 3 seconds, putting heavy pressure on the server.
We need to optimize performance without degrading user experience.
Idea
As Peter Drucker said, "If you can’t measure it, you can’t improve it." We measure by the number of machines used before and after optimization; reducing machines directly cuts cost.
Performance optimization focuses on two goals: reduce computation and reduce I/O. First, analyze the current bottlenecks; blind changes often add complexity without benefit.
Analysis
Factors that cause the mic‑queue list to change include user entry/exit, gifts, public chat, sharing, and privilege changes (avatar frames, certifications, medals, levels, VIP, etc.). Because many variables affect the list, a push‑based approach is impractical, and caching duration must be short to keep updates timely.
Core request flow:
1. Client fetches the mic‑queue list every 3 seconds.
2. Retrieve a page of users from a Redis ZSET.
3. Call the privilege aggregation service to get each user’s privilege info (this service calls dozens of downstream services).
4. Serialize the result to JSON.
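The four steps above can be sketched end to end. This is an illustrative skeleton, not the article's actual code: the function names, the stubbed Redis read, and the stubbed privilege fan‑out are all assumptions standing in for real service calls.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MicUser is an illustrative shape for one entry in the mic-queue list.
type MicUser struct {
	UID        int64    `json:"uid"`
	Privileges []string `json:"privileges"`
}

// fetchPageFromZSET stands in for the Redis ZSET page read (step 2).
func fetchPageFromZSET(roomID string, page, size int) []int64 {
	return []int64{101, 102, 103} // stubbed member IDs
}

// aggregatePrivileges stands in for the fan-out to dozens of downstream
// services (step 3), the expensive part of the flow.
func aggregatePrivileges(uids []int64) []MicUser {
	users := make([]MicUser, 0, len(uids))
	for _, uid := range uids {
		users = append(users, MicUser{UID: uid, Privileges: []string{"vip", "medal"}})
	}
	return users
}

// micQueueHandler ties steps 2-4 together for one poll request.
func micQueueHandler(roomID string) ([]byte, error) {
	uids := fetchPageFromZSET(roomID, 0, 10)
	users := aggregatePrivileges(uids)
	return json.Marshal(users) // step 4: serialize the page
}

func main() {
	body, err := micQueueHandler("room-1")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```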
Path 1 – Frequent Polling
Polling every 3 seconds generates heavy I/O: a single room with 10,000 online users produces more than 3,000 QPS (10,000 requests spread over each 3‑second window). Switching to push is unrealistic because the list has too many change sources, and extending the interval to 5–10 seconds harms user experience.
Path 2 – ZSET Retrieval
This step involves a single I/O operation and does not need optimization.
Path 3 – Privilege Aggregation
This step is I/O‑heavy because one request triggers dozens of downstream calls, which may cascade further. Since every user in the same room sees the same list, caching is ideal. We introduce singleflight to ensure only one request per room reaches downstream services.
Path 4 – JSON Serialization
Serialization is CPU‑intensive, especially when a user’s privilege data can reach dozens of kilobytes. Two improvements are applied:
1. Replace the JSON library with a higher‑performance one such as json‑iterator.
2. Reduce the amount of data serialized by decreasing page size (e.g., from 20 items to 10).
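The second improvement is easy to see with the standard library alone. The sketch below (illustrative field names, not the article's real schema) shows that halving the page size roughly halves the bytes serialized per poll; json‑iterator is then a drop‑in replacement for encoding/json on top of this.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// entry approximates one mic-queue item; real privilege payloads can be
// far larger (the article cites dozens of kilobytes per user).
type entry struct {
	UID       int64  `json:"uid"`
	Privilege string `json:"privilege"`
}

// payloadSize returns the serialized size of a page of the given length.
func payloadSize(pageSize int) int {
	page := make([]entry, pageSize)
	for i := range page {
		page[i] = entry{UID: int64(i), Privilege: "avatar-frame,medal,vip"}
	}
	b, _ := json.Marshal(page)
	return len(b)
}

func main() {
	// Halving the page size roughly halves the bytes serialized per poll.
	fmt.Println("20 items:", payloadSize(20), "bytes")
	fmt.Println("10 items:", payloadSize(10), "bytes")
}
```

For the first improvement, json‑iterator exposes `jsoniter.ConfigCompatibleWithStandardLibrary`, whose `Marshal` can replace `json.Marshal` without changing call sites.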
Coding
We use Go’s singleflight library, which guarantees that at any moment only one request for the same key reaches downstream services; other concurrent requests wait and receive the cached result.
Implementation is straightforward: wrap the downstream call in singleflight.Do.
Effect
After optimization, the privilege aggregation service’s request volume dropped dramatically, allowing us to halve the number of machines while also reducing CPU usage. Overall, we achieved a 60% reduction in privilege‑aggregation machines, with cascading benefits to downstream services.
Deep Dive
singleflight lives in Go's extended sync module (golang.org/x/sync/singleflight) rather than the standard library proper. Its core implementation is compact: a mutex guards a map of in‑flight calls, and a per‑call wait group blocks duplicate callers until the first caller's result is ready.
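That mechanism can be reconstructed from scratch in a few dozen lines. This is a simplified stdlib‑only sketch of the idea, not the real package (which also handles panics, Forget, and channel‑based variants):

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight invocation; callers that join wait on wg.
type call struct {
	wg  sync.WaitGroup
	val interface{}
	err error
}

// Group is a minimal reimplementation of singleflight's core idea:
// a mutex guards a map of in-flight calls keyed by request key.
type Group struct {
	mu sync.Mutex
	m  map[string]*call
}

func (g *Group) Do(key string, fn func() (interface{}, error)) (interface{}, error) {
	g.mu.Lock()
	if g.m == nil {
		g.m = make(map[string]*call)
	}
	if c, ok := g.m[key]; ok {
		// A call for this key is already in flight: wait and share its result.
		g.mu.Unlock()
		c.wg.Wait()
		return c.val, c.err
	}
	c := new(call)
	c.wg.Add(1)
	g.m[key] = c
	g.mu.Unlock()

	c.val, c.err = fn() // only this caller actually runs fn
	c.wg.Done()

	g.mu.Lock()
	delete(g.m, key) // allow the next burst to trigger a fresh call
	g.mu.Unlock()
	return c.val, c.err
}

func main() {
	var g Group
	var calls int64
	var wg sync.WaitGroup
	for i := 0; i < 50; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			g.Do("room-42", func() (interface{}, error) {
				atomic.AddInt64(&calls, 1)
				time.Sleep(10 * time.Millisecond)
				return "list", nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("fn executed", atomic.LoadInt64(&calls), "time(s) for 50 callers")
}
```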
Summary
This case study demonstrates the 80/20 principle: a few targeted, well‑understood optimizations yielded substantial performance gains. Understanding the system, analyzing bottlenecks, and applying simple techniques like caching and singleflight are key to effective backend optimization.
Inke Technology
Official account of Inke Technology