How Per‑Request Diagnosis Transformed Mobile CDN Image Delivery at Sina Weibo

This article describes a per-request (PerRequest) method for locating faults in mobile internet services. It explains how precise enumeration of packet-loss scenarios, combined with a Supply-Chain logging header, improved Sina Weibo image delivery quality by identifying the key error codes and reducing how often they occur.

Efficient Ops

Applicable Scenario

When users of Sina Weibo encounter "white images" while scrolling, the underlying issue often lies in the complex CDN delivery chain, from the mobile device through network access points, edge caches (L1), secondary caches (L2), and finally the origin server.

Traditional Solutions

Manual Reproduction: Engineers repeatedly browse Weibo on a mobile device. This is unreliable because white images occur only sporadically.

Log Analysis: Engineers correlate server-side logs (L4/L7, L1/L2, origin) with client-side logs. This is labor-intensive, akin to finding a needle in a haystack.

User Assistance: Users are asked to help with troubleshooting, which can introduce more noise than insight.

Our Thinking

Mobile internet moves service-quality management from static PC environments to highly dynamic mobile contexts: network conditions change constantly, SDK error codes are opaque, and low-level diagnostic tools are unavailable on devices.

Solution

1. Enumerating Error Codes

We built a lab setup where a laptop acts as a gateway for a Wi‑Fi hotspot, allowing precise control of packet loss via a custom Linux kernel module. By deliberately dropping SYN, SYN‑ACK, or specific HTTP response packets, we can map each loss pattern to the corresponding error code (e.g., -1001, -1005, 8009, 8000), turning vague codes into clear diagnostics.
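The bookkeeping side of this enumeration can be sketched in Python. This is a minimal illustration, not the team's tooling: the function names are hypothetical, and the trial data uses placeholder codes (CODE_A, CODE_B) with invented pairings rather than measured results. It groups lab runs by the error code the client reported and flags codes that more than one loss pattern can produce.

```python
from collections import defaultdict


def build_code_table(trials):
    """Group lab trials by observed error code.

    trials: iterable of (dropped_packet, error_code) pairs, one per lab run.
    Returns {error_code: sorted list of loss patterns that produced it}.
    """
    table = defaultdict(set)
    for dropped, code in trials:
        table[code].add(dropped)
    return {code: sorted(patterns) for code, patterns in table.items()}


def ambiguous_codes(table):
    """Codes produced by more than one loss pattern still need narrowing."""
    return sorted(code for code, patterns in table.items() if len(patterns) > 1)


# Placeholder observations: the codes and pairings below are invented
# for illustration and are NOT results from the article.
trials = [
    ("SYN", "CODE_A"),
    ("SYN-ACK", "CODE_A"),
    ("HTTP response", "CODE_B"),
    ("HTTP response", "CODE_B"),
]

table = build_code_table(trials)
needs_more_runs = ambiguous_codes(table)
```

A code that lands in needs_more_runs would call for finer-grained drop experiments before it can serve as a diagnostic on its own.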

2. On‑Site Restoration

Instead of post‑mortem log mining, we aim to reconstruct the exact request path at the moment of failure. We introduce a custom HTTP response header, Supply-Chain, that carries:

Bidirectional TCP four‑tuple information for each node.

Timing metrics for handshake, request/response processing, and downstream handling.

Each CDN node appends its own Supply-Chain data, optionally encoded to limit bandwidth impact.
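The article does not give the wire format of the header, so the following Python sketch assumes one purely for illustration: entries separated by commas, fields separated by semicolons, and the field names n, d, u, hs, and pt are all hypothetical. It shows how a node might append its entry and how the backend might parse the accumulated value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    node: str          # hypothetical node label, e.g. "L1"
    downstream: str    # four-tuple facing the client: "src_ip:port>dst_ip:port"
    upstream: str      # four-tuple facing the next node, "-" if not recorded
    handshake_ms: int  # time spent on the TCP handshake
    process_ms: int    # time spent handling the request at this node


def encode_hop(hop):
    """Serialize one hop in the assumed key=value;key=value layout."""
    return ";".join([
        f"n={hop.node}", f"d={hop.downstream}", f"u={hop.upstream}",
        f"hs={hop.handshake_ms}", f"pt={hop.process_ms}",
    ])


def append_hop(header_value, hop):
    """Return the header value with this node's entry appended."""
    entry = encode_hop(hop)
    return f"{header_value},{entry}" if header_value else entry


def parse_chain(header_value):
    """Parse the accumulated header value back into Hop records."""
    hops = []
    for entry in filter(None, header_value.split(",")):
        fields = dict(part.split("=", 1) for part in entry.split(";"))
        hops.append(Hop(fields["n"], fields["d"], fields["u"],
                        int(fields["hs"]), int(fields["pt"])))
    return hops


# The response travels origin -> L2 -> L1 -> client, so L2 appends first.
# Addresses are documentation-range placeholders.
value = append_hop("", Hop("L2", "198.51.100.7:41000>203.0.113.9:80", "-", 3, 12))
value = append_hop(value, Hop("L1", "192.0.2.10:52344>198.51.100.7:80",
                              "198.51.100.7:41000>203.0.113.9:80", 40, 5))
chain = parse_chain(value)
```

In this layout, L1's upstream four-tuple matches L2's downstream four-tuple, which is what lets adjacent entries be stitched into one path. A production format would likely be more compact, in line with the bandwidth concern noted above.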

When the mobile client reports a white‑image error, it also forwards the received Supply-Chain header back to the backend, creating a closed‑loop view of the entire request path.
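A minimal Python sketch of that round trip follows. The JSON field names and both function names are assumptions made for this example, since the article does not describe the report format. The header value is treated as an opaque string that the device passes along without interpreting it.

```python
import json


def build_report(error_code, image_url, supply_chain):
    """Client side: bundle the error code with the header value as received."""
    return json.dumps({
        "error_code": error_code,
        "url": image_url,
        "supply_chain": supply_chain,  # forwarded verbatim, not parsed on the device
    })


def read_report(body):
    """Backend side: recover the error code and the raw chain for analysis."""
    report = json.loads(body)
    return report["error_code"], report["supply_chain"]


# Sample values are placeholders; 8009 is used only because it appears
# in the article's list of error codes.
body = build_report(8009, "https://example.com/img/1.jpg", "opaque-header-value")
code, raw_chain = read_report(body)
```

Keeping the client's role to forwarding means the parsing logic lives only on the backend, so the header format can evolve without an app release.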

PerRequest Diagnosis Technique

By combining precise packet‑loss enumeration with the Supply-Chain header, we achieve:

Elimination of user‑side participation in diagnosis.

Accurate reconstruction of the exact network state for each request.

Significant reduction of the log volume required for troubleshooting.

In production tests at Sina CDN, applying this technique made the causes behind the four major error‑rate metrics fully understood, and two of those metrics fell by 50% and 30%, respectively.

Case Insights

Focus on the full supply‑chain of request handling rather than isolated logs.

Leverage TCP four‑tuple as the universal hook between components.

Adopt targeted, data‑driven optimizations instead of blind trial‑and‑error.
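The four-tuple insight can be illustrated with a short Python sketch. It assumes, hypothetically, that each node's records have been reduced to a mapping from the downstream connection's four-tuple to the upstream connection's four-tuple; the function name and data layout are not from the article. Because one node's upstream connection is the next node's downstream connection, the tuples chain together into a path.

```python
def trace_request(client_tuple, logs):
    """Follow one request through per-node connection records.

    logs: {node_name: {downstream_tuple: upstream_tuple_or_None}}
    Returns a list of (node, downstream_tuple) pairs in path order.
    """
    path = []
    visited = set()
    current = client_tuple
    while current is not None:
        # Find the node that accepted the connection identified by `current`.
        owner = next((node for node, conns in logs.items()
                      if current in conns and node not in visited), None)
        if owner is None:
            break  # chain ends here: no node recorded this connection
        visited.add(owner)
        path.append((owner, current))
        current = logs[owner][current]
    return path


# Four-tuples are (src_ip, src_port, dst_ip, dst_port); addresses are
# documentation-range placeholders.
client_to_l1 = ("192.0.2.10", 52344, "198.51.100.7", 80)
l1_to_l2 = ("198.51.100.7", 41000, "203.0.113.9", 80)
logs = {
    "L1": {client_to_l1: l1_to_l2},
    "L2": {l1_to_l2: None},
}
path = trace_request(client_to_l1, logs)
```

Since ports are reused over time, a real join would also need to restrict records to a time window around the request; that detail is omitted here.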

The PerRequest method is not a universal cure, but it provides a powerful, scenario‑specific tool for improving mobile CDN service quality.

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
