Optimizing Network Request Success Rate in Mobile Apps: iQIYI’s Practice and Lessons
iQIYI boosted its mobile app’s network request success rate to 99.7% by implementing detailed monitoring, layered retry strategies, optimized timeouts, reduced concurrency, smarter compression, IPv4 preference, TLS 1.3, and connection‑racing techniques, and now targets 99.9% using multipath, QUIC and intelligent traffic scheduling.
Mobile applications rely on three key dimensions of network performance: success rate, latency, and traffic consumption. The success rate (the proportion of network requests that succeed) directly determines the availability of services such as video playback, ad display, and payment. This article describes how iQIYI improved the network request success rate of its app.
Factors that cause request failures
Two broad categories of failure factors are identified:
Factors that cannot be improved: iOS system permission restrictions, airplane mode or no network at all, and router failures. These can only be detected and reported to the user.
Factors that can be improved: weak cellular or Wi‑Fi signal, DNS failures, partial carrier node outages, load‑balancer failures, server errors (HTTP error codes), and business‑logic errors (such as parsing failures).
Statistics from iQIYI show that about 3.8% of usage time is spent without network connectivity, while roughly 9% occurs under very weak cellular signal, highlighting the importance of good offline prompts and robust handling of weak networks.
Monitoring and data collection
To raise the success rate, a monitoring system feeds detailed request metrics from the baseline network library into an APM platform. Because of storage constraints, only 2% of users are sampled, but critical business flows (e.g., the home page) are collected in full.
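The sampling rule above can be sketched as follows. This is a minimal illustration, not iQIYI's implementation: the hash-based bucketing, the `home_page` flow name, and the function names are all assumptions made for the example. Hashing the user ID keeps each user consistently inside or outside the 2% sample across sessions.

```python
import hashlib

SAMPLE_RATE = 0.02               # 2% of users, as stated in the article
CRITICAL_FLOWS = {"home_page"}   # hypothetical flow names collected in full


def user_in_sample(user_id: str, rate: float = SAMPLE_RATE) -> bool:
    """Deterministically place a user in one of 10,000 buckets and
    sample the lowest `rate` fraction of them."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < rate * 10_000


def should_report(user_id: str, flow: str) -> bool:
    """Critical flows are always reported; other flows only for sampled users."""
    return flow in CRITICAL_FLOWS or user_in_sample(user_id)
```

With this shape, `should_report(user_id, "home_page")` is true for every user, while any other flow is reported for roughly one user in fifty.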
Retry mechanisms provided by the baseline network library
Analysis of APM data shows that timeout errors (NSURLErrorTimedOut, code -1001) account for about 90% of failures, followed by SSL and DNS errors. Retrying is therefore the primary optimization lever. The baseline network library offers four retry strategies:
IP direct retry: keep the scheme unchanged and replace the host with an IP address, bypassing DNS resolution.
Super‑pipeline retry: route the request through an internal HTTP‑based proxy (similar to a remote Charles proxy), changing the scheme to HTTP to avoid SSL and downgrading HTTP/2 to HTTP/1.1.
HTTP retry: change the scheme to HTTP while keeping the original host; suitable for non‑critical services.
Original‑URL retry: repeat the request without any modification; used only when the business side opts in.
The priority order of these retries is IP direct > Super‑pipeline > HTTP > Original‑URL. Combining retries raised the home‑page CARD interface success rate to 99.76% by the end of Q1 2020.
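The four strategies and their priority order can be sketched as a small plan builder. This is a hedged illustration: the `Attempt` type, the function names, the example hosts, and the way the proxy is represented are hypothetical and not taken from iQIYI's library. The sketch also assumes that an IP direct request carries the original hostname in the `Host` header, and that the IP is an IPv4 literal (an IPv6 literal would need brackets).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional
from urllib.parse import urlsplit, urlunsplit


@dataclass
class Attempt:
    strategy: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)
    proxy: Optional[str] = None   # hypothetical "host:port" of the internal proxy


def ip_direct(url: str, ip: str) -> str:
    """Keep the scheme, swap the host for an IP; an explicit port is preserved."""
    parts = urlsplit(url)
    netloc = ip if parts.port is None else f"{ip}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


def http_downgrade(url: str) -> str:
    """Change the scheme to http; host, path, and query are left untouched."""
    return urlunsplit(urlsplit(url)._replace(scheme="http"))


def build_retry_plan(url: str, direct_ip: Optional[str] = None,
                     proxy: Optional[str] = None, allow_http: bool = False,
                     allow_original: bool = False) -> List[Attempt]:
    """Return retries in the order IP direct > Super-pipeline > HTTP > Original-URL."""
    plan: List[Attempt] = []
    host = urlsplit(url).hostname or ""
    if direct_ip:
        # Assumption: the original hostname travels in the Host header.
        plan.append(Attempt("ip_direct", ip_direct(url, direct_ip), {"Host": host}))
    if proxy:
        plan.append(Attempt("super_pipeline", http_downgrade(url), proxy=proxy))
    if allow_http:
        plan.append(Attempt("http", http_downgrade(url)))
    if allow_original:
        plan.append(Attempt("original", url))
    return plan
```

A caller would walk the returned list in order and stop at the first attempt that succeeds; strategies that are not enabled for a given business flow are simply absent from the plan.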
Additional factors influencing success rate
H2 vs. HTTP/1.1: Prefer H2 for the first attempt and fall back to HTTP/1.1 on failure. H2 multiplexes all requests over a single TCP connection, so under poor network conditions that one connection can become a bottleneck for every request sharing it.
Timeout settings: Use a shorter timeout for the initial request and a longer timeout for retries. Note that the NSURLSession request timeout (timeoutIntervalForRequest) is an idle timer that resets whenever new data arrives, so it bounds the wait between packets rather than the overall request duration.
Request concurrency: Excessive parallel requests can degrade the success rate, especially on IPv6. Reducing concurrency raised the success rate of a video‑record upload API from 98.2% to 99.85%.
Payload size: Smaller response bodies need fewer TCP packets and are therefore less exposed to packet loss. iQIYI is migrating from gzip to Brotli, which achieves a higher compression ratio.
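As an illustration of the request concurrency point, the sketch below caps the number of in-flight requests with a semaphore. It is a minimal example under stated assumptions: `run_limited`, the limit of 3, and the simulated `fake_request` are hypothetical, and the article does not say which limit or mechanism iQIYI uses.

```python
import asyncio


async def run_limited(factories, max_in_flight: int):
    """Run coroutine factories with at most `max_in_flight` active at once.
    Results are returned in the same order as `factories`."""
    gate = asyncio.Semaphore(max_in_flight)

    async def guarded(factory):
        async with gate:              # wait here while the cap is reached
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in factories))


async def demo():
    """Simulate 10 requests with a cap of 3 and record the peak concurrency."""
    in_flight = 0
    peak = 0

    async def fake_request(i):        # stand-in for a real network call
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return i

    factories = [lambda i=i: fake_request(i) for i in range(10)]
    results = await run_limited(factories, max_in_flight=3)
    return results, peak
```

Calling `asyncio.run(demo())` returns the ten results in order together with the observed peak, which never exceeds the cap. Queued requests wait for a free slot instead of competing for bandwidth on a weak link.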
Robustness and fault‑prevention measures
Beyond retries, the following measures were adopted:
Super‑pipeline robustness: using dual‑region IP failover reduced error rates from 28.96% to 3.95%.
Prefer IPv4 in dual‑stack environments when IPv6 performance lags.
Adopt TLS 1.3 with 1‑RTT handshake (supported from iOS 12.2) to cut SSL handshake failures.
Composite connection racing: test multiple IPs and select the fastest, discarding bad IPs.
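Connection racing can be sketched as follows: start a connection attempt to every candidate IP, keep the first one that succeeds, and record the ones that fail outright so they can be discarded. This is a simplified illustration, not iQIYI's implementation. The `race` function, the 3-second budget, and the simulated `fake_connect` (with documentation-range addresses) are assumptions made for the example.

```python
import asyncio


async def race(addresses, connect, timeout: float = 3.0):
    """Connect to all addresses concurrently.
    Returns (winner, bad): the first address that connected (or None) and
    the addresses whose attempt raised before a winner was found."""
    loop = asyncio.get_running_loop()
    tasks = {loop.create_task(connect(a)): a for a in addresses}
    pending = set(tasks)
    bad = []
    winner = None
    deadline = loop.time() + timeout
    while pending and winner is None:
        remaining = deadline - loop.time()
        if remaining <= 0:
            break
        done, pending = await asyncio.wait(
            pending, timeout=remaining, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            if task.exception() is not None:
                bad.append(tasks[task])       # failed outright: discard this IP
            elif winner is None:
                winner = tasks[task]
    # Slower attempts are cancelled; they are not marked bad, only unused.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return winner, bad


async def fake_connect(addr: str) -> None:
    """Stand-in for a real TCP/TLS connect: one address fails, two differ in speed."""
    if addr == "2001:db8::1":
        raise ConnectionError("simulated failure")
    delays = {"203.0.113.1": 0.05, "203.0.113.2": 0.01}
    await asyncio.sleep(delays[addr])
```

With the simulated connector, `asyncio.run(race(["2001:db8::1", "203.0.113.1", "203.0.113.2"], fake_connect))` picks `203.0.113.2` as the winner and reports `2001:db8::1` as bad. A production version would also reuse the winning connection rather than only reporting its address.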
After these optimizations, the iQIYI app achieved an overall business success rate of 99.7% (99.77% at the network layer) by the end of Q1 2020, excluding offline scenarios.
Future goals and planned optimizations
The next target is to reach a 99.9% success rate for key business flows (excluding no‑network cases). Planned measures include:
Multipath: automatically switch to cellular when Wi‑Fi is a “fake” connection (iOS 9+).
QUIC: evaluate UDP‑based transport once carrier UDP loss diminishes.
Intelligent concurrency scheduling: more precise weak‑network detection and traffic prioritization.
Push‑enabled HTTPDNS: instantly blacklist faulty IPs via APM integration.
iQIYI Technical Product Team