
Optimizing Short Video Playback with Preloading and Proxy Caching

Preloading the MP4 header and initial frames, and routing playback through a local proxy that caches range‑requested segments in an LRU disk store, cuts short‑video start‑up latency to roughly 800 ms and delivers near‑instant playback. The approach depends on the moov box being at the start of the file, or on fetching it with a separate range request when it is not.

Xianyu Technology

With the rise of short‑video apps, users often notice a delay before each video starts as they scroll through a feed. The goal of this article is near‑instant video start ("秒开", literally "opens within a second") and a smooth playback experience.

The current videos are 320p H.264 MP4 files. H.264 offers high quality at low bitrate, while H.265 provides better compression but has narrower device support, so H.264 was retained. Container formats were compared: TS (MPEG‑2 Transport Stream), FLV (Flash Video) and MP4. MP4 was chosen for its universal compatibility.

Video playback can be divided into three stages: IO (reading) – fetching data from local storage or server; Parser – interpreting format and protocol; Render – displaying audio/video on screen.

To reduce start‑up latency, the solution pre‑loads the beginning of an MP4 file (including the ftyp and moov boxes plus a few frames). For a typical 30‑second clip, about 51 KB of data is sufficient.
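The pre‑load fetch can be expressed as a single HTTP range request for the head of the file. The following is a minimal sketch, not the team's implementation: the function name, the constant, and the example URL are hypothetical, and the 51 KB default simply reuses the figure quoted above.

```python
from urllib.request import Request

# Assumption: reuse the ~51 KB figure quoted in the text for a 30-second clip.
PRELOAD_BYTES = 51 * 1024


def build_preload_request(url: str, preload_bytes: int = PRELOAD_BYTES) -> Request:
    """Build a GET request for only the first `preload_bytes` bytes of `url`."""
    if preload_bytes <= 0:
        raise ValueError("preload_bytes must be positive")
    # HTTP byte ranges are inclusive, so the last index is size - 1.
    return Request(url, headers={"Range": f"bytes=0-{preload_bytes - 1}"})


if __name__ == "__main__":
    req = build_preload_request("https://cdn.example.com/video.mp4")
    print(req.get_header("Range"))
```

A server that honours the range replies with 206 Partial Content; the body can then be written to the cache that the proxy reads from.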

A proxy component is introduced as an intermediate HTTP server. The client requests a local URL; the proxy serves cached data if available, otherwise fetches missing parts from the CDN using range requests. This design decouples the player from the pre‑load logic and allows seamless switching from cached to network data.
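The switch from cached to network data can be sketched as a small planning function that splits the player's byte range between the local cache and the CDN. This is an illustrative sketch only: `ServePlan` and `plan_response` are hypothetical names, and it assumes the cache holds a contiguous prefix of the file, which is what pre‑loading produces.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ServePlan:
    cached: Optional[Tuple[int, int]]  # inclusive byte span served from disk
    remote_range: Optional[str]        # Range header value for the CDN request


def plan_response(start: int, end: int, cached_len: int) -> ServePlan:
    """Split the inclusive byte range [start, end] between cache and CDN.

    Assumption: the cache holds bytes 0..cached_len-1 with no gaps.
    """
    if start < 0 or end < start:
        raise ValueError("invalid byte range")
    if end < cached_len:
        # Everything requested is already on disk.
        return ServePlan((start, end), None)
    if start >= cached_len:
        # Nothing requested is on disk; go straight to the CDN.
        return ServePlan(None, f"bytes={start}-{end}")
    # Partly cached: serve the prefix locally, fetch the remainder remotely.
    return ServePlan((start, cached_len - 1), f"bytes={cached_len}-{end}")
```

Because the player only ever talks to the local URL, it does not need to know which part of a response came from disk and which from the network.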

Implementation details:

The pre‑load module generates an MD5 task ID for each request, checks whether a task with that ID already exists, and submits new tasks to a thread pool.

When the network changes (e.g., from Wi‑Fi to 3G), ongoing tasks are cancelled to save mobile data.

The proxy runs a local HTTP server, stores files in an LruDiskCache, and maps each URL to a single client so that the same video is never downloaded twice.
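The pre‑load scheduling described above might look roughly like the sketch below. It is written under several assumptions that the article does not state: the task ID is the MD5 of the video URL, the `fetch` callable is supplied by the caller and polls a cancel flag, and the class and method names are hypothetical.

```python
import hashlib
import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class PreloadScheduler:
    """Deduplicates pre-load tasks by MD5 ID and runs them on a thread pool."""

    def __init__(self, fetch: Callable[[str, threading.Event], None], workers: int = 2):
        self._fetch = fetch  # hypothetical downloader: fetch(url, cancel_flag)
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._tasks: Dict[str, Future] = {}
        self._cancel = threading.Event()

    @staticmethod
    def task_id(url: str) -> str:
        # Assumption: the ID is derived from the URL.
        return hashlib.md5(url.encode("utf-8")).hexdigest()

    def submit(self, url: str) -> str:
        tid = self.task_id(url)
        with self._lock:
            existing = self._tasks.get(tid)
            if existing is not None and not existing.done():
                return tid  # already queued or running, do not submit again
            self._tasks[tid] = self._pool.submit(self._fetch, url, self._cancel)
        return tid

    def on_network_changed(self) -> None:
        """Cancel outstanding work, e.g. when moving from Wi-Fi to cellular."""
        with self._lock:
            self._cancel.set()  # running tasks are expected to poll this flag
            for future in self._tasks.values():
                future.cancel()  # only removes tasks still waiting in the queue
            self._tasks.clear()
            self._cancel = threading.Event()  # fresh flag for later submissions

    def shutdown(self) -> None:
        self._pool.shutdown(wait=True)
```

Note that `Future.cancel()` cannot interrupt a task that has already started, which is why the sketch also passes a shared flag that the download loop is expected to check between reads.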

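During testing, videos with the moov box located at the end of the file still showed long start times. The moov box holds the index of sample offsets and timing that the player needs before it can decode anything, so playback could not begin until the entire file had been downloaded. Two remedies were proposed: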

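Re‑encode or re‑mux videos so that the moov box is placed at the beginning, for example with ffmpeg's faststart flag (ffmpeg -i input.mp4 -c copy -movflags +faststart output.mp4, where the file names are placeholders).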

If re‑encoding is not possible, let the client detect the moov position and issue an additional HTTP range request to fetch the tail segment.
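The second remedy, client-side detection, can be sketched by walking the top-level MP4 boxes in the bytes already downloaded. Each box starts with a 4-byte big-endian size and a 4-byte type; a size of 1 means a 64-bit size follows, and a size of 0 means the box runs to the end of the file. The function name below is hypothetical, and the sketch assumes the downloaded prefix reaches at least as far as the mdat header.

```python
import struct
from typing import Optional


def tail_range_for_moov(head: bytes) -> Optional[str]:
    """Return a Range header value for the file tail if mdat precedes moov.

    Returns None when moov is found first, or when the prefix is too short
    or malformed to decide.
    """
    offset = 0
    while offset + 8 <= len(head):
        size, box_type = struct.unpack_from(">I4s", head, offset)
        header_len = 8
        if size == 1:
            # 64-bit "largesize" stored right after the type field.
            if offset + 16 > len(head):
                return None
            (size,) = struct.unpack_from(">Q", head, offset + 8)
            header_len = 16
        if box_type == b"moov":
            return None  # index is at the front, no extra request needed
        if box_type == b"mdat":
            if size == 0:
                return None  # mdat extends to end of file, nothing follows it
            # Everything after mdat (including moov) is requested open-ended.
            return f"bytes={offset + size}-"
        if size < header_len:
            return None  # size 0 (box runs to EOF) or a malformed box
        offset += size
    return None
```

The open-ended range fetches whatever follows the media data, so it still works when other boxes sit between mdat and moov.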

After deployment, measured video start times were around 800 ms, which indicates that pre‑loading combined with proxy caching removes most of the initial wait.

Written by

Xianyu Technology

Official account of the Xianyu technology team
