iQIYI ZoomAI Video Enhancement Technology: Applications and Technical Details
Jiang Zidong explains iQIYI's ZoomAI video enhancement tech, covering super‑resolution, denoising/sharpening, color correction, scratch removal, and frame interpolation, and its modular deployment across business lines for restoring classic content and boosting low‑resolution media, with reported efficiency gains of up to 500× over manual work.
The speaker, Jiang Zidong, senior algorithm engineer at iQIYI, introduces the motivation for video enhancement, noting the scarcity of high-quality video resources despite abundant high‑definition hardware, caused by uncontrolled user‑generated content, aged source material, and users voluntarily selecting low bitrate streams.
He outlines four technical areas: background and need for video enhancement; principles of enhancement techniques including super‑resolution, denoising & sharpening, color enhancement, frame interpolation, and scratch removal; the ZoomAI framework and its deployment across iQIYI’s business lines; and finally shares practical resources and engineering insights.
For super‑resolution, he discusses single‑frame and multi‑frame approaches and compares CNN‑based methods (SRCNN, VDSR, FSRCNN, EDSR, DBPN), highlighting the challenges of network structure design and loss function selection. He then explains iQIYI's choices: a single upsampling step with a global residual connection, and a loss that combines MSE with a gradient term to preserve edges while avoiding the semantic changes that GAN‑based training can introduce.
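To make the loss design concrete, here is a minimal NumPy sketch of an MSE loss combined with a gradient term. It is an illustration under stated assumptions, not iQIYI's implementation: the forward-difference gradient operator, the function names, and the `grad_weight` value are all hypothetical, since the talk summary does not specify them.

```python
import numpy as np

def image_gradients(img):
    """Forward-difference gradients of a 2-D image of shape (H, W)."""
    dx = img[:, 1:] - img[:, :-1]   # horizontal differences, shape (H, W-1)
    dy = img[1:, :] - img[:-1, :]   # vertical differences, shape (H-1, W)
    return dx, dy

def sr_loss(pred, target, grad_weight=0.1):
    """MSE plus a gradient-difference term.

    grad_weight is an illustrative value, not one taken from the talk.
    """
    mse = np.mean((pred - target) ** 2)
    pdx, pdy = image_gradients(pred)
    tdx, tdy = image_gradients(target)
    grad = np.mean((pdx - tdx) ** 2) + np.mean((pdy - tdy) ** 2)
    return mse + grad_weight * grad
```

The gradient term is zero for a constant brightness offset but grows when an edge is smeared, which is one way to read the claim that it helps preserve edges.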
Regarding denoising and sharpening, he describes classic models such as DnCNN and CBDNet, notes the difficulty of simulating real noise, and presents an end‑to‑end network that jointly performs denoising and sharpening by training on mixed noise and blur.
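Training on mixed noise and blur implies synthesizing degraded inputs from clean frames. The sketch below shows one simple way this could be done. It assumes a box blur and additive Gaussian noise, both chosen for brevity; the actual degradation model, the probabilities, and the names `box_blur` and `make_training_pair` are hypothetical.

```python
import numpy as np

def box_blur(img, k=3):
    """Blur a 2-D image with a k x k box filter (k odd, edge-padded)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def make_training_pair(clean, rng, noise_sigma=0.05, blur_prob=0.5):
    """Return (degraded, clean) for a float image in [0, 1].

    Blur is applied with probability blur_prob, noise is always added;
    both settings are illustrative assumptions.
    """
    degraded = clean.astype(np.float64)
    if rng.random() < blur_prob:
        degraded = box_blur(degraded)
    degraded = degraded + rng.normal(0.0, noise_sigma, size=clean.shape)
    return np.clip(degraded, 0.0, 1.0), clean
```

A network trained to map `degraded` back to `clean` has to undo blur and noise together, which matches the joint denoise-and-sharpen framing; as the talk notes, synthetic noise like this only approximates real noise.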
Color enhancement is presented via black‑box (generative) and white‑box (regression) approaches; iQIYI’s white‑box model predicts exposure, saturation, and white balance, uses paired and synthetic data, and incorporates scene segmentation to maintain temporal consistency in video.
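A white‑box model outputs a few interpretable parameters that are then applied by fixed image operations. The following sketch shows what applying predicted exposure, saturation, and white‑balance values might look like. The parameterization (exposure in stops, per‑channel gains, Rec. 601 luma for saturation) is an assumption for illustration; the talk summary does not give the exact formulas.

```python
import numpy as np

def apply_color_params(img, exposure=0.0, saturation=1.0,
                       wb_gains=(1.0, 1.0, 1.0)):
    """Apply white-box colour parameters to a float RGB image (H, W, 3) in [0, 1].

    exposure is in stops, saturation scales chroma around luma, and
    wb_gains are per-channel multipliers (all assumed conventions).
    """
    out = img * (2.0 ** exposure)                    # exposure adjustment
    out = out * np.asarray(wb_gains)                 # white balance
    luma = out @ np.array([0.299, 0.587, 0.114])     # Rec. 601 luma, shape (H, W)
    out = luma[..., None] + saturation * (out - luma[..., None])
    return np.clip(out, 0.0, 1.0)
```

One plausible way to use scene segmentation for temporal consistency, again an assumption, is to predict a single parameter set per scene and pass the same values to `apply_color_params` for every frame in that scene, so that brightness and color do not flicker between frames.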
Scratch removal leverages optical flow to detect and fill linear defects under the assumption that scratches do not persist across frames, combined with scene‑change detection to avoid false positives.
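The temporal assumption behind scratch removal can be sketched as follows. This is a deliberately simplified illustration: it assumes the previous and next frames have already been aligned to the current frame (the optical‑flow warping step is left out), it does not check that detected defects are linear, and the thresholds and function name are hypothetical.

```python
import numpy as np

def repair_scratches(prev, cur, nxt, diff_thresh=0.2, scene_thresh=0.3):
    """Detect and fill transient defects in cur, a float image in [0, 1].

    prev and nxt are assumed to be already warped onto cur.
    Returns (repaired_frame, boolean_mask_of_repaired_pixels).
    """
    # Scene-change guard: a large global difference to either neighbour
    # means temporal evidence is unreliable, so leave the frame untouched.
    if (np.mean(np.abs(cur - prev)) > scene_thresh
            or np.mean(np.abs(cur - nxt)) > scene_thresh):
        return cur.copy(), np.zeros(cur.shape, dtype=bool)

    # A scratch does not persist: the neighbours agree with each other,
    # but the current frame disagrees with both.
    neighbours_agree = np.abs(prev - nxt) < diff_thresh
    differs = (np.abs(cur - prev) > diff_thresh) & (np.abs(cur - nxt) > diff_thresh)
    mask = neighbours_agree & differs

    repaired = np.where(mask, 0.5 * (prev + nxt), cur)
    return repaired, mask
```

For example, a bright vertical line that appears in only one frame is flagged and replaced with the average of its neighbours, while a frame right after a cut is returned unchanged.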
Frame interpolation uses optical flow estimation, warping, and fusion, with edge‑preserving enhancements to produce smooth motion, especially useful for low‑frame‑rate cartoons and sports.
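The warp‑and‑fuse step can be illustrated with a small NumPy sketch. It assumes the optical flow from the first frame to the second is already available (flow estimation is out of scope here), reuses that flow at the target pixel as a common linear‑motion approximation, and fuses by simple linear blending. The edge‑preserving enhancements mentioned in the talk are not modeled, and all function names are hypothetical.

```python
import numpy as np

def sample_bilinear(img, x, y):
    """Sample a 2-D image at float coordinates, clamping to the border."""
    h, w = img.shape
    x = np.clip(x, 0, w - 1)
    y = np.clip(y, 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom

def interpolate_frame(frame0, frame1, flow01, t=0.5):
    """Synthesize a frame at time t in (0, 1) between frame0 and frame1.

    flow01 has shape (H, W, 2): flow01[..., 0] is dx and flow01[..., 1]
    is dy, describing motion from frame0 to frame1.
    """
    h, w = frame0.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    dx, dy = flow01[..., 0], flow01[..., 1]
    # Backward-warp both frames towards time t, assuming linear motion.
    warped0 = sample_bilinear(frame0, xs - t * dx, ys - t * dy)
    warped1 = sample_bilinear(frame1, xs + (1 - t) * dx, ys + (1 - t) * dy)
    # Fuse: weight each warped frame by its temporal proximity to t.
    return (1 - t) * warped0 + t * warped1
```

With a uniform flow of two pixels to the right and `t=0.5`, an object lands one pixel to the right of its position in the first frame, which is the expected halfway point.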
The ZoomAI solution bundles these modules into a flexible toolkit: for images it offers super‑resolution, denoising/sharpening, and color enhancement; for video it adds scene cut detection, duplicate frame removal, and per‑scene processing of inter‑frame and intra‑frame algorithms, allowing business‑specific configuration.
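The per‑scene video flow described above can be sketched as a small configurable pipeline. This is a structural illustration only: the mean‑absolute‑difference scene detector, the thresholds, and the function names are assumptions, and each module is modeled as a plain function from a list of frames to a list of frames. The frame list is assumed to be non‑empty.

```python
import numpy as np

def split_scenes(frames, cut_thresh=0.3):
    """Group consecutive frames into scenes using mean absolute difference."""
    scenes, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if np.mean(np.abs(cur - prev)) > cut_thresh:
            scenes.append(current)   # close the scene at the cut
            current = []
        current.append(cur)
    scenes.append(current)
    return scenes

def drop_duplicates(scene, dup_thresh=1e-3):
    """Remove frames that are near-identical to the previously kept frame."""
    kept = [scene[0]]
    for frame in scene[1:]:
        if np.mean(np.abs(frame - kept[-1])) > dup_thresh:
            kept.append(frame)
    return kept

def run_pipeline(frames, scene_modules):
    """Apply each configured module (scene -> scene) to every scene in order."""
    output = []
    for scene in split_scenes(frames):
        scene = drop_duplicates(scene)
        for module in scene_modules:
            scene = module(scene)
        output.extend(scene)
    return output
```

A business line would then pick its own module list, for instance denoising plus scratch removal for classic dramas, or frame interpolation only for low‑frame‑rate animation, without changing the surrounding pipeline.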
Applications include restoration of classic TV dramas (e.g., repairing noise and scratches), enhancement of low‑resolution variety shows, animation color boosting, 3D cartoon frame interpolation, and enrichment of platform thumbnail images, delivering up to 500× efficiency gains over manual work.
In the Q&A, he addresses objective vs. subjective evaluation, data synthesis for denoising and scratch removal, the role of gradient loss, mobile‑endpoint implementation considerations, and the distinction between generative and regression models for on‑device processing.
iQIYI Technical Product Team