
Two-Stage Video Restoration Framework for NTIRE 2022 Video Enhancement Challenge

TaoMC2 is a two‑stage pipeline: Stage I augments BasicVSR++ with Peak Quality Frames and a deeper stack of residual blocks, and Stage II refines the result with a SwinIR transformer. Progressive training and transfer learning push PSNR to 33.16 dB, earning the framework two championship titles and one runner‑up finish in the NTIRE 2022 video enhancement challenge.

DaTaobao Tech

In the NTIRE 2022 challenge on video super‑resolution and quality enhancement, the TaoVideo team from Alibaba's Taobao video‑enhancement group won two championship titles (Track 1: video enhancement; Track 2: 2× super‑resolution) and one runner‑up finish (Track 3: 4× super‑resolution).

The challenge provides a benchmark for compressed video enhancement and super‑resolution using diverse scenes (animals, cities, indoor, parks, etc.) with high‑quality 4K source videos. Three tracks are defined: Track 1 focuses on restoring heavily compressed video (e.g., HEVC), Track 2 adds a 2× up‑sampling requirement, and Track 3 further extends to 4× up‑sampling.

Our proposed solution, named TaoMC2, is a two‑stage network. Stage I is based on BasicVSR++ with several modifications: (1) the second‑order compensation is replaced with Peak Quality Frames (PQF), (2) the reconstruction module is deepened from 5 to 55 residual blocks, and (3) progressive training grows the model incrementally through 5, 15, 25, 35, 45, and 55 residual blocks. Stage II employs SwinIR, a state‑of‑the‑art image restoration transformer, to further remove compression artifacts and refine the output of Stage I. The two stages are cascaded: compressed frames → Stage I → intermediate results → Stage II → final enhanced video.
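The progressive‑deepening idea can be sketched in miniature. This is not the authors' code: the real reconstruction module uses 2‑D convolutions, whereas here each residual block is a small linear map so the growth schedule (5 → 15 → … → 55 blocks, reusing already‑trained weights) stays self‑contained.

```python
import numpy as np

def residual_block(x, w):
    """One toy residual block: a nonlinear map plus an identity skip connection."""
    return x + np.tanh(x @ w)

class ReconstructionModule:
    """Illustrative stand-in for the Stage I reconstruction trunk."""

    def __init__(self, dim, n_blocks):
        self.dim = dim
        # Small initial weights keep each new block close to the identity.
        self.weights = [0.01 * np.random.randn(dim, dim) for _ in range(n_blocks)]

    def grow(self, extra):
        """Progressive training: append new near-identity blocks while keeping
        the already-trained ones, following the 5 -> 15 -> ... -> 55 schedule."""
        self.weights += [0.01 * np.random.randn(self.dim, self.dim)
                         for _ in range(extra)]

    def forward(self, x):
        for w in self.weights:
            x = residual_block(x, w)
        return x

# Start at 5 blocks, then grow by 10 blocks five times, ending at 55.
module = ReconstructionModule(dim=8, n_blocks=5)
for _ in range(5):
    module.grow(10)
```

Growing the trunk rather than training a 55‑block model from scratch means each stage starts from a model that already restores frames reasonably well, which is what makes the schedule cheaper than full retraining.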

Training details: Stage I starts from the open‑source BasicVSR++ weights, fine‑tuned with Charbonnier loss for 300 k iterations using Adam and a cosine‑annealing schedule with warm‑up. Progressive training gradually adds residual blocks. A final fine‑tuning with MSE loss for 100 k iterations follows. Stage II is initialized from the SwinIR denoising model, then fine‑tuned on the combined NTIRE LDV dataset (240 qHD sequences) and a self‑collected YouTube 4K dataset (870 videos). Transfer learning and progressive training reduce training time and improve performance.
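The two training ingredients above, Charbonnier loss and cosine annealing with warm‑up, are standard and easy to state concretely. A minimal sketch follows; the warm‑up length and base learning rate are illustrative assumptions, not values from the paper.

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: a smooth, robust variant of L1,
    mean(sqrt((pred - target)^2 + eps^2))."""
    return np.mean(np.sqrt((pred - target) ** 2 + eps ** 2))

def lr_schedule(step, total_steps=300_000, warmup=5_000, base_lr=1e-4):
    """Cosine annealing with linear warm-up.

    The 300 k total iterations match the Stage I fine-tuning budget;
    warmup and base_lr here are assumed values for illustration.
    """
    if step < warmup:
        return base_lr * step / warmup          # linear ramp from 0 to base_lr
    progress = (step - warmup) / (total_steps - warmup)
    return 0.5 * base_lr * (1.0 + np.cos(np.pi * progress))  # cosine decay to 0
```

The Charbonnier loss behaves like L1 for large errors (robust to outliers such as compression blocking) but is differentiable at zero, which is why it is widely preferred over plain MSE for the bulk of restoration training, with MSE reserved for the final PSNR‑targeted fine‑tuning.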

Experiments were conducted on NVIDIA V100 GPUs (4‑card setup). Quantitative results (PSNR) show that Stage I alone improves the baseline by 0.326 dB, while Stage II adds another 0.11 dB, reaching 33.16 dB on the offline validation set. Test‑time augmentation (8‑fold flips/rotations) and model ensembling further boost PSNR by ~0.13 dB. Ablation studies confirm the benefits of progressive training, transfer learning, and the two‑stage design.
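The 8‑fold test‑time augmentation mentioned above enumerates the four 90° rotations of each frame and their horizontal flips, runs the model on each variant, inverts the transform on the output, and averages. A minimal per‑frame sketch, assuming the model maps an array to an array of the same shape:

```python
import numpy as np

def tta_8fold(model, frame):
    """8-fold test-time augmentation: average the model's outputs over the
    4 rotations of the frame and their horizontally flipped versions."""
    outputs = []
    for k in range(4):                  # 0/90/180/270-degree rotations
        for flip in (False, True):
            aug = np.rot90(frame, k)
            if flip:
                aug = np.fliplr(aug)
            out = model(aug)
            # Undo the transforms in reverse order to realign the output.
            if flip:
                out = np.fliplr(out)
            out = np.rot90(out, -k)
            outputs.append(out)
    return np.mean(outputs, axis=0)
```

Averaging eight realigned predictions cancels orientation‑dependent errors, which is where the reported ~0.13 dB gain (together with model ensembling) comes from; the cost is eight forward passes per frame, acceptable for a challenge submission but rarely for deployment.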

Subjective visual comparisons demonstrate that the method restores fine details, reduces motion blur, and sharpens object edges in compressed videos. The approach achieved two championships and one runner‑up in the NTIRE 2022 challenge, demonstrating its effectiveness for real‑world video enhancement tasks.

References to related work on video super‑resolution, compression artifact reduction, and vision transformers are provided, covering methods such as BasicVSR, BasicVSR++, SwinIR, EDVR, VRT, and others.

Tags: deep learning, video restoration, compression artifact removal, NTIRE 2022, super-resolution, two-stage network
Written by DaTaobao Tech, the official account of DaTaobao Technology.