
Short-Form Video Quality Assessment Competition at CVPR NTIRE 2024: Dataset, Challenge Overview, and Top Winning Solutions

The CVPR NTIRE 2024 short-form video quality assessment competition introduced the KVQ dataset, attracted over 200 teams, evaluated submissions using SROCC and PLCC metrics, and highlighted the winning approaches of SJTU MMLab, IH‑VQA, and TVQE, showcasing advances in AI‑driven video quality evaluation.

Kuaishou Tech

The ninth NTIRE workshop (CVPR NTIRE 2024) announced the results of its inaugural short‑form video quality assessment competition, which attracted more than 200 teams over a three‑month development, testing, and submission period. The top three teams—SJTU MMLab, IH‑VQA, and TVQE—secured first, second, and third places respectively.

Background: Short videos have become a dominant media format, but their subjective quality varies widely due to diverse creation modes and complex processing pipelines. To address this, Kuaishou’s audio‑video technology division partnered with the Intelligent Media Computing Lab at the University of Science and Technology of China to launch the first academic competition on short‑video quality assessment.

Dataset (KVQ): A large‑scale Kwai Video Quality (KVQ) dataset was collected and annotated, containing 4,200 representative short videos covering nine content scenes (landscape, crowd, food, portrait, etc.). The dataset includes various creation patterns (three‑segment, effects, subtitles, live streams) and three typical processing pipelines (enhancement, pre‑processing, transcoding). It is split into training (70%), validation (10%), and test (20%) sets.
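The 70/10/20 split above can be sketched as follows. This is a minimal illustration assuming a flat list of video IDs and random shuffling; the official KVQ partitioning (and its seed) is not specified in this article.

```python
import random

def split_kvq(video_ids, seed=0):
    """Illustrative 70/10/20 train/val/test split over a list of video IDs.
    The shuffle and seed are assumptions, not the official KVQ partition."""
    ids = list(video_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train = int(n * 0.7)   # 70% training
    n_val = int(n * 0.1)     # 10% validation
    # remaining 20% goes to the test set
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]

train, val, test = split_kvq(range(4200))
```

With the dataset's 4,200 videos this yields 2,940 training, 420 validation, and 840 test samples.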

Challenge Setup: Participants used the training and validation sets to develop models and submitted predictions via CodaLab. Evaluation employed the widely used SROCC and PLCC metrics for monotonicity and accuracy, as well as ranking‑based metrics (Rank1 for same‑source pairs, Rank2 for different‑source pairs) provided by the KVQ dataset.
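The two correlation metrics are standard and easy to compute with SciPy; a minimal sketch (the toy score arrays are illustrative, not drawn from KVQ):

```python
import numpy as np
from scipy import stats

def srocc(pred, mos):
    """Spearman rank-order correlation: how well predictions preserve
    the ranking of the subjective MOS labels (monotonicity)."""
    return stats.spearmanr(pred, mos).correlation

def plcc(pred, mos):
    """Pearson linear correlation: linear agreement between predictions
    and MOS labels (accuracy)."""
    return stats.pearsonr(pred, mos)[0]

# Toy example: same ranking as the labels, so SROCC is exactly 1.0
pred = np.array([3.1, 2.4, 4.0, 1.5])
mos = np.array([3.0, 2.5, 4.2, 1.0])
```

SROCC depends only on rank order, so a model can score perfectly on it while still being off in absolute terms, which is why PLCC is reported alongside it.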

Results: Baseline models (VSFA, SimpleVQA, FastVQA) were outperformed by all of the top ten teams. The final leaderboard awarded SJTU MMLab (Winner), IH‑VQA (2nd Place), and TVQE (3rd Place).

Winning Solutions:

SJTU MMLab employed a Swin Transformer for spatial features and a SlowFast network for temporal features, integrating three blind image/video quality assessment models (LIQE, Q‑Align, FAST‑VQA) to capture comprehensive quality cues.

IH‑VQA built an ensemble of seven expert models (four regressors, three classifiers). A novel loss, combining the mean absolute error between the target quality score and frame‑wise predictions with a cross‑entropy term, was used to handle intra‑frame quality variations.
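The described loss can be sketched in NumPy. This is a hypothetical reconstruction from the one-sentence description above: the relative weight `alpha` and the exact form of the cross-entropy head are assumptions, not IH‑VQA's published formulation.

```python
import numpy as np

def combined_loss(frame_preds, target_quality, class_logits, class_label, alpha=1.0):
    """Sketch of an MAE-plus-cross-entropy loss in the spirit of the
    IH-VQA description. `alpha` and the classifier head are assumptions."""
    # Mean absolute error between the single target score and per-frame predictions
    mae = np.mean(np.abs(frame_preds - target_quality))
    # Numerically stable softmax cross-entropy for the classification branch
    logits = class_logits - class_logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    ce = -log_probs[class_label]
    return mae + alpha * ce
```

Penalizing each frame's prediction against the video-level target pushes the regressor toward frame-wise consistency, which is the stated motivation for handling intra‑frame quality variations.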

TVQE proposed a hybrid multimodal model that fuses visual and semantic information from two multimodal encoders with a classic CNN to capture technical and aesthetic quality, followed by a heuristic fusion of the branch predictions at inference time.
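One common form such inference-time fusion takes is a weighted average of the per-branch scores; a minimal sketch under that assumption (TVQE's exact fusion rule is not specified in this article):

```python
import numpy as np

def fuse_scores(scores, weights=None):
    """Hypothetical heuristic fusion: a weighted average of per-branch
    quality predictions (e.g. technical and aesthetic heads).
    Equal default weights are an assumption."""
    scores = np.asarray(scores, dtype=float)
    if weights is None:
        weights = np.full(len(scores), 1.0 / len(scores))
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(scores, weights) / weights.sum())
```

Weighted averaging is attractive at inference because it needs no extra training and lets the weights be tuned directly against SROCC/PLCC on the validation set.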

The organizers expressed gratitude to all participants and emphasized the importance of continued collaboration between industry and academia to advance short‑video quality assessment technologies.

References

KVQ: Kwai Video Quality Assessment for Short‑form Videos, CVPR 2024

Quality Assessment of In‑the‑wild Videos, ACM MM 2019

A Deep Learning based No‑reference Quality Assessment Model for UGC Videos, ACM MM 2022

Fast‑VQA: Efficient End‑to‑end Video Quality Assessment with Fragment Sampling, ECCV 2022

Q‑Align: Teaching LMMs for Visual Scoring via Discrete Text‑Defined Levels, ICML 2024

NTIRE 2024 Challenge on Short‑form UGC Video Quality Assessment: Methods and Results, CVPR Workshop 2024

Written by

Kuaishou Tech

Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.
