MD-VQA: Multi-Dimensional No-Reference Video Quality Assessment for UGC Live Videos
MD‑VQA is a no‑reference video quality assessment model that combines semantic cues from EfficientNetV2, handcrafted distortion metrics, and motion information from ResNet3D‑18 to predict the absolute quality of user‑generated live videos. Trained on the large‑scale TaoLive dataset, it achieves state‑of‑the‑art SRCC and PLCC results and is already deployed for real‑time quality monitoring on Taobao's streaming platform.
MD-VQA is a no‑reference video quality assessment (VQA) model designed for user‑generated content (UGC) live videos, such as short videos and live streams on Taobao. The model integrates multi‑dimensional features—including semantic, distortion, and motion cues—to predict absolute video quality without requiring a pristine reference.
The authors constructed a large‑scale UGC video quality dataset called TaoLive, containing 3,762 videos across diverse content categories and resolutions (720p and 1080p). Each video was encoded with eight distortion levels, and 165,528 subjective quality scores were collected from 44 expert and consumer participants following ITU‑R BT.500‑13 guidelines.
MD-VQA extracts semantic features from the last four layers of a pre‑trained EfficientNetV2, hand‑crafted distortion features (blur, noise, blockiness, exposure, color), and motion features from a pre‑trained ResNet3D‑18. Frame‑level semantic and distortion features are fused temporally using absolute differences between adjacent frames. Spatial‑temporal fusion is performed via concatenation, multi‑layer perceptrons, and linear mappings, followed by three fully‑connected layers that regress the final quality score. Mean Squared Error (MSE) is used as the loss function.
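The temporal fusion step described above can be sketched in a few lines. This is a minimal NumPy illustration, not the paper's implementation: the feature dimensions, the average pooling, and the random inputs are all assumptions made for the example; in MD‑VQA the per‑frame features would come from EfficientNetV2 and the handcrafted distortion extractors.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-frame features for one clip (shapes are illustrative only):
# 8 frames, 256-dim semantic features and 5 handcrafted distortion measures per frame,
# plus one clip-level motion feature (in the paper, from a pre-trained 3D CNN).
semantic = rng.standard_normal((8, 256))
distortion = rng.standard_normal((8, 5))
motion = rng.standard_normal(128)

def temporal_fuse(frame_feats: np.ndarray) -> np.ndarray:
    """Pair each frame's features with the absolute difference to the
    previous frame, then average-pool over time into a clip descriptor."""
    diffs = np.abs(np.diff(frame_feats, axis=0))              # (T-1, D) adjacent-frame differences
    paired = np.concatenate([frame_feats[1:], diffs], axis=1)  # (T-1, 2D)
    return paired.mean(axis=0)                                 # (2D,)

# Concatenate semantic, distortion, and motion branches into one clip-level
# vector; in the model this feeds MLPs and fully-connected regression layers.
clip_feat = np.concatenate([temporal_fuse(semantic),
                            temporal_fuse(distortion),
                            motion])
print(clip_feat.shape)  # -> (650,)
```

The absolute-difference term lets the regressor see how strongly features fluctuate between adjacent frames, which is a cheap proxy for temporal artifacts such as flicker and stutter.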
Extensive experiments on public benchmarks (LIVE‑WC, YouTube‑UGC+) and the proprietary TaoLive dataset show that MD‑VQA outperforms state‑of‑the‑art methods in both Spearman Rank‑Order Correlation Coefficient (SRCC) and Pearson Linear Correlation Coefficient (PLCC). Ablation studies confirm the contributions of the semantic, distortion, and motion features, as well as of the absolute‑difference and feature‑fusion modules.
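For readers unfamiliar with the two evaluation metrics: SRCC measures monotonic (rank) agreement between predicted scores and subjective mean opinion scores (MOS), while PLCC measures linear agreement. Both are standard in VQA and easy to compute with SciPy; the score values below are made up for illustration.

```python
from scipy.stats import pearsonr, spearmanr

# Hypothetical predicted quality scores vs. subjective MOS for five videos.
predicted = [3.1, 2.4, 4.0, 1.8, 3.6]
mos       = [3.3, 2.1, 4.2, 2.0, 3.5]

srcc, _ = spearmanr(predicted, mos)  # rank correlation: monotonic agreement
plcc, _ = pearsonr(predicted, mos)   # linear correlation: prediction accuracy
print(f"SRCC={srcc:.3f}  PLCC={plcc:.3f}")
```

Because SRCC depends only on ranks, it is robust to any monotonic miscalibration of the predictor, which is why VQA papers typically report both metrics together.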
MD‑VQA has been deployed in Taobao’s live‑streaming and short‑video services, enabling real‑time quality monitoring, automatic quality‑level filtering, and integration with Taobao’s custom S265 encoder and video enhancement pipelines, thereby improving overall user experience.
DaTaobao Tech