How Kuaishou Elevates Video Quality and AI Performance at NVIDIA GTC 2023
At NVIDIA GTC 2023, Kuaishou engineers presented solutions spanning video quality assessment and enhancement, 3D digital‑human live streaming, a custom TensorRT‑based performance‑optimization framework, large‑scale recommendation‑model acceleration, and multimodal large‑model deployment for short‑video scenarios.
Video Quality Evaluation and Enhancement
Kuaishou processes tens of millions of UGC short videos daily, and to deliver clearer visuals each video passes through a rigorous pipeline. The team introduced the KVQ video quality assessment algorithm and the KRP/KEP enhancement solution, which together significantly improve perceived clarity on the consumer side.
Using AI‑driven methods, KVQ was built on extensive internal test sets and iteratively refined to handle content diversity, processing variations, and codec differences. KVQ now powers internal quality monitoring, adaptive encoding, and recommendation pipelines, and has been commercialized through the StreamLake service for external partners.
3D Digital‑Human Live Streaming and Interactive Solutions
Kuaishou’s visual interaction team presented a 3D digital‑human platform built on the KMIP virtual world interaction framework and the KVS virtual broadcasting assistant. In gaming scenarios, digital‑human anchors use KVS to appear as 3D avatars, guiding users through gameplay and enabling real‑time interaction, rewards, and immersive first‑person control.
This technology boosted small‑streamer revenue by over 50% and doubled live‑stream payment rates; a Valentine’s Day campaign featuring virtual anchors attracted 42.45 million viewers, with peak concurrent audiences exceeding 30,000.
Custom Performance Optimization Framework Based on TensorRT
Kuaishou’s algorithm engineers introduced an end‑to‑end sub‑graph optimization framework that leverages an AI compiler to analyze and trim performance‑critical sub‑graphs in ONNX graphs, generating optimized TensorRT plugins. This approach enhances inference throughput while reducing compute resource consumption.
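The talk does not publish the framework's internals, but the core idea of locating performance‑critical sub‑graphs can be illustrated with a toy pattern matcher. The graph representation, node names, and the `Conv → BatchNorm → Relu` fusion pattern below are illustrative assumptions, not Kuaishou's actual compiler pass; a real system would operate on ONNX protobufs and emit a fused TensorRT plugin for each matched chain.

```python
# Minimal sketch of sub-graph pattern matching of the kind an AI compiler
# performs before emitting a fused TensorRT plugin. The graph format and
# the fusion pattern are illustrative assumptions, not Kuaishou's framework.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                       # operator type, e.g. "Conv"
    name: str
    inputs: list = field(default_factory=list)    # names of producer nodes

def find_fusable_chains(nodes, pattern=("Conv", "BatchNorm", "Relu")):
    """Return chains of node names matching `pattern`, each node feeding the next."""
    by_name = {n.name: n for n in nodes}
    chains = []
    for n in nodes:
        if n.op != pattern[-1]:
            continue
        # Walk backwards through single-input producers, checking each op type.
        chain, cur, ok = [n.name], n, True
        for want in reversed(pattern[:-1]):
            if len(cur.inputs) != 1 or by_name[cur.inputs[0]].op != want:
                ok = False
                break
            cur = by_name[cur.inputs[0]]
            chain.append(cur.name)
        if ok:
            chains.append(list(reversed(chain)))
    return chains

graph = [
    Node("Conv", "conv1"),
    Node("BatchNorm", "bn1", ["conv1"]),
    Node("Relu", "relu1", ["bn1"]),
    Node("Add", "add1", ["relu1"]),
]
print(find_fusable_chains(graph))   # [['conv1', 'bn1', 'relu1']]
```

Each matched chain would then be cut out of the ONNX graph and replaced by a single fused node backed by an optimized TensorRT plugin, which is where the throughput gain comes from.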
Performance Optimization for Large‑Scale Recommendation Models
Facing massive recommendation models, Kuaishou balanced CPU and GPU workloads on a single server, simplifying deployment and maximizing resource utilization. By deeply optimizing CPU algorithms, improving GPU inference efficiency, and caching data on the GPU to cut DRAM accesses, GPU utilization rose from ~20% to nearly 90%, delivering a ten‑fold throughput increase.
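The caching idea can be sketched with a small LRU cache standing in for GPU memory: because recommendation traffic is heavily skewed toward hot IDs, keeping recently used embedding rows on the GPU lets most lookups skip the slow DRAM‑backed table. The class name, capacity, and access pattern below are illustrative assumptions, not Kuaishou's implementation.

```python
# Minimal sketch of hot-embedding caching: frequently hit embedding rows
# stay in fast (GPU) memory so most lookups avoid the DRAM-backed table.
# Names and sizes are illustrative, not Kuaishou's implementation.
from collections import OrderedDict

class EmbeddingCache:
    def __init__(self, backing_store, capacity):
        self.store = backing_store      # slow path: full table in host DRAM
        self.capacity = capacity        # rows that fit in GPU memory
        self.cache = OrderedDict()      # fast path: LRU-ordered hot rows
        self.hits = 0
        self.misses = 0

    def lookup(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)       # mark as most recently used
            return self.cache[key]
        self.misses += 1
        vec = self.store[key]                 # DRAM access (the expensive path)
        self.cache[key] = vec
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used row
        return vec

table = {i: [float(i)] * 4 for i in range(1000)}   # toy embedding table
cache = EmbeddingCache(table, capacity=8)
# Skewed access pattern: a few hot IDs dominate, as in recommendation traffic.
for _ in range(100):
    for key in (1, 2, 3):
        cache.lookup(key)
print(cache.hits, cache.misses)   # 297 3
```

With a skewed distribution, even a small on‑GPU cache absorbs nearly all lookups, which is the mechanism behind the reported jump in GPU utilization.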
Accelerating Multimodal Large Models for Short‑Video Applications
Addressing challenges of long training times, low inference efficiency, and complex deployment, Kuaishou developed a comprehensive solution encompassing mixed‑parallel training, inference optimization, and streamlined model deployment. Deployed across recommendation, advertising, search, and e‑commerce, these multimodal models achieve significant business gains with modest resource costs.
Overall, Kuaishou’s innovations demonstrate how AI‑driven video quality, digital‑human interaction, and large‑model acceleration can be integrated into a high‑traffic short‑video platform.
Kuaishou Large Model
Official Kuaishou Account