How Kuaishou Elevates Video Quality and AI Performance at NVIDIA GTC 2023
At NVIDIA GTC 2023, Kuaishou engineers presented solutions spanning video quality assessment and enhancement, 3D digital‑human live streaming, a custom TensorRT‑based performance‑optimization framework, large‑scale recommendation‑model acceleration, and multimodal large‑model deployment for short‑video scenarios.
Video Quality Evaluation and Enhancement
Kuaishou processes tens of millions of UGC short videos daily, and to deliver clearer visuals each video passes through a rigorous pipeline. The team introduced the KVQ video quality assessment algorithm and the KRP/KEP enhancement solution, which together significantly improve perceived clarity on the consumer side.
Using AI‑driven methods, KVQ was built on extensive internal test sets and iteratively refined to handle content diversity, processing variations, and codec differences. KVQ now powers internal quality monitoring, adaptive encoding, and recommendation pipelines, and has been commercialized through the StreamLake service for external partners.
3D Digital‑Human Live Streaming and Interactive Solutions
Kuaishou’s visual interaction team presented a 3D digital‑human platform built on the KMIP virtual world interaction framework and the KVS virtual broadcasting assistant. In gaming scenarios, digital‑human anchors use KVS to appear as 3D avatars, guiding users through gameplay and enabling real‑time interaction, rewards, and immersive first‑person control.
This technology boosted small‑streamer revenue by over 50% and doubled live‑stream payment rates; a Valentine’s Day campaign featuring virtual anchors attracted 42.45 million viewers, with peak concurrent audiences exceeding 30,000.
Custom Performance Optimization Framework Based on TensorRT
Kuaishou’s algorithm engineers introduced an end‑to‑end sub‑graph optimization framework that leverages an AI compiler to analyze and trim performance‑critical sub‑graphs in ONNX graphs, generating optimized TensorRT plugins. This approach enhances inference throughput while reducing compute resource consumption.
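The talk does not publish the framework's internals, but the core idea of locating performance‑critical sub‑graphs can be illustrated with a toy pattern matcher. The graph representation, node names, and the `Conv → BatchNorm → Relu` fusion pattern below are illustrative assumptions, not Kuaishou's actual compiler pass; a real system would operate on ONNX protobufs and emit a fused TensorRT plugin for each matched chain.

```python
# Minimal sketch of sub-graph pattern matching of the kind an AI compiler
# performs before emitting a fused TensorRT plugin. The graph format and
# the fusion pattern are illustrative assumptions, not Kuaishou's framework.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                       # operator type, e.g. "Conv"
    name: str
    inputs: list = field(default_factory=list)    # names of producer nodes

def find_fusable_chains(nodes, pattern=("Conv", "BatchNorm", "Relu")):
    """Return chains of node names matching `pattern`, each node feeding the next."""
    by_name = {n.name: n for n in nodes}
    chains = []
    for n in nodes:
        if n.op != pattern[-1]:
            continue
        # Walk backwards through single-input producers, checking each op type.
        chain, cur, ok = [n.name], n, True
        for want in reversed(pattern[:-1]):
            if len(cur.inputs) != 1 or by_name[cur.inputs[0]].op != want:
                ok = False
                break
            cur = by_name[cur.inputs[0]]
            chain.append(cur.name)
        if ok:
            chains.append(list(reversed(chain)))
    return chains

graph = [
    Node("Conv", "conv1"),
    Node("BatchNorm", "bn1", ["conv1"]),
    Node("Relu", "relu1", ["bn1"]),
    Node("Add", "add1", ["relu1"]),
]
print(find_fusable_chains(graph))   # [['conv1', 'bn1', 'relu1']]
```

Each matched chain would then be cut out of the ONNX graph and replaced by a single fused node backed by an optimized TensorRT plugin, which is where the throughput gain comes from.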
Performance Optimization for Large‑Scale Recommendation Models
Facing massive recommendation models, Kuaishou balanced CPU and GPU workloads on a single server, simplifying deployment and maximizing resource utilization. By deeply optimizing CPU algorithms, improving GPU inference efficiency, and caching data on the GPU to cut DRAM accesses, GPU utilization rose from ~20% to nearly 90%, delivering a ten‑fold throughput increase.
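The caching idea can be sketched with a small LRU cache standing in for GPU memory: because recommendation traffic is heavily skewed toward hot IDs, keeping recently used embedding rows on the GPU lets most lookups skip the slow DRAM‑backed table. The class name, capacity, and access pattern below are illustrative assumptions, not Kuaishou's implementation.

```python
# Minimal sketch of hot-embedding caching: frequently hit embedding rows
# stay in fast (GPU) memory so most lookups avoid the DRAM-backed table.
# Names and sizes are illustrative, not Kuaishou's implementation.
from collections import OrderedDict

class EmbeddingCache:
    def __init__(self, backing_store, capacity):
        self.store = backing_store      # slow path: full table in host DRAM
        self.capacity = capacity        # rows that fit in GPU memory
        self.cache = OrderedDict()      # fast path: LRU-ordered hot rows
        self.hits = 0
        self.misses = 0

    def lookup(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)       # mark as most recently used
            return self.cache[key]
        self.misses += 1
        vec = self.store[key]                 # DRAM access (the expensive path)
        self.cache[key] = vec
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict least recently used row
        return vec

table = {i: [float(i)] * 4 for i in range(1000)}   # toy embedding table
cache = EmbeddingCache(table, capacity=8)
# Skewed access pattern: a few hot IDs dominate, as in recommendation traffic.
for _ in range(100):
    for key in (1, 2, 3):
        cache.lookup(key)
print(cache.hits, cache.misses)   # 297 3
```

With a skewed distribution, even a small on‑GPU cache absorbs nearly all lookups, which is the mechanism behind the reported jump in GPU utilization.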
Accelerating Multimodal Large Models for Short‑Video Applications
Addressing challenges of long training times, low inference efficiency, and complex deployment, Kuaishou developed a comprehensive solution encompassing mixed‑parallel training, inference optimization, and streamlined model deployment. Deployed across recommendation, advertising, search, and e‑commerce, these multimodal models achieve significant business gains with modest resource costs.
Overall, Kuaishou’s innovations demonstrate how AI‑driven video quality, digital‑human interaction, and large‑model acceleration can be integrated into a high‑traffic short‑video platform.
Kuaishou Large Model
Official Kuaishou Account