
Kuaishou Showcases AI Innovations at CVPR 2024: Competitions, Large‑Model Demonstrations, and Research Highlights

At CVPR 2024 in Seattle, Kuaishou presented its latest AI research and applications: a co-hosted evening gala, a short-video quality competition, large-model video generation demos, a paper on multi-dimensional text-to-image evaluation, and video processing and coding technologies, underscoring its ties with the academic community.

Kuaishou Tech

The Computer Vision and Pattern Recognition Conference (CVPR) took place in Seattle from June 17 to 21, 2024, drawing a record attendance of more than 12,000 participants. As a silver sponsor, Kuaishou used the event to showcase its latest research and practical applications in computer vision.

Kuaishou co-hosted the "CVPR 2024 Star-Gazing Elite Gala" with partners HiDream.ai, Future, and GirlUp. In the opening speech, Yu Bing, Kuaishou Senior Vice President and head of R&D, emphasized the importance of deep collaboration between industry and scholars worldwide in driving AI innovation.

The "Kuaishou Visual Quality" short‑video quality competition, jointly organized with the University of Science and Technology of China’s Intelligent Media Computing Lab, concluded at the CVPR 2024 NTIRE workshop. More than 200 teams participated over four months, and the top three winners were SJTU MMLab, IH‑VQA, and TVQE.

Kuaishou also presented its self-developed large model "Kling" (可灵) at the conference. The company describes Kling as offering Sora-level video generation: users can create artistic videos from static images (image-to-video) and extend existing videos to up to three minutes with a one-click continuation feature. More than 370,000 users have already applied for access to Kling.

Kuaishou's Kolors (可图) team presented the paper "Learning Multi-dimensional Human Preference for Text-to-Image Generation". The work introduces a Multi-dimensional Preference Score (MPS), built on CLIP with an added preference-conditioning module and trained on a self-collected dataset (MHP) containing nearly one million human preference selections. MPS evaluates generated images along four dimensions: aesthetics, semantic alignment, detail quality, and overall assessment.
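To make the idea of a dimension-conditioned preference score more concrete, here is a minimal sketch in plain Python. It is not the MPS architecture from the paper: the embeddings are toy vectors rather than CLIP features, and `conditioned_score`, `condition_weights`, and `preference_probability` are hypothetical names chosen for illustration. It only shows the general pattern of reweighting features for one preference dimension, scoring each image against the prompt, and turning two scores into a pairwise preference probability with a two-way softmax.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def conditioned_score(image_emb, text_emb, condition_weights):
    """Score an image against a prompt for one preference dimension.

    condition_weights is a hypothetical per-feature relevance mask standing in
    for a learned preference-conditioning module (an assumption, not the paper's design).
    """
    img = [w * x for w, x in zip(condition_weights, image_emb)]
    txt = [w * x for w, x in zip(condition_weights, text_emb)]
    return cosine(img, txt)


def preference_probability(score_a, score_b, temperature=1.0):
    """Probability that image A is preferred over image B (two-way softmax)."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b) / temperature))


# Toy example: three-dimensional "embeddings" and a mask that ignores feature 1.
text_emb = [1.0, 0.0, 1.0]
image_a = [1.0, 0.5, 1.0]
image_b = [0.2, 1.0, 0.1]
mask = [1.0, 0.0, 1.0]

score_a = conditioned_score(image_a, text_emb, mask)
score_b = conditioned_score(image_b, text_emb, mask)
p_a_over_b = preference_probability(score_a, score_b)
```

In a trained model, the pairwise probability would be compared against recorded human choices for each dimension, which is the role a preference dataset such as MHP plays.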

Kuaishou’s audio‑video team also highlighted advances in video processing and coding. They introduced the Kuaishou Visual Quality (KVQ) assessment tool based on multi‑path temporal networks and sparse temporal attention, the Kuaishou Enhancement Processing (KEP) suite for super‑resolution, denoising, de‑blurring, and color enhancement, and the Kuaishou Video Coding (KVC) encoder that balances quality and bitrate, especially under bandwidth constraints.
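The trade-off between quality and bitrate under a bandwidth limit can be illustrated with a generic selection rule. The sketch below does not describe how KVC works internally; `Rendition`, `pick_rendition`, the quality scores, and the `headroom` factor are all hypothetical. It simply picks the highest-quality encoding whose bitrate fits within a share of the available bandwidth, and falls back to the lowest bitrate when nothing fits.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rendition:
    name: str
    bitrate_kbps: int
    quality: float  # hypothetical perceptual quality score, higher is better


def pick_rendition(
    renditions: Sequence[Rendition],
    bandwidth_kbps: float,
    headroom: float = 0.8,
) -> Optional[Rendition]:
    """Pick the best-quality rendition that fits the bandwidth budget.

    headroom reserves part of the bandwidth for fluctuations (an assumed default).
    Returns None only when no renditions are provided.
    """
    if not renditions:
        return None
    budget = bandwidth_kbps * headroom
    feasible = [r for r in renditions if r.bitrate_kbps <= budget]
    if not feasible:
        # Nothing fits: degrade gracefully to the cheapest stream.
        return min(renditions, key=lambda r: r.bitrate_kbps)
    # Prefer higher quality; break ties with the lower bitrate.
    return max(feasible, key=lambda r: (r.quality, -r.bitrate_kbps))


ladder = [
    Rendition("low", 500, 60.0),
    Rendition("mid", 1500, 78.0),
    Rendition("high", 4000, 90.0),
]
choice = pick_rendition(ladder, bandwidth_kbps=2000)
```

A real encoder makes this trade-off at a much finer grain (per frame or per block), but the constraint has the same shape: maximize quality subject to a bitrate budget.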

The conference also featured the AI virtual human "Guan Xiaofang", a fully AI‑driven avatar powered by Kuaishou’s "Kuaiyi" large model, automatic speech recognition, text‑to‑speech, and digital‑human generation technologies, enabling real‑time multimodal interaction with users.

CVPR 2024 accepted 329 papers on image and video synthesis and generation, a sign of how quickly the field is growing. Kuaishou's demonstrations highlighted its work in generative AI, where large models are already deployed across its platform in audio and video processing, foundation models, and the wider business ecosystem.
