Kuaishou’s CVPR 2021 Paper Highlights: 3D Vision, Domain Adaptation, Point Cloud Completion, Video Segmentation, and Face Forgery Detection
Kuaishou secured 14 accepted papers at CVPR 2021, spanning 3D hand mesh recovery, unsupervised keypoint detection, point cloud completion, modular interactive video segmentation, deep video matting, co‑salient object detection, occlusion‑aware instance segmentation, semantic image matting, and face forgery detection, showcasing the maturity of its research collaborations.
1. Camera‑Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D‑1D Registration – This work introduces semantic aggregation and an adaptive 2D‑1D registration scheme to reconstruct hand meshes directly in camera space, addressing the 2D‑to‑3D reconstruction problem without multi‑view images or 3D sensors.
Paper link: https://arxiv.org/abs/2103.02845
2. Regressive Domain Adaptation for Unsupervised Keypoint Detection – The paper tackles the high cost of labeled data by adapting models trained on a labeled source domain to an unlabeled target domain, revealing that prediction errors concentrate on specific keypoint regions, which motivates a regression‑to‑classification reformulation.
Paper link: https://arxiv.org/abs/2103.06175
3. Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding – Inspired by CycleGAN, this method learns a bidirectional transformation between incomplete and complete point clouds using a cycle‑consistency loss and asymmetric shape constraints, achieving SOTA performance on the 3D‑EPN dataset without paired supervision.
Paper link: https://arxiv.org/abs/2103.07838
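To make the cycle‑consistency idea concrete, here is a minimal sketch in numpy. The two "generators" below are hypothetical identity placeholders standing in for the paper's learned incomplete→complete and complete→incomplete networks; only the loss computation (a symmetric Chamfer distance between the original and cycle‑reconstructed clouds) is shown in full.

```python
import numpy as np

def chamfer_distance(a, b):
    # Symmetric Chamfer distance between point sets of shape (N, 3) and (M, 3).
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Hypothetical stand-ins for the learned mappings between the incomplete
# and complete shape domains (identity here, purely for illustration).
def f_incomplete_to_complete(pts):
    return pts

def g_complete_to_incomplete(pts):
    return pts

rng = np.random.default_rng(0)
partial = rng.standard_normal((128, 3))

# Cycle consistency: mapping a partial cloud to the complete domain and
# back should reconstruct the original partial cloud.
cycle = g_complete_to_incomplete(f_incomplete_to_complete(partial))
loss_cycle = chamfer_distance(partial, cycle)  # 0.0 here, since the placeholders are identities
```

In the actual method the two mappings are trained networks and the cycle loss is combined with the paper's missing‑region coding; this sketch only illustrates the unpaired supervision signal.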
4. PMP‑Net: Point Cloud Completion by Learning Multi‑step Point Moving Paths – PMP‑Net bypasses direct shape prediction by iteratively moving points from a partial cloud toward a complete shape, addressing the difficulty of generating unordered point sets with generative networks.
Paper link: https://arxiv.org/abs/2012.03408
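The multi‑step point‑moving idea can be sketched as an iterative refinement loop. The `toy_predictor` below is a hypothetical stand‑in for PMP‑Net's learned per‑step displacement network; the loop structure (apply a displacement per step and accumulate the total moving distance) is the part being illustrated.

```python
import numpy as np

def move_points(partial, predict_delta, steps=3):
    """Multi-step point moving: each step predicts a displacement for
    every point and applies it, refining the shape iteratively.
    `predict_delta` stands in for the learned per-step network."""
    pts = partial.copy()
    path_length = 0.0
    for k in range(steps):
        delta = predict_delta(pts, k)
        path_length += np.linalg.norm(delta, axis=1).sum()  # accumulated moving distance
        pts = pts + delta
    return pts, path_length

# Hypothetical "predictor": move each point a fraction of the way toward
# a toy target shape; a real model would learn these displacements.
target = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])

def toy_predictor(pts, step, steps=3):
    remaining = steps - step
    return (target - pts) / remaining  # shrinking steps toward the target

partial = np.zeros((2, 3))
completed, dist = move_points(partial, toy_predictor, steps=3)
# completed now coincides with `target`; dist is the total path length moved.
```

Regularizing the accumulated path length is one natural way to encourage short, consistent point trajectories, which is in the spirit of the paper's "minimal moving path" formulation.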
5. Modular Interactive Video Object Segmentation: Interaction‑to‑Mask, Propagation and Difference‑Aware Fusion – The algorithm decomposes interactive video object segmentation into three decoupled modules, enabling efficient mask generation with minimal user interaction and achieving state‑of‑the‑art results on the DAVIS benchmark.
Paper link: https://arxiv.org/abs/2103.07941
6. Deep Video Matting via Spatio‑Temporal Alignment and Aggregation – This two‑stage framework propagates a sparse trimap across frames and aggregates temporal information without computing optical flow, delivering high‑quality video matting and introducing a large synthetic video‑matting dataset.
Paper link: https://arxiv.org/abs/2104.11208
7. Group Collaborative Learning for Co‑Salient Object Detection – By incorporating category‑conditioned information during training, the method improves discrimination among co‑salient objects across image groups, outperforming previous approaches on standard benchmarks.
Paper link: https://arxiv.org/abs/2104.01108
8. Deep Occlusion‑Aware Instance Segmentation with Overlapping BiLayers – BCNet models each Region of Interest as two overlapping layers (occluder and occluded) to explicitly handle heavy occlusions, achieving significant gains on COCO and KINS datasets.
Paper link: https://arxiv.org/abs/2103.12340
9. Semantic Image Matting – The work formulates image matting as a soft‑segmentation problem, leveraging deep CNNs to predict alpha mattes from RGB images and trimaps, and introduces a large synthetic dataset to advance the field.
Paper link: https://arxiv.org/abs/2104.08201
10. Frequency‑aware Discriminative Feature Learning Supervised by Single‑Center Loss for Face Forgery Detection – To improve deep‑fake detection, the authors propose a single‑center loss that enforces intra‑class compactness and inter‑class separability, achieving superior performance on mixed real‑and‑synthetic face datasets.
Paper link: https://arxiv.org/abs/2103.09096
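A simplified reading of the single‑center loss can be written in a few lines: real‑face features are pulled toward their own center, while fake‑face features must sit farther from that center by a margin. This numpy sketch is an assumption‑laden illustration of the idea, not the paper's exact formulation (which also involves frequency‑aware features and a dimension‑scaled margin).

```python
import numpy as np

def single_center_loss(features, labels, margin=0.1):
    """Simplified single-center loss sketch.

    features: (N, D) embeddings; labels: (N,) with 0 = real face, 1 = fake.
    Real features are made compact around their center; fake features are
    pushed at least `margin` farther from that center than real ones.
    """
    real = features[labels == 0]
    fake = features[labels == 1]
    center = real.mean(axis=0)  # single center, computed from real faces only
    d_real = np.linalg.norm(real - center, axis=1).mean()
    d_fake = np.linalg.norm(fake - center, axis=1).mean()
    # Intra-class compactness term + hinge enforcing inter-class separation.
    return d_real + max(d_real - d_fake + margin, 0.0)
```

Using one center (for real faces) rather than one per class reflects the intuition that genuine faces form a compact distribution while forgeries from different generators are heterogeneous and only need to be pushed away.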
Kuaishou Tech
Official Kuaishou tech account, providing real-time updates on the latest Kuaishou technology practices.