CVPR 2026: Learning Camera Pose from 10M Unlabeled Driving Videos
LA‑Pose shows that a model can learn accurate camera pose estimation for autonomous driving in two stages: self‑supervised pretraining on roughly ten million unlabeled driving video clips, followed by fine‑tuning on only a small amount of high‑quality 3D annotations. The approach achieves accuracy gains of over 10% while drastically reducing labeling cost.
