AR Foot Measurement and Hand Try-On Algorithms for Mobile Vision
The article presents a mobile-vision solution that combines lightweight detection, line detection, segmentation, and 3-D point-cloud reconstruction to measure foot length to within 3 mm, and a MANO-based hand try-on system that directly predicts full mesh vertices for real-time watch, phone, and ring fitting on smartphones.
With the rise of smartphone computing power, AR/VR applications such as AR foot measurement and hand try‑on have attracted significant attention due to their broad commercial potential.
This article introduces the AR foot‑measurement solution implemented in the PixelAI mobile vision library. The pipeline fuses target detection, line detection, image segmentation and 3‑D point‑cloud reconstruction to achieve foot‑length errors within 3 mm while maintaining a smooth user experience.
Key components include:
Foot detection using a lightweight NanoDet‑Plus model, ensuring the foot is present, barefoot, and positioned in the designated U‑shaped region.
Ground‑plane reconstruction from LiDAR depth maps via RANSAC, providing a stable reference plane.
Line detection (customized MLSD/TP‑LSD) to automatically align the ground reference line with the screen guide.
Foot segmentation using U2‑Net to obtain accurate masks for 3‑D point‑cloud generation.
3‑D point‑cloud reconstruction and bounding‑box calculation to derive foot length and width, followed by a calibration step that refines the bounding box using five user‑adjustable points.
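The ground-plane step above can be sketched with a plain RANSAC plane fit over back-projected LiDAR depth points. This is a minimal numpy illustration, not the library's implementation; the function name, iteration count, and inlier threshold are all assumptions chosen for clarity.

```python
import numpy as np

def ransac_plane(points, iters=200, thresh=0.01, rng=None):
    """Fit a plane n.x + d = 0 to a point cloud with RANSAC.

    points: (N, 3) array of 3-D points (e.g. back-projected LiDAR depth).
    thresh: inlier distance in the same units as the points (here metres;
    an illustrative value, not the production setting).
    Returns (unit normal, d, inlier mask).
    """
    rng = np.random.default_rng(rng)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(iters):
        # Sample 3 distinct points and form a candidate plane.
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        norm = np.linalg.norm(n)
        if norm < 1e-9:          # degenerate (collinear) sample
            continue
        n = n / norm
        d = -n.dot(p0)
        dist = np.abs(points @ n + d)
        inliers = dist < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refine with a least-squares fit on the inliers: the plane normal is
    # the singular vector of the centred inliers with smallest singular value.
    inl = points[best_inliers]
    centroid = inl.mean(axis=0)
    n = np.linalg.svd(inl - centroid)[2][-1]
    return n, -n.dot(centroid), best_inliers
```

The least-squares refinement at the end is what gives the "stable reference plane" its robustness: the random 3-point hypothesis only selects the inlier set, while the final normal averages over all inliers.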
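Once the segmented foot points are lifted to 3-D, length and width can be read off an oriented bounding box in the ground plane. The sketch below (a simplified assumption of how such a measurement could work, with hypothetical names) projects the points onto the plane from the RANSAC step and uses PCA to find the heel-to-toe axis.

```python
import numpy as np

def foot_length_width(foot_points, plane_n, plane_d):
    """Estimate foot length/width from a segmented 3-D point cloud.

    foot_points: (N, 3) points inside the foot mask, back-projected to 3-D.
    plane_n, plane_d: ground plane n.x + d = 0 from the plane-fitting step.
    """
    n = plane_n / np.linalg.norm(plane_n)
    # Project every point onto the ground plane.
    dist = foot_points @ n + plane_d
    flat = foot_points - np.outer(dist, n)
    # PCA in the plane: the largest principal axis is heel-to-toe.
    centred = flat - flat.mean(axis=0)
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    axis_len, axis_wid = vt[0], vt[1]
    length = np.ptp(centred @ axis_len)   # extent along the main axis
    width = np.ptp(centred @ axis_wid)    # extent across it
    return length, width
```

The five-point calibration described above would then nudge the corners of this box, rather than re-run the reconstruction.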
The UI design employs an orange/green U‑shaped overlay and concise GIF tutorials to guide users, reducing mis‑operations and improving measurement speed.
The article also covers AR hand‑try‑on techniques, including:
Hand 3‑D reconstruction based on the MANO model, with two design families: indirect pose/shape parameter regression and direct mesh vertex regression. The chosen approach directly predicts 768 mesh vertices for higher fidelity.
A lightweight network architecture that combines a residual backbone, a 2‑D keypoint branch, and a camera‑parameter branch to enable real‑time inference on mobile devices.
AR layer algorithms for watch, phone and ring try‑on. Watch and phone try‑on use wrist or palm keypoints with PnP to compute rotation and translation matrices. Ring try‑on leverages 3‑D finger‑joint coordinates and a custom look‑rotation strategy to handle occlusions and multi‑camera setups.
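The watch/phone placement step can be illustrated with a textbook PnP solve: given 3-D model keypoints (e.g. wrist or palm joints from the predicted mesh) and their 2-D detections, recover the rotation and translation that project one onto the other. The Direct Linear Transform below is a minimal self-contained sketch, not the production solver (which would typically add iterative refinement, e.g. OpenCV's solvePnP); all names and numbers are illustrative.

```python
import numpy as np

def solve_pnp_dlt(obj_pts, img_pts, K):
    """Recover camera rotation R and translation t from 3D-2D matches.

    obj_pts: (N, 3) model keypoints, img_pts: (N, 2) pixel detections,
    K: 3x3 camera intrinsics. DLT needs N >= 6 non-coplanar points.
    """
    A = []
    for (X, Y, Z), (u, v) in zip(obj_pts, img_pts):
        A.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        A.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    # The smallest right singular vector of A is the projection P up to scale.
    P = np.linalg.svd(np.asarray(A))[2][-1].reshape(3, 4)
    M = np.linalg.inv(K) @ P
    # Fix the scale so the rotation part has unit-norm rows.
    M /= np.mean([np.linalg.norm(M[i, :3]) for i in range(3)])
    if np.linalg.det(M[:, :3]) < 0:      # resolve the sign ambiguity
        M = -M
    # Snap the 3x3 part to the nearest proper rotation matrix.
    U, _, Vt = np.linalg.svd(M[:, :3])
    return U @ Vt, M[:, 3]
```

The returned (R, t) pair is exactly what the AR layer needs to anchor the watch or phone model to the wrist in camera space.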
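For the ring, a "look rotation" orients the model along the finger: build an orthonormal basis whose forward axis follows the segment between two 3-D finger-joint keypoints, with an up hint fixing the roll. This is a generic sketch of that idea (in the style of game-engine LookRotation helpers), not the article's exact strategy; the joint coordinates below are made up.

```python
import numpy as np

def look_rotation(forward, up=np.array([0.0, 1.0, 0.0])):
    """Rotation matrix whose +Z column points along `forward`.

    Columns of the result are the right/up/forward basis vectors, so it is
    a proper rotation (orthonormal, det = +1).
    """
    f = forward / np.linalg.norm(forward)
    r = np.cross(up, f)
    norm = np.linalg.norm(r)
    if norm < 1e-6:                # forward parallel to up: use another hint
        r = np.cross(np.array([1.0, 0.0, 0.0]), f)
        norm = np.linalg.norm(r)
    r /= norm
    u = np.cross(f, r)             # recomputed up, orthogonal by construction
    return np.column_stack([r, u, f])

# Orient a ring along the segment between two finger joints (made-up values).
joint_a = np.array([0.00, 0.00, 0.30])   # proximal joint
joint_b = np.array([0.02, 0.01, 0.33])   # next joint toward the fingertip
R_ring = look_rotation(joint_b - joint_a)
```

Because the roll around the finger axis is unconstrained by two joints alone, the up hint is where occlusion handling and multi-camera consistency logic would plug in.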
Experimental results show that the proposed methods achieve accuracy comparable to, or better than, several server‑side SOTA models while meeting real‑time constraints on typical mobile hardware.
The paper concludes with a discussion of the future potential of AR in e‑commerce, education, and healthcare, and lists relevant references.
DaTaobao Tech
Official account of DaTaobao Technology