Sep 25, 2023 · Artificial Intelligence

LPR4M: A Large-Scale Multimodal Livestreaming Product Recognition Dataset and the RICE Cross‑View Semantic Alignment Model

This paper introduces LPR4M, a 4‑million‑pair multimodal dataset for livestreaming product recognition, and proposes RICE, a model that combines instance‑level contrastive learning with patch‑level cross‑view semantic alignment. RICE achieves state‑of‑the‑art performance on both the LPR4M and MovingFashion benchmarks.

cross-view alignment · deep learning · livestreaming
