Applying Visual AI Techniques for Image Quality and Duplicate Detection in Xianyu Marketplace
By deploying large‑scale visual AI—including a ResNet‑101 classifier, ArcFace‑trained matching features, clustering‑based sub‑category refinement, and product‑level image indexing—Xianyu’s marketplace dramatically improves image quality, removes duplicates, enhances search relevance and feed diversity, and filters non‑compliant content.
Background: Xianyu, a free marketplace, receives millions of user-uploaded images daily, leading to issues such as duplicate images, mismatched descriptions, low-quality or non-product pictures, and illegal or gray-area content. These problems hurt user experience and brand reputation and increase regulatory risk.
Key problems include duplicate images, inconsistency between image and text, unsuitable image types, and policy-violating images (e.g., erotic or joke content used as the primary picture).
Visual AI can address these challenges. The article outlines three main technical directions: (1) building a large‑scale image classification model to learn Xianyu’s image distribution, (2) learning image‑matching features from the classification model, and (3) combining classification and feature learning to solve relevance and diversity issues.
Large-scale image classification model: the difficulties are low-quality user images, ambiguous category definitions, noisy titles, and high annotation cost. The pipeline consists of (a) basic image feature learning with ResNet-101, where softmax loss outperforms ArcFace, (b) clustering-based sample construction to refine sub-categories, and (c) training the final classifier with batch size 256 and a cosine learning-rate schedule with warm restarts, reaching 74% top-1 accuracy.
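The cosine learning-rate schedule with warm restarts used for the final classifier can be sketched as follows. This is a minimal illustration rather than the team's training code: the article states only the batch size and the schedule type, so the base rate, minimum rate, cycle length, and cycle multiplier below are assumed values.

```python
import math


def cosine_warm_restart_lr(step, base_lr=0.1, min_lr=0.0, cycle_len=10, cycle_mult=2):
    """Learning rate at `step` under cosine annealing with warm restarts.

    All default values are illustrative assumptions, not figures from the article.
    """
    if step < 0:
        raise ValueError("step must be non-negative")
    position, length = step, cycle_len
    # Walk forward through completed cycles; each cycle is cycle_mult times longer.
    while position >= length:
        position -= length
        length *= cycle_mult
    # Within a cycle the rate decays from base_lr toward min_lr along a half cosine.
    cosine = math.cos(math.pi * position / length)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + cosine)


if __name__ == "__main__":
    for step in (0, 5, 9, 10, 20, 29, 30):
        print(step, round(cosine_warm_restart_lr(step), 4))
```

The rate jumps back to `base_lr` at steps 10 and 30 in this configuration, which is the "warm restart". PyTorch provides a scheduler of this shape as `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`; the article does not say which framework or settings were used.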
Image-matching features: these are used for duplicate detection and same-style product identification. A DeepID-style architecture trained with ArcFace loss on clustered product images achieves 95% same-item recall and 79% SKU-level recall.
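To make the ArcFace objective concrete, the sketch below computes the loss for a single embedding. It is a simplified illustration, not the team's implementation: the scale of 30 and margin of 0.5 are assumed values, the function name is hypothetical, and the special handling that production code applies when the angle plus margin exceeds pi is omitted.

```python
import math


def _normalize(vector):
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def arcface_loss(embedding, class_weights, target, scale=30.0, margin=0.5):
    """Cross-entropy over cosine logits, with an additive angular margin
    applied to the target class only (scale and margin are assumed values)."""
    unit_embedding = _normalize(embedding)
    logits = []
    for index, weight in enumerate(class_weights):
        cos_theta = sum(a * b for a, b in zip(unit_embedding, _normalize(weight)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if index == target:
            # Penalize the target class by widening its angle by `margin` radians.
            cos_theta = math.cos(math.acos(cos_theta) + margin)
        logits.append(scale * cos_theta)
    # Numerically stable log-sum-exp for the softmax denominator.
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


if __name__ == "__main__":
    weights = [[1.0, 0.0], [0.0, 1.0]]
    print(arcface_loss([1.0, 1.0], weights, target=0, margin=0.0))
    print(arcface_loss([1.0, 1.0], weights, target=0, margin=0.5))
```

With the margin switched on, the same embedding incurs a larger loss, so training must pull same-product images closer together than plain softmax would require. That tighter grouping is what makes the resulting features useful for matching.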
Search relevance: the classifier predicts the top-3 categories for each product image, and results are grouped by those predictions; each result's confidence in the dominant category is then used to re-rank the list, improving click-through rate for queries such as “锐鲨” and “詹姆斯” (James).
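One way to read the re-ranking step is sketched below. The article does not specify how the dominant category is chosen, so this example assumes it is the category with the highest summed confidence across the result list; the function name and input layout are hypothetical.

```python
from collections import defaultdict


def rerank_by_dominant_category(results):
    """Re-rank search results by confidence in the query's dominant category.

    `results` is a list of (item_id, [(category, confidence), ...]) pairs in
    their original rank order, each carrying the classifier's top-3 predictions.
    Returns (dominant_category, re-ranked item ids).
    """
    if not results:
        return None, []
    votes = defaultdict(float)
    for _, predictions in results:
        for category, confidence in predictions:
            votes[category] += confidence
    # Assumption: the dominant category is the one with the largest total confidence.
    dominant = max(votes, key=votes.get)

    def dominant_confidence(entry):
        return dict(entry[1]).get(dominant, 0.0)

    # sorted() is stable, so items with equal confidence keep their original order.
    ranked = sorted(results, key=dominant_confidence, reverse=True)
    return dominant, [item_id for item_id, _ in ranked]


if __name__ == "__main__":
    hits = [
        ("a", [("toy", 0.6), ("shoe", 0.3), ("bag", 0.1)]),
        ("b", [("shoe", 0.9), ("bag", 0.05), ("toy", 0.05)]),
        ("c", [("shoe", 0.7), ("toy", 0.2), ("bag", 0.1)]),
    ]
    print(rerank_by_dominant_category(hits))
```

In a real system this visual score would more plausibly be blended with the existing text relevance score than used as the sole sort key.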
Feed diversity: image-based category predictions break up overly concentrated results, especially when listings carry different user-defined categories but look visually similar (e.g., different “华为mate Xs” (Huawei Mate Xs) listings).
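A simple way to break up concentrated results is to cap how many consecutive feed items may share a predicted visual category. The greedy sketch below illustrates that idea; the article does not describe the actual diversification rule, so the run limit and function name are assumptions.

```python
def diversify(feed, max_run=2):
    """Reorder a feed so that at most `max_run` consecutive items share a
    predicted category, whenever a different category is still available.

    `feed` is a list of (item_id, predicted_category) pairs in rank order.
    """
    if max_run < 1:
        raise ValueError("max_run must be at least 1")
    remaining = list(feed)
    output = []
    while remaining:
        pick = 0  # default: keep the original ranking
        tail = [category for _, category in output[-max_run:]]
        if len(tail) == max_run and len(set(tail)) == 1:
            blocked = tail[0]
            # Pull forward the best-ranked item from a different category;
            # fall back to the head of the list if none exists.
            pick = next(
                (i for i, (_, category) in enumerate(remaining) if category != blocked),
                0,
            )
        output.append(remaining.pop(pick))
    return output


if __name__ == "__main__":
    feed = [("p1", "phone"), ("p2", "phone"), ("p3", "phone"),
            ("c1", "case"), ("p4", "phone")]
    print([item_id for item_id, _ in diversify(feed)])
```

Because the category comes from the image model rather than the seller's own label, visually identical listings filed under different user-defined categories are still treated as one group.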
Duplicate removal: an offline pipeline builds a product‑level image index using product quantization over 1.2 billion images, matches new items against the index, and updates a KV store to de‑duplicate results at query time.
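The flow of this pipeline (match a new item against the index, record its duplicate group in a KV store, then collapse groups at query time) can be sketched as follows. To keep the example self-contained, a brute-force cosine search over an in-memory dictionary stands in for the product-quantization index, and a plain dictionary stands in for the KV store; the class name, method names, and 0.95 threshold are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class DuplicateIndex:
    def __init__(self, threshold=0.95):
        self.threshold = threshold  # assumed similarity cut-off for "duplicate"
        self.vectors = {}           # item_id -> feature (stand-in for the PQ index)
        self.kv = {}                # item_id -> canonical item id of its group

    def add(self, item_id, feature):
        """Offline step: match a new item and record its duplicate group."""
        best_id, best_sim = None, self.threshold
        for other_id, other in self.vectors.items():
            similarity = cosine(feature, other)
            if similarity >= best_sim:
                best_id, best_sim = other_id, similarity
        self.vectors[item_id] = feature
        # Join the matched item's group, or start a new group of one.
        self.kv[item_id] = self.kv[best_id] if best_id is not None else item_id
        return self.kv[item_id]

    def dedupe(self, ranked_ids):
        """Query-time step: keep only the best-ranked item of each group."""
        seen, kept = set(), []
        for item_id in ranked_ids:
            group = self.kv.get(item_id, item_id)
            if group not in seen:
                seen.add(group)
                kept.append(item_id)
        return kept


if __name__ == "__main__":
    index = DuplicateIndex()
    index.add("a", [1.0, 0.0])
    index.add("b", [0.999, 0.01])  # near-duplicate of "a"
    index.add("c", [0.0, 1.0])
    print(index.dedupe(["b", "a", "c"]))
```

Doing the expensive matching offline and leaving only a key lookup for query time is what keeps de-duplication cheap on the serving path. At the scale the article mentions, the brute-force loop would be replaced by an approximate nearest-neighbor search over quantized codes.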
Non‑compliant product filtering: combines generic moderation models, OCR, and a curated violation‑image library to block pornographic, political, or sensational images while preserving legitimate items.
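A minimal sketch of how the three signals might be combined is shown below. The article does not give the decision rule, so this example assumes a simple "any signal fires" policy; the function name, the 0.9 threshold, and the use of an exact image hash for the library lookup are all hypothetical.

```python
def violation_reasons(moderation_score, ocr_text, image_hash,
                      banned_terms, violation_hashes, score_threshold=0.9):
    """Return the list of signals that flag an image; an empty list means it passes.

    moderation_score: output of a generic moderation model, assumed to lie in [0, 1].
    ocr_text:         text extracted from the image by OCR.
    image_hash:       lookup key into the curated violation-image library.
    """
    reasons = []
    if moderation_score >= score_threshold:
        reasons.append("moderation_model")
    lowered = ocr_text.lower()
    if any(term.lower() in lowered for term in banned_terms):
        reasons.append("ocr_text")
    if image_hash in violation_hashes:
        reasons.append("violation_library")
    return reasons


if __name__ == "__main__":
    print(violation_reasons(0.2, "Brand new, never used", "h1",
                            banned_terms={"forbidden"}, violation_hashes={"h9"}))
    print(violation_reasons(0.95, "FORBIDDEN offer", "h9",
                            banned_terms={"forbidden"}, violation_hashes={"h9"}))
```

Returning the reasons instead of a bare boolean makes it easier to audit false positives, which matters for the stated goal of preserving legitimate items.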
Conclusion: Visual techniques (classification, feature learning, retrieval) are essential for improving quality, relevance, diversity, and compliance in Xianyu’s massive user‑generated content, but they must be integrated with text signals, user feedback, and robust engineering pipelines.
Xianyu Technology
Official account of the Xianyu technology team