Artificial Intelligence · 16 min read

Automated Image Review System for Second‑Hand Product Listings on the Zhuanzhuan Platform

This article describes how Zhuanzhuan’s B2C marketplace implemented an automated image review system, using computer‑vision techniques such as image matching, regression, and detection to verify product‑image consistency, clarity, anti‑tamper labels, cleanliness, and centering, cutting manual review workload by 50%.

Zhuanzhuan Tech

1. Product Review Background Introduction

Zhuanzhuan is an e‑commerce platform focused on second‑hand goods. Depending on the transaction parties, it supports C2C, C2B, and B2C models: individual users sell on the free market (C2C), the company offers mail‑in and on‑site collection for phones (C2B), and it provides official verification, warranty, and after‑sale services for B2C items. This article focuses on the image‑algorithm applications in the B2C product‑listing review process.

Because second‑hand items are non‑standard, even items with the same SKU can vary in condition. To improve user experience and transparency, Zhuanzhuan displays real photos rather than rendered images, which creates the need to verify the accuracy and quality of every product’s display images.

In the early stage, all listing images were manually inspected for:

Whether the displayed image matches the actual product, avoiding “wrong‑item” situations;

Whether the photo is clear, as batch shooting sometimes leads to focus errors;

For electronics such as phones and tablets, ensuring no dirt and that anti‑tamper stickers are applied to key areas;

Proper cropping so that the product is centered in the app display.

As the business grew, the volume of daily listings increased and manual review showed several drawbacks:

Review work is repetitive and tiring, leading to higher error rates;

Clarity judgments are subjective, making it hard to align standards across reviewers;

Manual throughput cannot keep up with listing volume, creating a bottleneck.

To address these repetitive tasks, we applied computer‑vision techniques (image matching, regression, and detection) to assist human judgment, improving both accuracy and efficiency.

2. Automated Review Solution

The review covers the following items:

Consistency between product display image and corresponding SKU;

Image clarity;

Presence of anti‑tamper stickers;

Presence of dirt;

Whether the product is centered in the image.

| Item to Review | Solution |
| --- | --- |
| Image–SKU consistency | Image matching |
| Image clarity | Regression |
| Anti‑tamper sticker | Detection |
| Dirtiness | Detection |
| Centering | Detection |

2.1 Image‑SKU Consistency

During listing, mismatches can occur; for example, the SKU says “iPhone 11, red” but the displayed photo shows a green iPhone X. This is essentially an image‑classification problem, but a plain classifier has two major issues:

The category set is closed: the model must output one of the known SKUs even when the input image belongs to none of them, so mismatches involving unseen products cannot be rejected;

Adding a new SKU requires extending the category set and retraining the classifier, so fixed categories cannot keep up with a growing catalog.

To overcome these limitations we switched to an image‑matching strategy: a strong feature extractor is trained, and the similarity between the query image and a gallery of SKU reference images is computed. This follows research in face recognition, person re‑identification, and image retrieval; both traditional features (SIFT, SURF, ORB) and deep‑learning features (CNNs) were considered.

Training stage: We jointly train a classification network with cross‑entropy loss and triplet loss. Backbone candidates included MobileNet, ResNet, ShuffleNet, and OSNet; ResNet gave the best accuracy and was selected.

Cross‑entropy is a standard classification loss, while triplet loss (common in face and person re‑identification) encourages embeddings of the same class to cluster together and pushes different‑class embeddings apart. Combining both accelerates convergence and improves precision.
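The combined objective can be sketched as follows. This is a minimal numpy illustration, not the production training code: `cross_entropy` is the standard softmax loss, and `triplet_loss` uses the common batch‑hard variant (hardest positive and hardest negative per anchor); the margin value and the embedding shapes are assumptions for illustration.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Softmax cross-entropy, averaged over the batch."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def triplet_loss(embeddings, labels, margin=0.3):
    """Batch-hard triplet loss: for each anchor, take the hardest
    positive (farthest same-class sample) and the hardest negative
    (closest different-class sample) in the batch."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1) + 1e-12)   # pairwise Euclidean
    same = labels[:, None] == labels[None, :]
    losses = []
    for i in range(len(labels)):
        pos = dist[i][same[i] & (np.arange(len(labels)) != i)]
        neg = dist[i][~same[i]]
        if len(pos) == 0 or len(neg) == 0:
            continue
        losses.append(max(0.0, pos.max() - neg.min() + margin))
    return float(np.mean(losses))

# Joint objective during training: L = L_ce + lambda * L_triplet
```

When embeddings of the same SKU already cluster well, the hinge in the triplet term is inactive and only the classification loss drives the update, which is why the combination tends to converge faster than either loss alone.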

Testing stage: The trained backbone extracts embeddings for query images; cosine similarity with gallery embeddings yields a ranked list. The top‑1 result provides the predicted SKU, and because each SKU has three gallery images, we apply K‑NN on the top‑5 list to determine the final SKU.
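The retrieval step described above can be sketched like this. It is a simplified illustration under stated assumptions: embeddings are already extracted, `k=5` mirrors the top‑5 K‑NN vote, and the similarity threshold value is hypothetical.

```python
import numpy as np

def l2_normalize(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def predict_sku(query_emb, gallery_embs, gallery_skus, k=5, threshold=0.7):
    """Rank gallery images by cosine similarity to the query, then
    majority-vote the SKU over the top-k results (K-NN on the ranked
    list). If the top-1 similarity falls below the threshold, the
    sample is flagged for manual review."""
    q = l2_normalize(query_emb)
    g = l2_normalize(gallery_embs)
    sims = g @ q                        # cosine similarity per gallery image
    order = np.argsort(-sims)[:k]       # indices of the top-k matches
    top_skus = [gallery_skus[i] for i in order]
    sku = max(set(top_skus), key=top_skus.count)   # majority vote
    needs_review = bool(sims[order[0]] < threshold)
    return sku, float(sims[order[0]]), needs_review
```

Because each SKU contributes three gallery images, a correct match usually dominates the top‑5 list, so the vote is robust to a single near‑duplicate from another SKU.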

Online deployment: The system outputs the similarity score of the top‑1 result; if the score falls below a predefined threshold, an alarm is raised for manual review, ensuring SKU accuracy.

2.2 Image Clarity

Blurred photos arise from motion or mis‑focus, leading to rejected listings. Human reviewers often disagree on borderline cases, causing inconsistency. To standardize, we define three blur levels—obvious blur, slight blur, and clear—assigning scores 2, 1, and 0 respectively. Multiple annotators score each image, discard contradictory extreme cases, normalize the remaining scores, and obtain a blur score.

| Image | Annotator 1 | Annotator 2 | Annotator 3 | Total (0–6) | Normalized Score |
| --- | --- | --- | --- | --- | --- |
| Image 1 | Obvious blur | Slight blur | Obvious blur | 5 | 5/6 ≈ 0.83 |
| Image 2 | Slight blur | Slight blur | Obvious blur | 4 | 4/6 ≈ 0.67 |
| Image 3 | Clear | Slight blur | Clear | 1 | 1/6 ≈ 0.17 |
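The aggregation behind the table can be reproduced with a few lines. This is a hypothetical reconstruction of the scoring step, not the platform’s actual tooling; the level names and two‑decimal rounding are assumptions.

```python
# Obvious blur = 2, slight blur = 1, clear = 0; three annotators give
# a total in [0, 6], which is normalized to a blur score in [0, 1].
LEVEL = {"clear": 0, "slight": 1, "obvious": 2}

def blur_score(annotations):
    """Normalize the summed annotator scores by the maximum possible
    total (2 points per annotator)."""
    total = sum(LEVEL[a] for a in annotations)
    return round(total / (2 * len(annotations)), 2)
```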

We train a convolutional neural network and replace the classification loss with a regression loss (MSE). The model outputs a continuous blur score, decoupling algorithm development from business standards; the business can adjust the blur threshold as needed.
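The decoupling works as sketched below: the model side only minimizes a regression objective, while the business side applies a tunable threshold to the score. Both functions and the example threshold value are illustrative assumptions.

```python
import numpy as np

def mse_loss(predicted, target):
    """Regression objective that replaces the classification loss:
    the network head outputs one continuous blur score per image."""
    return float(np.mean((np.asarray(predicted) - np.asarray(target)) ** 2))

def review_clarity(score, reject_above=0.5):
    """Business-side decision: the threshold can be tuned at any time
    without retraining, since the model only emits a score."""
    return "reject" if score > reject_above else "pass"
```

Tightening or loosening `reject_above` changes the business standard immediately, with no new training run.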

2.3 Anti‑Tamper Sticker, Dirtiness, and Centering

Detection models address these three items. Anti‑tamper stickers have simple visual features, making detection straightforward. Centering detection is also easy because the object occupies a large area. Dirt detection is harder due to small, rare dirty spots. We employ active learning: an initial model flags low‑confidence samples from unlabeled data, humans label them, and the model is retrained iteratively, eventually reaching near‑human performance.
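One round of the active‑learning loop for dirt detection can be sketched as follows. This is a minimal illustration, assuming a model that exposes a per‑sample confidence score; the budget and the confidence function are hypothetical.

```python
import numpy as np

def active_learning_round(model_confidence, unlabeled, budget=10):
    """One iteration of the loop: score the unlabeled pool, pick the
    lowest-confidence samples for human labeling, and return them so
    the model can be retrained on the enlarged labeled set."""
    conf = np.array([model_confidence(x) for x in unlabeled])
    pick = np.argsort(conf)[:budget]    # least confident first
    return [unlabeled[i] for i in pick]
```

Repeating this loop concentrates labeling effort on the rare, ambiguous dirty spots, which is why the detector can approach human performance with far fewer labels than exhaustive annotation would require.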

2.4 Algorithm Application Strategy

In computer‑vision tasks, precision and recall cannot both reach 100%. We adopt a high‑recall strategy and tolerate lower precision, because false positives are simply routed to manual review and do not harm the business.
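Choosing a high‑recall operating point can be sketched as below. This is an illustrative helper, not the platform’s actual tuning procedure; the label convention (1 = defective) and the recall target are assumptions.

```python
import numpy as np

def threshold_for_recall(scores, labels, target_recall=0.99):
    """Pick the highest score threshold that still catches at least
    target_recall of the true defects (label 1 = defective). Samples
    scoring at or above the threshold go to manual review; the rest
    pass automatically."""
    pos = np.sort(np.asarray(scores)[np.asarray(labels) == 1])
    # We may miss at most (1 - target_recall) of the positives.
    k = int(np.floor((1 - target_recall) * len(pos)))
    return pos[k]
```

Lowering the threshold trades precision for recall: more clean images are flagged unnecessarily, but almost no defective one slips through to an auto‑pass.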

With the algorithm in place, manual workload for listing review has been cut by 50 %. About half of the images pass automatically; the rest are flagged for human re‑inspection.

3. Summary

We introduced the background of product‑listing review, the challenges of manual inspection, and the benefits brought by algorithmic assistance.

We then detailed three algorithmic solutions—image matching for SKU consistency, regression for blur assessment, and detection for stickers, dirt, and centering—explaining the high‑recall deployment strategy and its successful impact on operational efficiency.

Zhuanzhuan R&D Center and industry partners share practical experience and cutting‑edge topics on the technical exchange platform. Follow the public accounts “Zhuanzhuan Technology” (general), “Big Zhuanzhuan FE” (frontend), and “Zhuanzhuan QA” (quality) for more hands‑on content.
Tags: computer vision, deep learning, automation, image recognition, product verification