
CSCNN: Category‑Specific Convolutional Neural Network for Visual CTR Prediction in JD E‑commerce Advertising

This article presents CSCNN, a category‑specific convolutional neural network that integrates visual priors into click‑through‑rate (CTR) models for JD.com’s e‑commerce advertising, detailing its motivation, architecture, engineering optimizations, offline and online training strategies, and empirical performance gains on both public and industrial datasets.


JD.com’s search advertising platform relies heavily on CTR models to rank ads; with the massive influx of visual content, leveraging image information has become a new trend. The talk introduces CSCNN, a next‑generation ad‑ranking model that incorporates visual cues into CTR prediction.

The presentation first outlines the background of JD’s 9NAI platform, the challenges of optimizing eCPM in e‑commerce, and the four‑fold feature space (query, user, product, context) used in CTR modeling. It then discusses the limitations of traditional CNNs for this domain, such as weak supervision, overfitting on sparse features, and engineering bottlenecks.
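As background for the eCPM objective mentioned above, e-commerce ad ranking typically scores each candidate by the advertiser's bid multiplied by the predicted CTR. A minimal sketch of this ranking rule (names and numbers are illustrative, not JD's production logic):

```python
def ecpm(bid_cpc: float, predicted_ctr: float) -> float:
    # Expected cost per mille: revenue per thousand impressions
    return bid_cpc * predicted_ctr * 1000

ads = [
    {"ad_id": "a", "bid": 0.50, "pctr": 0.020},
    {"ad_id": "b", "bid": 0.80, "pctr": 0.010},
    {"ad_id": "c", "bid": 0.30, "pctr": 0.040},
]

# Rank candidates by expected revenue; a better pCTR model directly
# improves this ordering, which is why CTR accuracy matters so much.
ranked = sorted(ads, key=lambda ad: ecpm(ad["bid"], ad["pctr"]), reverse=True)
```

Here a higher bid cannot compensate for a much lower predicted CTR, which is the core tension CTR models are optimized to resolve.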

To address these issues, the authors propose a multi‑modal feature pipeline: manual features, text features, user‑side interaction features, and image features. They highlight the need for visual priors—category information that can guide CNN learning—so that the network focuses on category‑relevant details and avoids irrelevant background noise.

CSCNN builds on a category‑specific attention mechanism inspired by SENet. At each convolutional layer, channel‑wise and spatial attention modules receive both the feature map and an embedding of the product's category, producing refined feature maps biased toward category‑relevant patterns. This design effectively turns the CNN into a category‑aware feature extractor.
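The channel‑wise half of this idea can be sketched as an SE‑style gate whose input is the pooled feature map concatenated with the category embedding. This is a simplified NumPy illustration under assumed shapes and a single random projection, not the paper's exact parameterization (which also includes an analogous spatial attention module over the H×W grid):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def category_channel_attention(feature_map, category_emb, w):
    """SE-style channel attention conditioned on a category embedding.

    feature_map:  (C, H, W) activations from one conv layer
    category_emb: (D,) learned embedding of the product category
    w:            (C + D, C) projection producing per-channel gates
    """
    squeeze = feature_map.mean(axis=(1, 2))            # global average pool -> (C,)
    gate_in = np.concatenate([squeeze, category_emb])  # inject the category prior
    gate = sigmoid(gate_in @ w)                        # (C,) weights in (0, 1)
    return feature_map * gate[:, None, None]           # reweight channels

rng = np.random.default_rng(0)
fmap = rng.standard_normal((8, 4, 4))      # toy feature map: 8 channels, 4x4
cat = rng.standard_normal(3)               # toy 3-d category embedding
w = rng.standard_normal((11, 8)) * 0.1
out = category_channel_attention(fmap, cat, w)
```

The same feature map passed with two different category embeddings yields two differently weighted outputs, which is precisely what makes the extractor category‑aware.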

Engineering solutions are described to mitigate training and serving costs: offline pre‑computation of image embeddings, aggregation of identical product requests, synchronized multi‑GPU updates, and a lookup‑table‑based online serving that reduces latency to sub‑20 ms on CPU.
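The serving-side trick is that the CNN never runs at request time: image embeddings are computed once offline, and online CTR inference reduces to a key lookup. A minimal sketch with hypothetical SKU keys and toy 3‑dimensional embeddings (production embeddings are far larger):

```python
# Offline: run the CNN once per ad image and persist the visual embedding.
offline_embeddings = {
    "sku_1001": [0.12, -0.45, 0.88],
    "sku_1002": [0.05, 0.31, -0.27],
}

# Fallback for items whose images have not been embedded yet.
DEFAULT_EMBEDDING = [0.0, 0.0, 0.0]

def fetch_visual_feature(sku_id: str):
    # Online: an O(1) table lookup replaces a CNN forward pass,
    # which is what keeps CPU serving latency in the low milliseconds.
    return offline_embeddings.get(sku_id, DEFAULT_EMBEDDING)
```

The looked-up vector is then fed into the CTR model alongside the other (query, user, product, context) features.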

Extensive experiments on a public Amazon dataset and JD’s massive industrial logs (hundreds of billions of samples, thousands of categories) demonstrate significant AUC improvements over baseline models, late‑fusion approaches, and other attention‑based methods. The CSCNN model has been deployed at scale, serving billions of users daily.

In conclusion, integrating category‑specific visual priors into CNNs substantially boosts CTR prediction performance in e‑commerce advertising, and the proposed system architecture enables practical, large‑scale deployment.

Tags: machine learning, deep learning, CTR prediction, e-commerce advertising, category-specific CNN, visual modeling
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
