
Deep Learning for Click‑Through Rate Prediction in 58.com Home‑Page Recommendation

This article details how 58.com leverages deep learning models such as DNN, Wide&Deep, DeepFM, DIN and DIEN, combined with extensive user‑behavior feature engineering, offline vectorization, and online TensorFlow‑Serving pipelines to improve home‑page recommendation click‑through rates and overall platform efficiency.

58 Tech

Background – 58.com, the largest Chinese classified information platform, faces information overload across categories like housing, jobs, and vehicles. Traditional search is passive; recommendation can actively push personalized content, making accurate, efficient deep‑learning‑based ranking crucial for user experience and platform performance.

58 Recommendation System Overview – The system consists of a data‑algorithm layer, business‑logic layer, and external‑interface layer. Candidate items are recalled using user interest and behavior history, then scored and ranked by deep learning models before being displayed in various tabs (home‑page "Guess You Like", category pages, detail pages, etc.).
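The recall-then-rank flow described above can be sketched in a few lines. This is a minimal illustration of the two-stage idea only; the function names (`recall_by_interest`, `rank`) and the toy CTR-based scorer are assumptions for the example, not 58.com's actual interfaces.

```python
# Minimal sketch of a two-stage pipeline: recall candidates that match the
# user's interests, then rank them with a scoring function (which in the
# real system would be a deep model). All names here are illustrative.

def recall_by_interest(user_interests, post_index, limit=100):
    """Return candidate posts whose category matches a user interest."""
    candidates = [p for p in post_index if p["category"] in user_interests]
    return candidates[:limit]

def rank(candidates, score_fn, top_k=10):
    """Score each candidate and return the top_k highest-scoring posts."""
    return sorted(candidates, key=score_fn, reverse=True)[:top_k]

if __name__ == "__main__":
    posts = [
        {"id": 1, "category": "housing", "ctr": 0.12},
        {"id": 2, "category": "jobs",    "ctr": 0.30},
        {"id": 3, "category": "housing", "ctr": 0.25},
    ]
    cands = recall_by_interest({"housing"}, posts)
    top = rank(cands, score_fn=lambda p: p["ctr"], top_k=2)
    print([p["id"] for p in top])  # [3, 1]
```

In production the scorer is the deep model served behind TensorFlow Serving; the toy `ctr` field stands in for its output.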

Industry Technical Path – The article reviews seminal models: YouTube‑DNN (2016), Google Wide&Deep, Huawei DeepFM, Alibaba DIN, and Alibaba DIEN, highlighting how each introduced embeddings, attention, and interest‑evolution mechanisms that inspire 58’s own solutions.

Model Design for 58 Home‑Page – The architecture follows an Embedding & MLP pattern. Three ID features (post ID, category ID, region ID) are embedded, combined with attention to weigh historical interests, and processed by an AUGRU layer to capture dynamic interest evolution. Additional dense and categorical features are concatenated before feeding into the deep network.
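The AUGRU layer (attentional update-gate GRU, introduced in DIEN) can be sketched as a standard GRU cell whose update gate is scaled by the per-step attention weight, so behaviors with low attention barely move the hidden "interest" state. The NumPy cell below is a minimal sketch of that mechanism under assumed parameter shapes, not 58.com's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def augru_step(h, x, att, W, U, b):
    """One AUGRU step: a GRU cell whose update gate is multiplied by the
    attention weight `att` of the current behavior, so low-attention
    behaviors leave the interest state almost unchanged.
    W, U, b hold the three gates' parameters (update, reset, candidate)."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])              # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])              # reset gate
    h_tilde = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])  # candidate state
    z = att * z                                          # attentional scaling
    return (1 - z) * h + z * h_tilde
```

A useful sanity check on the design: with `att = 0` the step returns `h` unchanged, which is exactly why attention lets the model ignore behaviors irrelevant to the candidate post.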

Feature Engineering – Basic features include post attributes (category, region, historical clicks/CTR, Word2Vec vectors) and user attributes (region, interest embeddings). User‑post interaction features (region‑region match, interest‑category match, etc.) are also constructed. Offline vectorization uses Word2Vec trained on recent 7‑day click logs, with careful negative‑sampling strategies to generate balanced training data.
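The negative-sampling idea behind the Word2Vec training step can be illustrated with a small pair generator: true (center, context) co-clicks within a window get label 1, and randomly drawn posts get label 0. This is a sketch of the skip-gram sampling scheme only; the function name and parameters are illustrative, not 58.com's pipeline.

```python
import random

def skipgram_pairs(click_seq, window=2, num_neg=2, vocab=None, seed=0):
    """Generate (center, context, label) training pairs from one user's
    click sequence: label 1 for true co-clicks inside the window, label 0
    for randomly sampled negatives, balancing the training data."""
    rng = random.Random(seed)
    vocab = vocab or sorted(set(click_seq))
    pairs = []
    for i, center in enumerate(click_seq):
        lo, hi = max(0, i - window), min(len(click_seq), i + window + 1)
        for j in range(lo, hi):
            if j == i:
                continue
            pairs.append((center, click_seq[j], 1))       # positive pair
            for _ in range(num_neg):                      # sampled negatives
                pairs.append((center, rng.choice(vocab), 0))
    return pairs
```

In practice a library such as gensim handles this internally when training on the 7-day click logs; the sketch just makes the positive/negative ratio explicit.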

Offline Sample Construction & Training – Billions of samples are stored in HDFS via MapReduce. TensorFlow Dataset APIs read the data for single‑machine or distributed training of DNN, DIN, and DIEN models. Training monitors accuracy and loss via TensorBoard; batch‑normalization parameters are correctly exported for serving.
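Conceptually, what the Dataset API contributes here is streaming billions of HDFS records through a bounded shuffle buffer into fixed-size batches, so the full sample set never has to fit in memory. The pure-Python generator below is a sketch of that `shuffle(buffer).batch(n)` behavior under assumed parameters, not the actual TensorFlow input pipeline.

```python
import random

def shuffled_batches(records, batch_size, buffer_size, seed=0):
    """Pure-Python sketch of a shuffle-buffer-plus-batching input pipeline:
    keep at most `buffer_size` records in memory, emit them in randomized
    order, and group the stream into `batch_size` chunks."""
    rng = random.Random(seed)
    buf, batch = [], []
    for rec in records:
        buf.append(rec)
        if len(buf) >= buffer_size:          # buffer full: emit one at random
            batch.append(buf.pop(rng.randrange(len(buf))))
            if len(batch) == batch_size:
                yield batch
                batch = []
    rng.shuffle(buf)                          # drain the remaining buffer
    for rec in buf:
        batch.append(rec)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:                                 # final partial batch
        yield batch
```

Every record is emitted exactly once, but only `buffer_size` records are held at a time, which is the property that makes billion-sample training feasible.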

Online Ranking Service – Real‑time user behavior is streamed from Kafka to Redis clusters. A batch‑processing strategy splits each ranking request (≈120 posts) into multiple batches to reduce TensorFlow‑Serving QPS. Latency tests show batch‑size 20 yields <22 ms per request, meeting production constraints.
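The batch-splitting strategy is straightforward to sketch: chunk the ~120 candidate posts into batches of 20, score each batch with a serving call, and merge the scores before sorting. Here `score_batch` is a placeholder for the TensorFlow-Serving RPC, an assumption for the example.

```python
def split_into_batches(posts, batch_size=20):
    """Split one ranking request (~120 candidate posts) into fixed-size
    batches so each TensorFlow-Serving call stays small and fast."""
    return [posts[i:i + batch_size] for i in range(0, len(posts), batch_size)]

def rank_request(posts, score_batch, batch_size=20):
    """Score candidates batch by batch (score_batch stands in for the
    serving RPC) and return posts sorted by predicted score, descending."""
    scores = []
    for batch in split_into_batches(posts, batch_size):
        scores.extend(score_batch(batch))     # one serving call per batch
    order = sorted(range(len(posts)), key=lambda i: scores[i], reverse=True)
    return [posts[i] for i in order]
```

With 120 candidates and batch size 20 this issues six small serving calls per request, which is the trade-off the latency tests above evaluate.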

Experimental Results – Offline AUC: GBDT 0.634, DIN 0.643, DIEN 0.651. Online A/B tests reveal that DIN improves click‑through rate by 13.06 % and exposure‑to‑conversion by 16.16 %; DIEN further raises conversion by 17.32 %, surpassing the best GBDT baseline. Additional studies on real‑time feature latency, Word2Vec usage, and attention mechanisms confirm the importance of timely, dynamic user‑interest modeling.

Conclusion & Outlook – By integrating deep learning with rich user‑behavior data, 58.com achieved significant gains in recommendation relevance and platform efficiency. Future work includes distributed offline training, more advanced attention structures, graph embeddings, and exploration of reinforcement or transfer learning for further performance improvements.

Tags: Deep Learning · CTR prediction · recommendation system · A/B testing · embedding · attention mechanism · online ranking
Written by 58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.