Multi‑Business Recommendation System Architecture and Optimization at 58.com
This article explains how 58.com designs a three‑layer recommendation system for its homepage, tackles challenges of multi‑business fusion, interest modeling, traffic allocation, and dynamic refresh, and presents a step‑by‑step optimization pipeline that improves CTR and diversity.
58.com operates a classified‑information platform covering housing, recruitment, second‑hand goods, and more, which makes multi‑business recommendation a major challenge.
Overall Architecture : The system is divided into three layers – an external interface layer for input/output and display, a business‑logic layer containing modules such as interest service, recall, and ranking, and a data‑algorithm layer handling data storage, recall sources, ranking models, and caching.
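To make the layering concrete, here is a minimal Python sketch of how the three layers could call each other. It is an illustration only: the class names, the in‑memory data, and the scoring rule (base score times interest weight) are all assumptions, not 58.com's actual code.

```python
from dataclasses import dataclass


@dataclass
class Item:
    item_id: str
    business: str      # e.g. "recruitment", "housing"
    base_score: float  # stand-in for a ranking-model output


class DataAlgorithmLayer:
    """Storage, recall sources and ranking model, all stubbed in memory."""

    def __init__(self, items, interests):
        self._items = items
        self._interests = interests  # {user_id: {business: weight}}

    def load_interests(self, user_id):
        return self._interests.get(user_id, {})

    def recall_source(self, business):
        return [it for it in self._items if it.business == business]

    def score(self, item, interest_weight):
        # Hypothetical ranking rule: model score scaled by interest weight.
        return item.base_score * interest_weight


class BusinessLogicLayer:
    """Interest service -> recall -> ranking."""

    def __init__(self, data):
        self._data = data

    def recommend(self, user_id, limit):
        interests = self._data.load_interests(user_id)
        candidates = [
            (item, weight)
            for business, weight in interests.items()
            for item in self._data.recall_source(business)
        ]
        ranked = sorted(
            candidates,
            key=lambda pair: self._data.score(pair[0], pair[1]),
            reverse=True,
        )
        return [item for item, _ in ranked[:limit]]


def handle_homepage_request(logic, user_id, limit=10):
    """External interface layer: takes the request, shapes the response."""
    return [
        {"id": item.item_id, "business": item.business}
        for item in logic.recommend(user_id, limit)
    ]
```

Keeping the interface layer ignorant of recall and ranking details means either lower layer can be replaced without changing the response format.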
Key Challenges : (1) Accurately capturing user demand across strong, short‑term interests; (2) Balancing traffic distribution among high‑volume business categories; (3) Enhancing result diversity while maintaining relevance.
Interest Strategy (step‑by‑step pipeline) :
Data extraction from click and conversion logs.
Data cleaning to fix duplicate or missing fields.
Interest classification into historical and real‑time interests.
Interest calculation for each category.
Time‑based interest decay (e.g., 7‑day decay).
Interest merging across time and category dimensions.
Noise removal for low‑weight or outdated interests.
Interest expansion (related and generic extensions).
Interest sorting to produce a final priority list.
This pipeline produces a comprehensive user interest profile that feeds downstream recall, filtering, and ranking.
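The core of the pipeline (per‑category calculation, time decay, merging, noise removal, and sorting) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the article only mentions a "7‑day decay", so the exponential half‑life form, the `MIN_WEIGHT` threshold, and the `(category, action_weight, day)` event format are all hypothetical.

```python
from collections import defaultdict

HALF_LIFE_DAYS = 7.0  # assumed reading of the "7-day decay" in the text
MIN_WEIGHT = 0.05     # assumed noise threshold on the normalized share


def build_interest_profile(events, now_day):
    """Build a sorted interest list from behavior events.

    events: iterable of (category, action_weight, day) tuples, where
    action_weight could be larger for conversions than for clicks.
    Returns [(category, share), ...] sorted by share, highest first.
    """
    scores = defaultdict(float)
    for category, action_weight, day in events:
        age_days = max(0.0, now_day - day)
        # Interest calculation + time decay: an event loses half its
        # weight every HALF_LIFE_DAYS.
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
        # Interest merging: old and recent events add into one score.
        scores[category] += action_weight * decay

    total = sum(scores.values())
    if total == 0:
        return []

    # Noise removal: drop categories whose share of total interest is tiny.
    # Shares are not renormalized after filtering.
    profile = [
        (category, score / total)
        for category, score in scores.items()
        if score / total >= MIN_WEIGHT
    ]
    # Interest sorting: strongest interest first.
    profile.sort(key=lambda pair: pair[1], reverse=True)
    return profile
```

For example, a recruitment click today outweighs a housing click from a week ago by a factor of two, and a category with a negligible share is dropped before sorting.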
Business Traffic Allocation : Instead of using raw ranking scores or a fixed global ratio, 58.com allocates slots according to each user’s interest weight per business (e.g., 60% recruitment, 20% housing). When a category’s candidate pool is insufficient, other categories fill the gap, achieving both personalization and overall traffic balance.
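As a rough illustration of interest‑proportional allocation with backfill, consider the Python sketch below. The function name, the integer‑quota rounding, and the rule of handing spare slots out round‑robin in order of interest are assumptions made for the example, not a description of 58.com's production logic.

```python
def allocate_slots(interest_weights, pool_sizes, total_slots):
    """Split total_slots across businesses by user interest weight.

    interest_weights: {business: weight} for one user (any positive scale).
    pool_sizes: {business: number of available candidates}.
    Returns {business: slot_count}. Slots a business cannot fill are
    handed to businesses that still have spare candidates.
    """
    total_weight = sum(interest_weights.values())
    if total_weight <= 0 or total_slots <= 0:
        return {business: 0 for business in interest_weights}

    # First pass: proportional quota, rounded down and capped by the pool.
    alloc = {}
    for business, weight in interest_weights.items():
        quota = int(total_slots * weight / total_weight)
        alloc[business] = min(quota, pool_sizes.get(business, 0))

    # Second pass: hand out leftover slots one at a time, visiting
    # businesses in descending order of interest.
    remaining = total_slots - sum(alloc.values())
    order = sorted(interest_weights, key=interest_weights.get, reverse=True)
    while remaining > 0:
        progressed = False
        for business in order:
            if remaining == 0:
                break
            if alloc[business] < pool_sizes.get(business, 0):
                alloc[business] += 1
                remaining -= 1
                progressed = True
        if not progressed:  # every pool is exhausted
            break
    return alloc
```

With weights of 60/20/20 and only three recruitment candidates available, the three unfilled recruitment slots are redistributed to the other businesses rather than left empty.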
Dynamic Refresh Mechanism :
De‑weight frequently exposed items to promote freshness.
Apply time decay to the exposure penalty, so that recently exposed items are demoted more strongly than items exposed long ago.
Incorporate server‑side exposure logs to further diversify the feed.
These steps raise click‑through rate by up to 7% and improve diversity.
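One way to combine exposure de‑weighting with time decay is sketched below in Python. The penalty formula, the 24‑hour half‑life, and the `strength` parameter are illustrative assumptions; the article does not give the actual formula or values.

```python
def exposure_penalty(exposure_ages_hours, half_life_hours=24.0, strength=0.3):
    """Return a multiplier in (0, 1] for an item's ranking score.

    exposure_ages_hours: how long ago (in hours) each exposure happened.
    Each exposure adds "pressure" that halves every half_life_hours, so
    recent exposures demote an item more than old ones.
    """
    pressure = sum(0.5 ** (age / half_life_hours) for age in exposure_ages_hours)
    return 1.0 / (1.0 + strength * pressure)


def rerank(candidates, exposures, now_hour):
    """Re-rank candidates after applying the exposure penalty.

    candidates: list of (item_id, score).
    exposures: {item_id: [hour_of_exposure, ...]}, e.g. built from
    server-side exposure logs.
    """
    adjusted = []
    for item_id, score in candidates:
        ages = [max(0.0, now_hour - hour) for hour in exposures.get(item_id, [])]
        adjusted.append((item_id, score * exposure_penalty(ages)))
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)
```

An item that was shown twice in the last hour falls behind a slightly lower‑scored item that the user has not yet seen, and it recovers its position as the exposures age.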
Lessons Learned : Understanding the business context is crucial; combining multiple optimization strategies requires careful trade‑offs; and algorithmic improvements and strategy rules should work closely together while staying loosely coupled, so that each can be iterated on quickly.
DataFunTalk
Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.