
Cupid Push Control System: Machine‑Learning‑Driven Notification Optimization at 58.com

The article details how 58.com’s Cupid push control system leverages machine‑learning models, especially XGBoost‑based CTR prediction, to prioritize and filter billions of daily push notifications, improving click‑through rates, reducing user annoyance, and providing a scalable, data‑driven architecture for diverse business services.

58 Tech

Background

Push notifications are a crucial channel for reminding or awakening users, playing a key role in app operations. While effective push can help product operators achieve goals efficiently, misuse can annoy users and cause churn.

With the development of the 58.com app, the push mechanism evolved from fixed‑time, mass‑broadcast messages (e.g., daily 10 AM real‑estate push) to personalized, machine‑learning‑driven recommendations.

In 2015, a "Guess‑You‑Like" push system based on big data and machine learning was introduced, with push clicks contributing up to 20% of daily active users. By 2016, multiple product lines had added their own personalized pushes, and they began competing for the same channel resources.

In 2017, the "Cupid" push control system was developed, using algorithmic models to allocate push resources and improve effectiveness.

Algorithm Model

Cupid's goal is to raise the push click‑through rate (CTR). Predicting whether a user will click a given push is a typical binary classification problem, so XGBoost is used as the CTR prediction model. Once suitable evaluation metrics were selected, the features were continuously enriched and the model optimized against them.

Sample and Feature Processing

Backend logs record push events and user click behavior. By joining these logs, push click status is obtained and combined with various features to form the training dataset, consisting of positive samples (push + click) and negative samples (push + no click). The data is converted to LIBSVM format for XGBoost.
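The join and conversion described above can be sketched as follows. This is a minimal illustration, not the production pipeline: the record fields `push_id` and `features`, and the integer feature indices, are assumptions made for the example, since the article does not describe the log schema.

```python
def build_samples(push_log, click_log):
    """Label each push record 1 if its push_id appears in the click log, else 0."""
    clicked_ids = {rec["push_id"] for rec in click_log}
    samples = []
    for rec in push_log:
        label = 1 if rec["push_id"] in clicked_ids else 0
        samples.append((label, rec["features"]))
    return samples


def to_libsvm_line(label, features):
    """Format one sample as '<label> <index>:<value> ...' with ascending indices."""
    parts = [str(label)]
    for index in sorted(features):
        value = features[index]
        if value != 0:  # LIBSVM is a sparse format, so zero entries are omitted
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


# Hypothetical log records: p1 was pushed and clicked, p2 was pushed only.
push_log = [
    {"push_id": "p1", "features": {3: 1, 1: 0.5}},
    {"push_id": "p2", "features": {2: 1, 7: 0}},
]
click_log = [{"push_id": "p1"}]

lines = [
    to_libsvm_line(label, feats)
    for label, feats in build_samples(push_log, click_log)
]
```

Each resulting line holds the label followed by sparse `index:value` pairs, which is the text layout XGBoost accepts as LIBSVM input.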

Key feature groups include:

User profile features: device type, car ownership, age, gender, etc.

Push message features: business ID, copy length, presence of numbers, etc.

Context features: push time, phone type, user activity on the previous day, etc.

Statistical features: historical push count, historical click‑through rate, etc.

Low‑frequency features are filtered out to accelerate training.
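One simple way to filter low‑frequency features is to count how many samples each feature appears in and drop those below a cutoff. The sketch below assumes samples are `(label, {feature_index: value})` pairs and uses a hypothetical `min_count` parameter; the article does not state the actual threshold or method.

```python
from collections import Counter


def filter_low_frequency(samples, min_count):
    """Drop feature indices that appear in fewer than min_count samples."""
    counts = Counter(index for _, features in samples for index in features)
    kept = {index for index, n in counts.items() if n >= min_count}
    return [
        (label, {i: v for i, v in features.items() if i in kept})
        for label, features in samples
    ]


# Feature 3 appears in only one sample, so it is removed when min_count=2.
samples = [
    (1, {1: 1.0, 2: 1.0}),
    (0, {1: 1.0, 3: 1.0}),
    (0, {1: 1.0, 2: 1.0}),
]
filtered = filter_low_frequency(samples, min_count=2)
```

Removing rare features shrinks the feature space the trainer has to scan, which is where the training speed‑up comes from.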

Model Selection

Cupid sets XGBoost’s objective to binary:logistic, so the model outputs the probability that a push is clicked (the positive class). These scores are used both to filter out low‑quality users within each business line and to rank pushes across business lines.
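To make the two uses of the score concrete, here is a hedged sketch that takes already‑predicted probabilities and applies a per‑business cutoff followed by a per‑user ranking. The business names, the threshold values, and the function name `filter_and_rank` are all hypothetical; the article does not say how the cutoffs are chosen.

```python
def filter_and_rank(candidates, thresholds, default_threshold=0.0):
    """Filter (user_id, business, score) candidates, then rank them per user."""
    per_user = {}
    for user_id, business, score in candidates:
        if score < thresholds.get(business, default_threshold):
            continue  # predicted CTR too low for this business line: drop
        per_user.setdefault(user_id, []).append((business, score))
    # Rank each user's surviving pushes across business lines, highest first.
    return {
        user_id: sorted(items, key=lambda item: item[1], reverse=True)
        for user_id, items in per_user.items()
    }


# Scores stand in for model output; thresholds are illustrative only.
candidates = [
    ("u1", "housing", 0.12),
    ("u1", "jobs", 0.30),
    ("u2", "housing", 0.02),
    ("u2", "cars", 0.08),
]
thresholds = {"housing": 0.05, "jobs": 0.10, "cars": 0.05}
ranked = filter_and_rank(candidates, thresholds)
```

In this example user u1 keeps both pushes with the jobs push ranked first, while u2's housing push falls below its cutoff and only the cars push remains.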

Evaluation Metrics

The primary metric is batch recall rate, defined as the proportion of retained users who click among all users who click in a batch. An overall recall rate is also defined to evaluate the system globally, similar to AUC but more intuitive for the online scenario.
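The batch recall definition can be computed offline as in the sketch below. It assumes that "retained" means keeping the top fraction of users in the batch by predicted score; the `keep_ratio` parameter is an assumption, since the article does not say whether retention is decided by a ratio or by a score threshold.

```python
def batch_recall(scores, clicks, keep_ratio):
    """Share of a batch's clickers that survive when the top keep_ratio is kept."""
    total_clicks = sum(clicks)
    if total_clicks == 0:
        return None  # recall is undefined for a batch with no clicks
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep_n = int(len(scores) * keep_ratio)
    retained_clicks = sum(clicks[i] for i in order[:keep_n])
    return retained_clicks / total_clicks


# Ten users in one batch: predicted scores and whether each user clicked.
scores = [0.9, 0.1, 0.6, 0.3, 0.8, 0.2, 0.7, 0.05, 0.4, 0.5]
clicks = [1, 0, 1, 0, 0, 0, 1, 0, 1, 0]
recall = batch_recall(scores, clicks, keep_ratio=0.5)  # 3 of 4 clickers kept
```

Here half of the pushes are dropped while three of the four clicking users are still reached, which is the trade‑off the metric is meant to expose.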

Online metrics focus on push click‑through rate and push daily‑active‑user (DAU) share.

Model Training and Online Prediction

Every day, the past seven days of data are used to train a new model offline, which is then deployed online. Each new model release undergoes A/B testing against the existing model.
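The rolling window can be expressed as a small date helper. This sketch assumes the window covers the seven full days before the training run and excludes the run day itself; the article does not specify the exact boundaries.

```python
from datetime import date, timedelta


def training_window(run_date, days=7):
    """Return the `days` calendar dates immediately before run_date, oldest first."""
    return [run_date - timedelta(days=offset) for offset in range(days, 0, -1)]


# A run on 2017-06-08 would train on data from 2017-06-01 through 2017-06-07.
window = training_window(date(2017, 6, 8))
```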

Engineering Architecture

Cupid processes billions of push messages daily, requiring a stable, reliable, and scalable architecture covering push flow, system design, A/B testing, data monitoring, and data construction.

Push Flow

All business pushes are collected via the WMB message bus into a message pool, where they are scored, filtered, and ranked before being delivered to users according to configured strategies (real‑time and scheduled pushes).

System Architecture

The system follows a three‑layer design: access layer, logic layer, and data layer.

Access layer: proxy services receive raw push messages via WMB and a configuration platform manages push policies.

Logic layer: scheduling service executes dispatch strategies, prediction service scores, sorts, and filters messages, and a monitoring center oversees service health.

Data layer: stores push messages, user profiles, click logs, models, and configuration data.

A/B Testing

A/B testing is built into the system so that new business lines can be rolled out gradually without affecting existing pushes, and so that multiple prediction models can be tested online at the same time.
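A common way to split traffic for this kind of experiment is deterministic hashing of the user ID, sketched below. This is an assumption about how bucketing could work, not a description of Cupid's implementation; the function name, the experiment key, and the percentage weights are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to a variant given (name, percent) pairs."""
    digest = hashlib.md5(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in [0, 100)
    upper = 0
    for name, weight in variants:
        upper += weight
        if bucket < upper:
            return name
    return variants[-1][0]  # fallback if the weights sum to less than 100


# Hypothetical split: 90% of users stay on the current model, 10% get the new one.
variants = [("model_current", 90), ("model_new", 10)]
choice = assign_variant("user-42", "ctr-model-test", variants)
```

Because the bucket depends only on the experiment key and the user ID, a user sees the same variant on every push, and including the experiment key in the hash lets several experiments run at once with independent splits.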

Data Monitoring

A three‑dimensional monitoring system (real‑time and offline) tracks service status, message throughput, third‑party interfaces, and other key indicators.

Data Construction

To close the data loop from push to click to conversion, a unified serial number is embedded in all events, enabling multidimensional analysis via Kylin across time, client, business, and algorithm dimensions.
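The role of the unified serial number can be illustrated with a small in‑memory funnel count. This is only a stand‑in for the Kylin analysis: the event fields `serial`, `stage`, and `business` are hypothetical names chosen for the example.

```python
from collections import defaultdict


def funnel_by_dimension(events, dimension):
    """Count distinct serial numbers per stage, grouped by a push-event dimension."""
    # The dimension (e.g. business) is recorded on the push event; later
    # click and conversion events are attributed to it through the serial.
    dim_of = {e["serial"]: e[dimension] for e in events if e["stage"] == "push"}
    stages = defaultdict(lambda: defaultdict(set))
    for e in events:
        key = dim_of.get(e["serial"])
        if key is not None:
            stages[key][e["stage"]].add(e["serial"])
    return {
        key: {stage: len(serials) for stage, serials in by_stage.items()}
        for key, by_stage in stages.items()
    }


events = [
    {"serial": "s1", "stage": "push", "business": "housing"},
    {"serial": "s1", "stage": "click"},
    {"serial": "s1", "stage": "conversion"},
    {"serial": "s2", "stage": "push", "business": "housing"},
    {"serial": "s3", "stage": "push", "business": "jobs"},
    {"serial": "s3", "stage": "click"},
]
funnel = funnel_by_dimension(events, "business")
```

Since every downstream event carries the same serial number as the push that caused it, the same grouping works for any dimension stored on the push event, such as time, client, or algorithm.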

Summary

The Cupid push control system redesigns the original push workflow and strategy, applying machine‑learning models to filter low‑quality users and select the best push for each user, thereby reducing disturbance, enhancing user experience, and significantly increasing the effectiveness of each push message.

Since full rollout, the push click‑through rate has risen by 200%, the push click‑through DAU share has risen by 15.7%, and negative feedback has dropped by 85%, boosting overall user activity and retention for the 58.com app.

Written by

58 Tech

Official tech channel of 58, a platform for tech innovation, sharing, and communication.
