
Customer Churn Prediction Using Machine Learning: Feature Engineering, XGBoost, and Model Fusion

This article explains how to predict customer churn by analyzing competition data, extracting and engineering features, applying GBDT‑based XGBoost models, tuning hyper‑parameters, and improving performance through bagging and stacking model fusion techniques.

Ctrip Technology

The article introduces customer churn rate as a key business metric and describes how historical data can be modeled with machine‑learning methods to predict churn probability and identify influencing factors.

It first outlines the competition task: a binary classification problem where label 1 indicates churn and 0 indicates retention. The evaluation requires at least 97% accuracy while maximizing recall, reflecting the business priority of missing as few at-risk customers as possible.

Data analysis reveals three groups of features besides id and label: order‑related attributes (e.g., booking and check‑in dates), user‑related attributes, and hotel‑related attributes (e.g., review count, star rating). Because user IDs are unavailable, a heuristic groups records with identical user‑related fields as belonging to the same user.
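The grouping heuristic can be sketched with a pandas groupby over the user-related columns. The column names here are hypothetical stand-ins, since the article does not list the actual field names:

```python
import pandas as pd

# Hypothetical user-related columns; the real competition data uses its own names.
USER_COLS = ["user_rank", "avg_price", "order_count"]

df = pd.DataFrame({
    "user_rank":   [1, 1, 2],
    "avg_price":   [300.0, 300.0, 150.0],
    "order_count": [5, 5, 2],
})

# Records whose user-related fields are all identical are treated as one user;
# ngroup() assigns each such group a synthetic user id in order of appearance.
df["user_id"] = df.groupby(USER_COLS, sort=False).ngroup()
```

Rows 0 and 1 share identical user fields and receive the same synthetic id, while row 2 becomes a separate user.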

The dataset is split into a 2/3 training set and a 1/3 local test set without time‑based ordering. The workflow includes feature extraction, resampling, feature selection, and model ensemble.
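A split without time-based ordering amounts to a random 2/3–1/3 partition of record indices, for example (sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the split is reproducible
n = 9000                          # hypothetical number of records

# Shuffle all indices, then cut at the 2/3 mark.
idx = rng.permutation(n)
cut = int(n * 2 / 3)
train_idx, test_idx = idx[:cut], idx[cut:]
```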

Feature Engineering starts with missing‑value imputation (zero‑fill, with additional indicator columns for critical features). Categorical variables are one‑hot encoded, and polynomial transformations are applied where useful. Group‑level statistics (max/min per user) and clustering labels for users and hotels are added as new features.
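The imputation-with-indicator, one-hot encoding, and per-user statistics steps can be sketched in pandas. Column names (`star`, `room_type`, `user_id`) are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "user_id":   [0, 0, 1],
    "star":      [4, None, 3],        # hotel star rating with a missing value
    "room_type": ["a", "b", "a"],
})

# Indicator column records where the critical feature was missing, then zero-fill.
df["star_missing"] = df["star"].isna().astype(int)
df["star"] = df["star"].fillna(0)

# One-hot encode the categorical field.
df = pd.get_dummies(df, columns=["room_type"])

# Group-level statistics (max/min of a feature per heuristic user).
df["star_max"] = df.groupby("user_id")["star"].transform("max")
df["star_min"] = df.groupby("user_id")["star"].transform("min")
```

The indicator column lets the model distinguish "truly zero" from "unknown", which a plain zero-fill would conflate.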

Derived features such as the difference between visit date and actual check‑in date, and weekend indicators, are also created because they often have strong predictive power.
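Both derived features are one-liners once the date columns are parsed; the column names here are assumed for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "visit_date":   ["2016-05-13", "2016-05-14"],
    "checkin_date": ["2016-05-15", "2016-05-14"],
})
df["visit_date"] = pd.to_datetime(df["visit_date"])
df["checkin_date"] = pd.to_datetime(df["checkin_date"])

# Gap in days between browsing the hotel and actually checking in.
df["days_ahead"] = (df["checkin_date"] - df["visit_date"]).dt.days

# Weekend indicator: dayofweek is 5 for Saturday, 6 for Sunday.
df["checkin_weekend"] = (df["checkin_date"].dt.dayofweek >= 5).astype(int)
```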

Model Principles and Hyper‑parameter Tuning focus on Gradient Boosted Decision Trees (GBDT) and its optimized implementation XGBoost. The article explains GBDT’s residual‑learning mechanism and how XGBoost adds regularization, second‑order Taylor expansion, column sampling, and sorted split finding to improve speed and prevent over‑fitting.
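The regularization and second-order Taylor expansion mentioned above correspond to XGBoost's standard additive objective. At boosting round $t$, with prediction $\hat{y}_i^{(t-1)}$ from the previous rounds and a new tree $f_t$:

```latex
\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\!\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)
\;\approx\; \sum_{i=1}^{n} \left[ g_i\, f_t(x_i) + \tfrac{1}{2} h_i\, f_t^2(x_i) \right] + \Omega(f_t),
```

where $g_i$ and $h_i$ are the first and second derivatives of the loss with respect to $\hat{y}_i^{(t-1)}$, and the regularizer $\Omega(f) = \gamma T + \tfrac{1}{2}\lambda \sum_{j=1}^{T} w_j^2$ penalizes the number of leaves $T$ and the leaf weights $w_j$. Solving the quadratic per leaf gives the optimal weight $w_j^{*} = -\,G_j / (H_j + \lambda)$, with $G_j$, $H_j$ the sums of $g_i$, $h_i$ over instances in leaf $j$; this is what distinguishes XGBoost from plain residual-fitting GBDT.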

Key hyper‑parameters tuned via GridSearch include maximum tree depth, learning rate, and number of trees. The tuning is performed on the local validation set, and the best configuration is later used for the final ensemble.
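Grid search over these three hyper-parameters is an exhaustive sweep of their Cartesian product. The sketch below uses a toy scoring function in place of actually training XGBoost on the validation set, so it runs standalone; the grid values and the `evaluate` stand-in are assumptions, not the article's actual settings:

```python
from itertools import product

# Hypothetical search grid over the three tuned hyper-parameters.
grid = {
    "max_depth":     [4, 6, 8],
    "learning_rate": [0.05, 0.1, 0.2],
    "n_estimators":  [100, 300, 500],
}

def evaluate(params):
    """Stand-in for training an XGBoost model with `params` and scoring it
    on the local validation set. This toy version just prefers depth 6 and
    learning rate 0.1 so the sketch runs end to end."""
    return -abs(params["max_depth"] - 6) - abs(params["learning_rate"] - 0.1)

keys = list(grid)
best_params, best_score = None, float("-inf")
for combo in product(*grid.values()):
    params = dict(zip(keys, combo))
    score = evaluate(params)
    if score > best_score:
        best_params, best_score = params, score
```

With real data, `evaluate` would fit the model and return the validation metric; the loop structure is the same.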

Model Fusion uses bagging: five bootstrapped training subsets (preserving the original class ratio) are each used to train an XGBoost model with the tuned parameters. Predictions from the five models are averaged to obtain the final score. Stacking is mentioned as an alternative, but bagging is the method applied in the competition.
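The class-ratio-preserving bootstrap amounts to sampling with replacement within each class separately. The sketch below uses placeholder constant predictions instead of trained models, so only the resampling and averaging logic is real:

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.array([0] * 80 + [1] * 20)  # toy labels with a 4:1 class ratio

def stratified_bootstrap(y, rng):
    """Sample indices with replacement within each class, so the
    bootstrapped subset keeps the original class ratio exactly."""
    idx = []
    for cls in np.unique(y):
        cls_idx = np.flatnonzero(y == cls)
        idx.append(rng.choice(cls_idx, size=len(cls_idx), replace=True))
    return np.concatenate(idx)

# Five bootstrapped subsets -> five models. Here each "model" is a
# placeholder emitting a constant score (no real XGBoost training).
subsets = [stratified_bootstrap(y, rng) for _ in range(5)]
fold_preds = [np.full(len(y), 0.2 * (k + 1)) for k in range(5)]

# Final churn score is the plain average of the five models' predictions.
final_score = np.mean(fold_preds, axis=0)
```

In the real pipeline, each subset would train an XGBoost model with the tuned parameters, and `fold_preds` would hold their predicted churn probabilities on the test set.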

The article concludes with practical advice: thorough data analysis, careful feature engineering, modest hyper‑parameter tuning after features are fixed, and using model fusion as a final “kill‑shot” rather than an early step.

Tags: Big Data, Machine Learning, Feature Engineering, Model Fusion, XGBoost, Customer Churn
Written by

Ctrip Technology

Official Ctrip Technology account, sharing and discussing growth.
