
Causal Machine Learning for User Growth: Concepts, Methods, and Applications

This article explores how combining causal inference with machine learning helps separate causal effects from subtle correlations in large datasets and improve user growth metrics such as retention and activity. It presents practical methods such as propensity score matching, uplift modeling, HTE analysis, and meta‑learners, with applications to recommendation systems.

DataFunTalk

In this talk, guest speaker Li Ao (PhD, Kuaishou) discusses the integration of causal reasoning and machine learning to address subtle correlations in massive data and evaluate predictive accuracy for user growth.

1. Basic Concepts – User growth metrics focus on daily active users (DAU), retention, and activity, along with market-driven indicators such as payment and viral spread.

2. Causal Analysis Methods – Propensity Score Matching (PSM) is introduced to answer "why" questions, while causal machine learning and attribution techniques (Uplift, Meta‑learner, Causal Recommendation) address the "how".

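3. PSM Procedure – (1) Build a treatment model (e.g., logistic regression (LR) or XGBoost) to estimate each user's propensity score, i.e., the probability of receiving the treatment given the user's covariates; (2) match treated and control users with similar scores to reduce selection bias; (3) run a Kolmogorov-Smirnov (KS) test on the matched groups to check covariate balance; (4) compute the Average Treatment Effect (ATE) to quantify impact, e.g., a 5% lift in retention attributable to clicks.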
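The four PSM steps can be sketched on toy data. The block below is a minimal illustration, not the speaker's implementation: the synthetic users, the single confounder `x`, the hand-rolled one-feature logistic regression, and the 0.05 ground-truth effect are all assumptions made so the example runs with the standard library alone. It reports the raw KS statistic rather than a full test with a p-value, and because each treated user is matched to a control, the estimate is strictly the effect on the treated (ATT), which coincides with the ATE here only because the simulated effect is constant.

```python
import math
import random
from bisect import bisect_right

random.seed(0)
TRUE_EFFECT = 0.05  # assumed ground-truth lift, used only to build the toy data

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic users: x is a confounder (say, prior activity) that raises both
# the chance of being treated and the outcome itself.
users = []
for _ in range(1000):
    x = random.gauss(0.0, 1.0)
    t = 1 if random.random() < sigmoid(x) else 0
    y = 0.5 * x + TRUE_EFFECT * t + random.gauss(0.0, 0.1)
    users.append((x, t, y))

def fit_propensity(data, lr=0.5, epochs=200):
    """Step 1: one-feature logistic regression fitted by gradient descent."""
    w, b, n = 0.0, 0.0, len(data)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, t, _y in data:
            err = sigmoid(w * x + b) - t
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return lambda x: sigmoid(w * x + b)

def ks_statistic(a, b):
    """Step 3: two-sample KS statistic (largest gap between empirical CDFs)."""
    a, b = sorted(a), sorted(b)
    return max(abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
               for v in a + b)

def mean(values):
    return sum(values) / len(values)

propensity = fit_propensity(users)
treated = [u for u in users if u[1] == 1]
control = [u for u in users if u[1] == 0]

# Step 2: 1-nearest-neighbor matching on the propensity score, with replacement.
control_scores = [(propensity(c[0]), c) for c in control]
matched = []
for tr in treated:
    p = propensity(tr[0])
    _, best = min(control_scores, key=lambda sc: abs(sc[0] - p))
    matched.append(best)

# Step 3: covariate balance before and after matching (smaller is better).
ks_before = ks_statistic([u[0] for u in treated], [u[0] for u in control])
ks_after = ks_statistic([u[0] for u in treated], [u[0] for u in matched])

# Step 4: naive difference in means versus the matched-pair estimate.
naive = mean([u[2] for u in treated]) - mean([u[2] for u in control])
att = mean([tr[2] - m[2] for tr, m in zip(treated, matched)])

print(f"KS on x before matching: {ks_before:.3f}, after: {ks_after:.3f}")
print(f"naive difference: {naive:.3f}, matched estimate: {att:.3f}")
```

In this setup the naive difference absorbs the confounder's influence, while the matched estimate should land much closer to the assumed effect. In practice a library model would replace the hand-written regression, and balance would be checked on every covariate.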

4. Causal Machine Learning – Differentiates between using causal inference as a tool for ML and using ML for causal inference; discusses heterogeneous treatment effect (HTE) analysis for retention and activity, illustrating quadrant classification of treatment effects.
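One way to picture the quadrant classification is to estimate, per user segment, the treatment effect on retention and on activity and then label each segment by the signs of the two effects. The sketch below assumes data from a randomized experiment so that a simple difference in means is a valid effect estimate; the segment names, the rows, and the quadrant labels are hypothetical and not taken from the talk.

```python
from collections import defaultdict

# (segment, treated, retained, active_minutes): made-up rows from an
# assumed randomized experiment, two treated and two control users each.
rows = [
    ("new", 1, 1, 10), ("new", 1, 1, 12), ("new", 0, 0, 8), ("new", 0, 1, 6),
    ("casual", 1, 1, 5), ("casual", 1, 1, 3),
    ("casual", 0, 0, 6), ("casual", 0, 1, 6),
    ("core", 1, 0, 20), ("core", 1, 1, 22), ("core", 0, 1, 18), ("core", 0, 1, 16),
    ("dormant", 1, 0, 1), ("dormant", 1, 0, 1),
    ("dormant", 0, 1, 2), ("dormant", 0, 0, 4),
]

def mean(values):
    return sum(values) / len(values)

def segment_effects(rows):
    """Per-segment effect = treated mean minus control mean, per outcome."""
    groups = defaultdict(lambda: {0: [], 1: []})
    for segment, treated, retained, minutes in rows:
        groups[segment][treated].append((retained, minutes))
    effects = {}
    for segment, arms in groups.items():
        tau_ret = mean([r for r, _ in arms[1]]) - mean([r for r, _ in arms[0]])
        tau_act = mean([m for _, m in arms[1]]) - mean([m for _, m in arms[0]])
        effects[segment] = (tau_ret, tau_act)
    return effects

def quadrant(tau_ret, tau_act):
    """Label a segment by the sign of each effect (zero counts as 'down')."""
    ret = "retention up" if tau_ret > 0 else "retention down"
    act = "activity up" if tau_act > 0 else "activity down"
    return f"{ret}, {act}"

effects = segment_effects(rows)
quadrants = {seg: quadrant(*tau) for seg, tau in effects.items()}

for seg, label in quadrants.items():
    print(seg, effects[seg], "->", label)
```

With real data the per-segment estimates would come from an HTE model and carry uncertainty, so a segment near zero on either axis should not be assigned to a quadrant with much confidence.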

5. Game‑Coin Recovery Model – Presents a mathematical model comparing two recovery strategies (100 coins vs. 60 coins) and evaluates them using Meta‑learner and HTE approaches, highlighting the calculation of lift and optimal policy selection.
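The summary does not give the talk's mathematical model, so the following is only a hedged sketch of how a meta-learner could compare the two coin strategies. It uses a T-learner: one outcome model is fitted per arm, the per-user lift is the difference between the two predictions, and the policy picks the arm with the higher predicted outcome. The feature `x` (prior weekly sessions), the outcome values, the one-feature linear models, and the tie-breaking rule are all assumptions, and users are assumed to have been randomly assigned to the arms.

```python
def fit_line(points):
    """Ordinary least squares for y = a + b * x (closed form, one feature)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x

# Toy logs (x, y) per arm; values are invented for illustration.
arm_60 = [(0, 2.0), (1, 3.0), (2, 4.0), (3, 5.0)]    # follows y = 2 + x
arm_100 = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]   # follows y = 1 + 2x

# T-learner: a separate outcome model for each treatment arm.
mu_60 = fit_line(arm_60)
mu_100 = fit_line(arm_100)

def lift(x):
    """Predicted gain from the 100-coin strategy over the 60-coin one."""
    return mu_100(x) - mu_60(x)

def choose_arm(x):
    """Pick 100 coins only when the predicted lift is positive; ties go to 60."""
    return 100 if lift(x) > 0 else 60

for sessions in (0, 1, 2, 3):
    print(sessions, round(lift(sessions), 3), choose_arm(sessions))
```

In this toy setup the lift grows with prior activity, so the policy assigns the larger amount only to more active users. A production version would typically use richer models per arm and weigh the lift against the cost of the extra coins.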

6. Causal Attribution Theory – Addresses challenges of multi‑treatment scenarios and delayed effects, proposing a framework that uses propensity scores and credit assignment to attribute lift to specific treatments.

7. Q&A Highlights – The causal approach is applied mainly in the re‑ranking stage of recommendation pipelines, can be extended to the ranking stage, and benefits from randomized traffic, which gives unbiased estimates of treatment effects.

Overall, the talk demonstrates how causal inference techniques can be systematically integrated into user growth strategies to improve retention, activity, and monetization.

machine learning, user growth, recommendation systems, causal inference, propensity score matching, uplift modeling, heterogeneous treatment effect