
Applying Survival Analysis to User Activity Modeling: Concepts, Methods, and the KwaiSurvival Deep‑Learning Framework

This article explains why traditional DAU metrics are insufficient, introduces survival analysis fundamentals and key functions, demonstrates how Kaplan‑Meier curves can characterize user activity, and presents KwaiSurvival—a deep‑learning‑based survival modeling suite with DeepSurv, DeepHit and N‑MTLR models—for practical user‑engagement and churn‑prevention use cases.

DataFunTalk

With the rapid development of internet services, conventional north‑star metrics such as DAU no longer capture the full picture of user behavior because they ignore the timing of events. The article shows how survival analysis can incorporate event timing to provide a richer understanding of user activity.

Why use survival analysis for user activity? Conventional classification models, such as logistic regression, predict only whether an event occurs. Survival analysis models both whether the event occurs and how long it takes, which lets analysts quantify how quickly users return to an app.

Survival analysis basics include definitions of event, survival time, censoring, and risk set, as well as typical data characteristics (non‑negative, discrete, no missing values). Important functions such as the survival function, hazard function, and cumulative hazard are introduced, with the Kaplan‑Meier (KM) estimator highlighted as a non‑parametric method.
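As a rough illustration of the KM estimator mentioned above, the sketch below computes S(t) as the product, over event times t_i <= t, of (1 - d_i / n_i), where d_i is the number of events at t_i and n_i is the size of the risk set just before t_i. The function name `kaplan_meier` and the toy data are assumptions made for this example and do not come from the article.

```python
from collections import Counter

def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    durations: time until the event or until censoring, per subject
    observed:  1 if the event was seen, 0 if the subject was censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)      # everyone leaves the risk set at their own time
    at_risk = len(durations)        # risk set size before the earliest time
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects censored at t still count as at risk at t, so the
        # risk set is reduced only after the step for t has been applied.
        at_risk -= exits[t]
    return curve

# Toy data (hypothetical): days until a user returns, 0 = not yet returned.
durations = [1, 2, 2, 3, 5, 5, 7]
observed = [1, 1, 0, 1, 1, 0, 0]
curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(f"day {t}: S(t) = {s:.3f}")
```

Censored users never trigger a drop in the curve, but they shrink the risk set for later event times, which is how the estimator uses incomplete observations instead of discarding them.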

Applying the concepts: the article maps user‑level events (app launches) to survival‑analysis terminology, shows how to compute risk sets, and illustrates the construction of retention, risk, and survival curves. Visual examples compare regions with different activity patterns and demonstrate geographic clustering of user survival probabilities.
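One plausible way to perform this mapping is sketched below, assuming that the event is a user's next app launch and the survival time is the gap in days since the previous launch. The last gap of each user is censored at the end of the observation window. The helper `return_intervals` and the sample dates are hypothetical, not taken from the article.

```python
from datetime import date

def return_intervals(launch_days, window_end):
    """Convert one user's launch dates into (duration_days, observed) pairs.

    Each gap between consecutive launch days is an observed return event.
    The gap after the last launch is censored at window_end, because the
    next launch has not been seen yet.
    """
    days = sorted(set(launch_days))
    rows = [((nxt - prev).days, 1) for prev, nxt in zip(days, days[1:])]
    if days and window_end > days[-1]:
        rows.append(((window_end - days[-1]).days, 0))
    return rows

launches = [date(2021, 1, 1), date(2021, 1, 2), date(2021, 1, 5)]
rows = return_intervals(launches, window_end=date(2021, 1, 10))
print(rows)  # [(1, 1), (3, 1), (5, 0)]
```

Rows in this shape can be passed directly to a Kaplan‑Meier or Cox routine. Because one user contributes several intervals, the rows are not independent, which is one reason recurrent‑event formulations are used for this kind of data.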

Modeling user activity involves fitting Cox proportional hazards models (including the Andersen‑Gill counting‑process formulation for recurrent events) and evaluating them with the concordance index rather than AUC. Feature groups such as demographics, consumption behavior, and recommendation strategy are used, and SHAP values are employed to rank feature contributions both globally and locally.
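To make the evaluation metric concrete, here is a minimal sketch of Harrell's concordance index. It measures the fraction of comparable pairs in which the subject with the earlier event also received the higher predicted risk. This version is a simplification: it ignores pairs with tied durations, and the function name and toy data are assumptions for illustration.

```python
def concordance_index(durations, observed, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event and
    a strictly shorter duration than subject j. Tied risk scores count as
    half a concordant pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not observed[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

durations = [2, 4, 6, 8]
observed = [1, 1, 0, 1]
risk_scores = [0.9, 0.3, 0.5, 0.1]
c_index = concordance_index(durations, observed, risk_scores)
print(c_index)
```

Unlike AUC, which scores a binary label at one fixed horizon, the concordance index compares the ordering of event times and handles censored subjects, so it suits time‑to‑event predictions better.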

The KwaiSurvival framework is a Python‑based deep‑learning survival analysis library developed by Kwai. It integrates three models: DeepSurv (neural‑network Cox), Neural Multitask Logistic Regression (N‑MTLR), and DeepHit (enhanced with ResNet and rank loss). Comparative experiments on 700,000 records with 74 features show that DeepHit outperforms traditional Cox in concordance while maintaining similar training time.
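For intuition about the "neural‑network Cox" idea behind DeepSurv, the sketch below computes the negative Cox partial log‑likelihood that such a model minimizes, with the network output standing in for the linear predictor. It is a plain‑Python illustration and not KwaiSurvival's code: the function name, the Breslow‑style handling of ties, and the averaging over events are all assumptions made here.

```python
import math

def neg_cox_partial_log_likelihood(durations, observed, log_risks):
    """Average negative Cox partial log-likelihood over observed events.

    log_risks are the model outputs h(x), e.g. from a neural network.
    For each observed event i, the term is
        h_i - log(sum of exp(h_j) over all j with duration_j >= duration_i).
    """
    total = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(durations, observed)):
        if not e_i:
            continue
        risk_set = [log_risks[j] for j, t_j in enumerate(durations) if t_j >= t_i]
        m = max(risk_set)  # log-sum-exp shift for numerical stability
        log_sum = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += log_risks[i] - log_sum
        n_events += 1
    if n_events == 0:
        raise ValueError("no observed events")
    return -total / n_events

durations = [1, 2, 3]
observed = [1, 1, 1]
good = neg_cox_partial_log_likelihood(durations, observed, [2.0, 1.0, 0.0])
bad = neg_cox_partial_log_likelihood(durations, observed, [0.0, 1.0, 2.0])
print(good < bad)  # True: higher risk for earlier events lowers the loss
```

In a real training loop this quantity would be written with a differentiable tensor library so that gradients flow back into the network. The quadratic loop here is kept only for readability.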

The source code and models are publicly available on GitHub (https://github.com/kwaiDA/KwaiSurvival/tree/kwaiDA-liuziyue), inviting the community to contribute additional survival‑analysis tools.

Tags: deep learning, survival analysis, Cox regression, KM curve, KwaiSurvival, user activity