
CausalMMM: Learning Causal Structure for Marketing Mix Modeling

CausalMMM introduces an encoder‑decoder framework that automatically discovers heterogeneous, interpretable causal graphs among advertising channels while modeling temporal decay and saturation. Built on Granger‑causality‑based variational inference, it achieves over a 5.7% improvement in causal structure learning and significant GMV prediction gains on Alibaba's data.

Alimama Tech

Abstract: Marketing Mix Modeling (MMM) is widely used to predict total GMV and allocate advertising budgets. Traditional regression‑based MMM struggles in complex scenarios, and existing causal MMM approaches assume a fixed, known causal graph. This work defines a new causal MMM problem: automatically discovering interpretable causal structures from data while improving GMV prediction. Two key challenges are addressed: (1) causal heterogeneity across advertisers and (2) marketing response patterns such as decay and saturation. The proposed CausalMMM integrates Granger causality into a variational inference framework, learns channel‑wise causal graphs, and predicts GMV under normalized temporal and saturation response constraints. Experiments on Alibaba's real data show a >5.7% improvement in causal structure learning and significant GMV prediction gains.

Background: Alibaba's advertising ecosystem includes search, display, live, short video, and brand channels. Advertisers need to allocate budgets across these channels to maximize GMV and ROI. Traditional MMM either regresses channel spend on GMV, ignoring inter‑channel interactions, or relies on pre‑defined causal graphs that cannot capture heterogeneity among stores. Hence, dynamic causal discovery is essential.

Method: CausalMMM is an encoder‑decoder architecture. The Causal Relational Encoder encodes historical spend and targets into edge representations on a fully connected graph, uses a GNN to aggregate global information, and applies Gumbel‑Softmax sampling to obtain a discrete causal adjacency matrix. The Marketing Response Decoder models two response patterns: (a) temporal carry‑over effects using an RNN‑augmented GNN, and (b) saturation effects via a learnable S‑curve (Hill function) parameterized by neural networks for α and γ. Variational inference optimizes the evidence lower bound, combining data likelihood and KL divergence to a causal‑graph prior.
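To make the two building blocks above concrete, here is a minimal NumPy sketch of (a) Gumbel‑Softmax sampling of a soft channel‑to‑channel adjacency matrix and (b) the Hill saturation curve. This is an illustrative toy, not the paper's implementation: the function names, the 4‑channel setup, the temperature, and the α/γ values are all assumptions chosen for the example (in CausalMMM, α and γ are produced by neural networks).

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=0.5):
    """Differentiable ~one-hot sample per edge via the Gumbel-Softmax trick."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0,1) noise
    y = (logits + g) / tau
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def hill(spend, alpha, gamma):
    """Hill saturation S-curve: diminishing returns as spend grows."""
    return spend**alpha / (spend**alpha + gamma**alpha)

# Toy setup: 4 channels, logits over {no-edge, edge} for each ordered pair
n = 4
edge_logits = rng.normal(size=(n, n, 2))
sample = gumbel_softmax(edge_logits, tau=0.5)
adj = sample[..., 1]            # soft adjacency; hardens as tau -> 0
np.fill_diagonal(adj, 0.0)      # no self-loops

spend = np.linspace(0.0, 10.0, 5)
print(np.round(hill(spend, alpha=2.0, gamma=3.0), 3))
```

Lowering `tau` pushes each sampled edge toward a hard 0/1 decision, which is how a discrete causal graph can be learned with gradient descent.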

Experiments: The model is evaluated on synthetic data with known causal graphs and on Alibaba's real advertising dataset. CausalMMM is compared against seven baselines, including Granger‑based causal methods (Linear Granger, NGC, GVAR, InGRA) and standard MMM predictors (LSTM, Wide&Deep, BTVC). Results show that CausalMMM consistently outperforms baselines in causal discovery accuracy and GMV prediction MSE across different forecast horizons (1, 7, 30 steps). Ablation studies (CM‑FULL, CM‑MARKOV, CM‑RW) confirm the contribution of each component.
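The multi‑horizon MSE evaluation described above can be sketched as follows. This is a generic illustration of the metric, not the paper's evaluation code; `horizon_mse` and the toy GMV series are assumptions for the example.

```python
import numpy as np

def horizon_mse(y_true, y_pred, horizons=(1, 7, 30)):
    """MSE over the first h forecast steps, for each horizon h."""
    return {h: float(np.mean((y_true[:h] - y_pred[:h]) ** 2)) for h in horizons}

# Toy 30-step GMV series vs. a noisy forecast
rng = np.random.default_rng(1)
y = np.linspace(100.0, 130.0, 30)
yhat = y + rng.normal(scale=2.0, size=30)
print(horizon_mse(y, yhat))
```

Reporting the metric per horizon separates short‑term accuracy (1 step) from the compounding error of long‑range forecasts (30 steps).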

Conclusion: CausalMMM jointly learns heterogeneous causal structures and marketing response patterns, delivering superior GMV forecasts and interpretable channel relationships. The approach bridges causal discovery and MMM, offering a practical solution for large‑scale e‑commerce advertising optimization.

Tags: time series forecasting, Graph Neural Networks, Causal Inference, marketing mix modeling, variational inference
Written by Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.