
Causal Inference for Optimizing Advertising Budget Allocation in Fliggy Search CPC Ads

This article explains how causal inference techniques are applied to model the uplift effect of ad placement in Alibaba's Fliggy search CPC advertising, transforming budget allocation into a multi‑objective optimization problem and describing practical control methods, feature engineering, sample re‑sampling, model designs, uplift evaluation, and future research directions.

DataFunTalk

In Fliggy's search CPC advertising, budget allocation must consider not only CPC spend and advertiser ROI but also the platform's overall revenue (CPC spend plus organic transaction commission). Traditional strategies based on pCTR, pCVR, and bid optimize only from the ad exposure perspective and cannot accurately tune the platform‑wide objective.

By introducing causal inference, the ad placement is modeled as an intervention, directly predicting the uplift effect of showing an ad on business goals. This uplift serves as a linear reward or constraint for downstream optimization, supporting various online strategies.

1. Budget Allocation as a Multi‑Objective Optimization Problem

The objectives to be balanced include:

Platform monetization efficiency (eCPM)

Ad click-through rate (CTR_ad)

Ad conversion rate (CVR_ad)

Advertiser revenue (GMV_ad)

Platform profit (revenue_ad)

Constraints include total advertiser budget, ROI of the delivery plan, CTR thresholds, GMV thresholds, and profit thresholds. Different businesses, users, and products may have distinct goals and constraints, requiring tailored factor control.
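With per-plan objective and constraint coefficients, the allocation can be posed as a linear program. The sketch below is a minimal toy formulation, assuming hypothetical per-impression revenue, cost, and GMV numbers; none of these figures come from the article.

```python
import numpy as np
from scipy.optimize import linprog

# Toy example (all numbers hypothetical): allocate impressions across
# three delivery plans to maximize platform revenue, subject to an
# advertiser budget cap and a minimum-GMV threshold.
revenue_per_imp = np.array([0.8, 1.2, 0.5])   # platform revenue per impression
cost_per_imp    = np.array([0.6, 1.0, 0.3])   # advertiser spend per impression
gmv_per_imp     = np.array([5.0, 9.0, 2.0])   # advertiser GMV per impression

budget = 100.0    # total advertiser budget
min_gmv = 400.0   # GMV threshold

# linprog minimizes, so negate the revenue objective; the GMV floor
# becomes an upper bound after multiplying both sides by -1.
res = linprog(
    c=-revenue_per_imp,
    A_ub=np.vstack([cost_per_imp, -gmv_per_imp]),
    b_ub=np.array([budget, -min_gmv]),
    bounds=[(0, 200)] * 3,   # per-plan impression caps
)
print(res.x, -res.fun)       # optimal allocation and platform revenue
```

In practice each threshold in the article (CTR, GMV, profit) becomes one more row in the constraint matrix.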

2. Common Control Algorithms and Targets

PID controller (traditional engineering method)

Dual method (solve a linear program offline to obtain shadow prices used as online control factors)

Linear interpolation (fit the relationship between control variables and targets using historical data)

All three methods rely on a linear control factor, so effective control demands high-quality features that relate approximately linearly to the target.
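As a concrete illustration of the first method, here is a minimal discrete PID controller that nudges a bid multiplier toward a target KPI. The gains and the ROI trajectory are illustrative assumptions, not values from the article; real gains would be tuned per business.

```python
class PIDController:
    """Discrete PID controller for pacing a control variable
    (e.g., a bid multiplier) toward a target KPI."""

    def __init__(self, kp=0.5, ki=0.1, kd=0.05, setpoint=1.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.setpoint = setpoint
        self.integral = 0.0
        self.prev_error = None

    def update(self, measurement):
        error = self.setpoint - measurement
        self.integral += error
        derivative = 0.0 if self.prev_error is None else error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PIDController(setpoint=1.0)          # hypothetical target ROI of 1.0
multiplier = 1.0
for observed_roi in [0.6, 0.7, 0.85, 0.95]:
    multiplier += pid.update(observed_roi)  # raise bids while ROI is below target
```

Since observed ROI stays under the setpoint, every step produces a positive correction and the multiplier climbs.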

3. Causal Inference Modeling

The core idea is to model the effect of the binary treatment variable T (whether an ad is shown) on a platform KPI Y (e.g., GMV). Two implementation paths are discussed:

Randomized experiment (assume T is randomly assigned)

Feature engineering (assume all confounders are captured in the feature set X)

When these assumptions hold, the expected uplift can be estimated directly from observational data.
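The second path can be sketched on synthetic data. Below, a single discrete confounder X drives both the probability of showing the ad and the outcome; the setup and all numbers are invented for illustration. A naive treated-minus-control mean is biased, while stratifying on X (assuming X captures all confounders) recovers the true uplift.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic observational data: x is a confounder stratum, t indicates
# whether the ad was shown, y is a GMV-like outcome. True uplift is 2.0.
n = 10_000
x = rng.integers(0, 5, n)
t = rng.random(n) < 0.2 + 0.1 * x            # high strata see the ad more often
y = 1.0 * x + 2.0 * t + rng.normal(0, 1, n)

# Naive difference in means is inflated by the confounder.
naive = y[t].mean() - y[~t].mean()

# Stratify on x: average the per-stratum treated-minus-control gaps,
# weighted by each stratum's share of the population.
strata_uplift = [
    y[t & (x == s)].mean() - y[~t & (x == s)].mean() for s in range(5)
]
weights = [np.mean(x == s) for s in range(5)]
adjusted = float(np.dot(strata_uplift, weights))
print(naive, adjusted)   # naive overstates; adjusted is close to 2.0
```

This is the simplest possible adjustment; the article's approach generalizes it by learning the conditioning on X with neural models.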

4. Sample Structure Challenges

For causal inference, the treated and control groups must have similar distributions; otherwise bias accumulates. This is stricter than ordinary prediction tasks, which only require overall unbiasedness.

5. Feature Engineering

All potential confounders—features influencing both ad placement and platform efficiency—are collected, often by extending existing CTR/CVR models. A multi‑task learning framework shares representations between native search samples and ad samples, with dynamic in‑batch re‑sampling to balance their proportions.
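The exact re-sampling scheme is not spelled out in the article; the sketch below shows one plausible reading, in which each mini-batch is drawn with a fixed fraction of ad samples so the scarce ad data is not drowned out by the organic-search pool. The function name, sizes, and fraction are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

def balanced_batch(native_idx, ad_idx, batch_size=64, ad_fraction=0.5):
    """Draw a mini-batch with a fixed share of ad samples so the shared
    representation is not dominated by the much larger native pool.
    Ad samples are drawn with replacement because they are scarce."""
    n_ad = int(batch_size * ad_fraction)
    ad_part = rng.choice(ad_idx, size=n_ad, replace=True)
    native_part = rng.choice(native_idx, size=batch_size - n_ad, replace=False)
    return np.concatenate([ad_part, native_part])

native_idx = np.arange(0, 100_000)      # abundant organic-search samples
ad_idx = np.arange(100_000, 101_000)    # scarce ad samples
batch = balanced_batch(native_idx, ad_idx)
```

A "dynamic" variant would adjust `ad_fraction` over training, e.g., based on per-task loss.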

6. Sample Re‑sampling Techniques

Propensity-Score Weighting/Matching (propensity scores are hard to estimate accurately in practice)

Original‑Space Matching (directly match samples in feature space, e.g., K‑NN on page‑level CTR vectors)

These methods construct comparable treated and control groups without running online A/B tests.
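A minimal version of original-space matching can be written with a brute-force nearest-neighbor search. The feature dimensions, distributions, and group sizes below are made up for illustration; the article's page-level CTR vectors would take their place.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical feature-space matching: pair each treated sample (ad shown)
# with its nearest control sample (ad not shown), e.g., on a CTR vector.
treated = rng.normal(0.5, 0.1, size=(100, 4))   # treated-group features
control = rng.normal(0.4, 0.1, size=(500, 4))   # control-group features

# Pairwise Euclidean distances, shape (treated, control).
dists = np.linalg.norm(treated[:, None, :] - control[None, :, :], axis=2)
match_idx = dists.argmin(axis=1)         # nearest control for each treated
matched_control = control[match_idx]

# Matching should shrink the largest per-feature mean gap between groups.
gap_before = np.abs(treated.mean(0) - control.mean(0)).max()
gap_after = np.abs(treated.mean(0) - matched_control.mean(0)).max()
print(gap_before, gap_after)
```

At production scale the O(treated x control) distance matrix would be replaced by an approximate K-NN index, but the balancing idea is the same.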

7. Model Design for Uplift

Direct Method with shortcuts (ResNet‑style architecture separating user conversion modeling and ad‑effect modeling, then linearly combining them)

Domain‑Adaptation Multi‑Task Learning (treat ad‑on/off as separate tasks, sharing a common representation)

Effect‑Net (explicitly model the treatment effect by adding the uplift to the control‑group prediction in the final layer)

Experiments show that incorporating sample matching or explicit uplift modeling improves the uplift Qini score, even when AUC remains similar.
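The Effect-Net idea of adding the uplift to the control-group prediction in the final layer can be sketched as a tiny linear forward pass. The two-head structure is from the article's description; the function name, weights, and inputs are illustrative assumptions, and real heads would be deep networks over a shared representation.

```python
import numpy as np

def effect_net_forward(x, t, w_base, w_uplift):
    """Effect-Net-style final layer (details assumed): a control-outcome
    head and an uplift head share the input, and the uplift is added to
    the control prediction only when the treatment is on (t = 1)."""
    y_control = x @ w_base      # predicted outcome without the ad
    uplift = x @ w_uplift       # predicted treatment effect
    return y_control + t * uplift, uplift

x = np.array([[1.0, 0.5], [0.2, 0.8]])
w_base, w_uplift = np.array([0.3, 0.1]), np.array([0.05, 0.2])
pred_on, uplift = effect_net_forward(x, 1, w_base, w_uplift)
pred_off, _ = effect_net_forward(x, 0, w_base, w_uplift)
# By construction, the on/off prediction gap equals the modeled uplift.
print(pred_on - pred_off)
```

This structure is what makes the uplift explicit: it can be read directly off the uplift head instead of being recovered by subtracting two separately trained models.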

8. Uplift Evaluation

Uplift is evaluated via quantile (Qini) analysis: samples are bucketed by predicted uplift, and the average label difference between treated and control is computed within each bucket. The resulting cumulative curve resembles an ROC curve; a larger area under it indicates better uplift ranking.
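The bucketed evaluation described above can be implemented directly. This is a sketch of the quantile analysis as the article describes it (not the textbook Qini coefficient), on synthetic data where the true uplift grows with the model score, so the curve should rise fastest in the earliest buckets.

```python
import numpy as np

def uplift_curve(pred_uplift, t, y, n_buckets=10):
    """Bucket samples by predicted uplift (descending) and accumulate the
    per-bucket treated-minus-control mean outcome differences."""
    order = np.argsort(-pred_uplift)
    buckets = np.array_split(order, n_buckets)
    diffs = []
    for b in buckets:
        tb, yb = t[b], y[b]
        treated_mean = yb[tb].mean() if tb.any() else 0.0
        control_mean = yb[~tb].mean() if (~tb).any() else 0.0
        diffs.append(treated_mean - control_mean)
    return np.cumsum(diffs)   # cumulative curve; larger area is better

rng = np.random.default_rng(2)
n = 5_000
score = rng.random(n)          # toy "predicted uplift"
t = rng.random(n) < 0.5        # randomized treatment
y = (rng.random(n) < 0.1 + 0.2 * score * t).astype(float)
curve = uplift_curve(score, t, y)
print(curve)
```

A random-scoring model would trace a roughly straight diagonal here, which is the baseline the cumulative curve is compared against.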

9. Future Directions

Exploring the relationship between causal reasoning and deep representation learning

Balanced Representation Learning to obtain confounder‑invariant embeddings while preserving causal effect estimation

Addressing the tension between distribution alignment and preserving true confounder effects

The article concludes with a call for community engagement and further research on causal inference for advertising strategies.

Tags: e-commerce, advertising, machine learning, causal inference, budget allocation, uplift modeling
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
