Laser: Latent Surrogate Representation Learning for Long-Term Effect Estimation in Ride-Hailing Markets
Laser (LAtent SurrogatE Representation learning) estimates long‑term ride‑hailing market effects by inferring hidden surrogate variables from short‑term outcomes using an identifiable variational autoencoder (iVAE) and inverse‑probability weighting, thereby reducing experiment cost and latency while achieving more accurate causal‑effect predictions than existing baselines.
In the two‑sided ride‑hailing market, quantitative strategies play a crucial role in balancing supply and demand. Their impact can be divided into short‑term value (immediate intervention on current supply‑demand) and long‑term value (intervention that influences future supply‑demand).
From a technical perspective, the strategy is treated as a treatment variable (t) and its impact on the market as an outcome (y). A model that estimates the causal effect of the treatment on the outcome is called an uplift model.
Short‑term effects can be estimated with conventional causal inference models such as Double Machine Learning (DML) or Generalized Random Forests (GRF). Long‑term effect estimation, however, usually requires costly and time‑consuming long‑term experiments, which suffer from two problems: lack of timeliness and high cost.
To address these challenges, the Didi MPT team collaborated with Prof. Cai Ruichu from Guangdong University of Technology and proposed a method that uses short‑term outcomes as surrogate indices to evaluate long‑term value. The method, named Laser (LAtent SurrogatE Representation learning), builds a model that learns latent surrogate variables from short‑term data, thereby improving the timeliness and reducing the cost of long‑term effect estimation.
Surrogate Index Background
The ideal situation is to conduct long‑term experiments, but due to cost constraints we must infer long‑term effects from short‑term data. We therefore use short‑term outcome variables as surrogate indices (observed surrogate So and latent surrogate Sl) to estimate long‑term outcomes.
Two main difficulties arise: (1) latent surrogates Sl are unobservable and cannot be used as features; (2) it is hard to distinguish observable surrogates So from proxy variables p.
The Laser method is designed to solve these problems and estimate long‑term causal effects.
Model Assumptions
Standard causal inference assumptions: SUTVA, Overlap, Unconfoundedness.
Comparability Assumption: p(y|x,so,sl) = p_exp(y|x,so,sl), i.e., the conditional distribution of the long‑term outcome given covariates and surrogates is the same in the observational data as in the experimental data.
Partially Latent Surrogacy Assumption: y ⟂ t | so, sl, x, i.e., conditional on the observed surrogate so, the latent surrogate sl, and covariates x, the long‑term outcome is independent of the treatment — the surrogates fully mediate the treatment's long‑term effect.
Model Structure
The framework consists of two stages:
Representation Learning Stage: An iVAE is used to recover the latent surrogate sl. The inference network takes the treatment t, covariates x, and short‑term outcomes m (comprising the observable surrogate so and the proxy p) as inputs. The variational posterior q(s|x,t,m) is a Gaussian whose parameters μ̂ and σ̂ are produced by the network.
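A minimal numpy sketch of this inference step, assuming a one‑hidden‑layer MLP encoder and the standard reparameterization trick (all layer sizes, names, and the toy data below are illustrative assumptions, not Didi's implementation): the network maps (t, x, m) to μ̂ and log σ̂², and a latent sample s is drawn from the resulting Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_params(d_in, d_hidden, d_out):
    """Random weights for a one-hidden-layer MLP (illustrative only)."""
    return {
        "W1": rng.normal(0, 0.1, (d_in, d_hidden)), "b1": np.zeros(d_hidden),
        "W2": rng.normal(0, 0.1, (d_hidden, 2 * d_out)), "b2": np.zeros(2 * d_out),
    }

def encode(params, t, x, m):
    """Variational posterior q(s | x, t, m): returns (mu, log_var)."""
    h = np.tanh(np.concatenate([t, x, m], axis=1) @ params["W1"] + params["b1"])
    out = h @ params["W2"] + params["b2"]
    mu, log_var = np.split(out, 2, axis=1)
    return mu, log_var

def reparameterize(mu, log_var):
    """Sample s = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Toy batch: treatment t, covariates x, short-term outcomes m = (s_o, p).
n, d_x, d_m, d_s = 8, 4, 3, 2
t = rng.integers(0, 2, (n, 1)).astype(float)
x = rng.standard_normal((n, d_x))
m = rng.standard_normal((n, d_m))

params = mlp_params(1 + d_x + d_m, 16, d_s)
mu, log_var = encode(params, t, x, m)
s = reparameterize(mu, log_var)  # one latent-surrogate sample per unit
```

In a real iVAE the encoder is trained jointly with the generative network, and the prior on s is conditioned on (x, t) to obtain identifiability; the sketch only shows the forward pass.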
Effect Estimation Stage: Inverse Probability Weighting (IPW) is applied to estimate the long‑term causal effect. The predicted long‑term outcome ŷ is obtained from a neural network, and the propensity score e(x) = E(t|x) (i.e., P(t=1|x) for a binary treatment) is used for weighting.
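The IPW step can be sketched end to end on synthetic data (the logistic‑regression propensity model and the data‑generating process below are assumptions for illustration, not the paper's setup): fit e(x), then weight treated and control outcomes by 1/e(x) and 1/(1−e(x)) respectively.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_propensity(x, t, lr=0.1, steps=2000):
    """Logistic regression for e(x) = P(t=1 | x), fit by gradient descent."""
    w = np.zeros(x.shape[1]); b = 0.0
    for _ in range(steps):
        e = 1 / (1 + np.exp(-(x @ w + b)))
        g = e - t
        w -= lr * x.T @ g / len(t)
        b -= lr * g.mean()
    return 1 / (1 + np.exp(-(x @ w + b)))

def ipw_ate(y, t, e):
    """Inverse-probability-weighted average treatment effect."""
    e = np.clip(e, 1e-3, 1 - 1e-3)  # clip to avoid extreme weights
    return np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))

# Synthetic confounded data: treatment depends on x, true effect = 2.0.
n = 20000
x = rng.standard_normal((n, 2))
t = (rng.random(n) < 1 / (1 + np.exp(-x[:, 0]))).astype(float)
y = 2.0 * t + x[:, 0] + 0.1 * rng.standard_normal(n)

e_hat = fit_propensity(x, t)
ate = ipw_ate(y, t, e_hat)  # close to the true effect of 2.0
```

A naive difference of means here would be biased upward, because units with large x[:, 0] are both more likely to be treated and have higher outcomes; the propensity weights correct for that imbalance.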
The generative network (MLP) models the conditional distributions p(m|x) and p(y|s,x) as Gaussian, with parameters learned from data.
The overall objective combines the ELBO for the VAE components and a log‑likelihood loss for the long‑term outcome y.
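The two terms can be written down concretely. A hedged numpy sketch, assuming Gaussian likelihoods for m and y and a standard‑normal prior on the latent (the exact factorization and prior in the paper may differ): the negative ELBO is the posterior‑prior KL minus the reconstruction log‑likelihoods.

```python
import numpy as np

def gaussian_log_lik(x, mu, log_var):
    """log N(x; mu, diag(exp(log_var))), summed over dimensions, per sample."""
    return -0.5 * np.sum(log_var + np.log(2 * np.pi)
                         + (x - mu) ** 2 / np.exp(log_var), axis=1)

def kl_diag_gaussian(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), per sample."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1 - log_var, axis=1)

def laser_loss(y, y_mu, y_log_var, m, m_mu, m_log_var, q_mu, q_log_var):
    """Negative ELBO: KL of the posterior q(s|x,t,m) minus the
    reconstruction log-likelihoods of short-term m and long-term y."""
    rec = (gaussian_log_lik(m, m_mu, m_log_var)
           + gaussian_log_lik(y, y_mu, y_log_var))
    return np.mean(kl_diag_gaussian(q_mu, q_log_var) - rec)
```

The generative network supplies (m_mu, m_log_var) and (y_mu, y_log_var), the inference network supplies (q_mu, q_log_var), and the whole model is trained by minimizing this loss.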
Evaluation
The average mean absolute percentage error (MAPE) of the causal‑effect estimates is used as the evaluation metric.
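For concreteness, MAPE compares estimated effects against ground truth as a relative error (a standard definition; the exact variant used in the paper's experiments is not stated):

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_pred - y_true) / y_true)) * 100

# Two effects, each estimated with 10% relative error -> MAPE of 10.0.
err = mape([2.0, 4.0], [1.8, 4.4])
```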
Laser is compared with two baselines, "Sind‑Linear" and "Sind‑MLP", across several offline datasets. The results show consistent improvements in long‑term effect estimation.
Conclusion and Outlook
By leveraging short‑term surrogate indices and a dedicated model architecture (Laser), the method addresses the timeliness and cost challenges of long‑term effect estimation. Offline experiments validate its effectiveness, and future production deployments will be shared as the approach matures.
Didi Tech
Official Didi technology account