Sales Forecasting in Alibaba Health's Pharmaceutical E‑commerce: Business Background, Algorithm Solutions, and Scenario Exploration
The article details a comprehensive presentation on Alibaba Health's pharmaceutical e‑commerce sales forecasting, covering supply‑chain challenges, the evolution of time‑series prediction methods, a full data‑to‑model pipeline, change‑point detection, handling imbalanced data, multi‑model fusion, and specialized seasonal and long‑sequence forecasting techniques.
This article presents a sharing session titled "Sales Forecasting in Alibaba Health's Pharmaceutical E‑commerce", organized by DataFun, featuring guest speaker Wang Chaoyun, an algorithm engineer at Alibaba Health.
01 Business Background of Pharmaceutical E‑commerce Supply‑chain Forecasting
The supply chain faces severe uncertainty, most notably the bullwhip effect, in which small demand fluctuations at the downstream (consumer) end are amplified as they propagate upstream, driving up costs, inefficiency, and service‑level risk.
Forecasting serves as the first line of defense, using data‑driven algorithms to predict future demand fluctuations, enabling better resource allocation, efficiency, and cost reduction.
Forecasting tasks are defined by metric granularity (e.g., GMV, sales volume), time granularity (long‑term, medium‑term, short‑term, real‑time), and spatial granularity (store, category, region, etc.). The end‑users of forecasts (optimization algorithms vs. business users) influence model priorities such as accuracy, stability, interpretability, and real‑time performance.
The overall task is a time‑series prediction: given historical data, predict a target metric over a future horizon.
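In symbols (notation is mine, not from the talk), the task is to estimate

```latex
\hat{y}_{T+1:T+H} = f\left(y_{1:T},\, x_{1:T+H};\, \theta\right)
```

where $y_{1:T}$ is the observed sales history, $x$ collects covariates such as price, promotions, and weather, $H$ is the forecast horizon, and $\theta$ the model parameters.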
02 Alibaba Health's Prediction Algorithm Solutions
2.1 Evolution of Time‑Series Prediction Techniques
Three major categories are discussed:
Statistical methods: stable and interpretable but lower accuracy; they model one series at a time and cannot easily incorporate covariates.
Feature engineering + machine‑learning models: higher accuracy and efficiency, but require manual tuning and have weaker temporal representation.
Deep learning models (CNN, RNN, Attention): high accuracy and flexibility, yet often limited by data scale and stability in time‑series contexts.
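As a concrete baseline from the first family, simple exponential smoothing illustrates why statistical methods are stable and interpretable but limited: the forecast is just a recency‑weighted average of one series' past. A minimal pure‑Python sketch (the function and parameter names are mine):

```python
def ses_forecast(history, alpha=0.5):
    """Simple exponential smoothing: forecast = recency-weighted average of the past.

    alpha near 1 trusts recent points; alpha near 0 averages the whole history.
    Returns a flat forecast for every future step -- no covariates, no trend.
    """
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return level
```

The single smoothing parameter makes the model easy to explain to business users, which is exactly the interpretability/accuracy trade‑off the comparison above describes.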
2.2 Alibaba Health's End‑to‑End Forecasting Pipeline
The pipeline consists of five stages: raw data, data processing, feature engineering, model building, and prediction adjustment.
(1) Raw Data includes outbound logs, orders, product attributes, pricing, marketing activities, channel traffic, and weather.
(2) Data Processing covers smoothing sales, order splitting, attribute normalization, price validation, date alignment, and outlier handling.
(3) Feature Engineering creates four feature groups: sales features (anomaly detection, smoothing, statistics), product features (status, price, attributes, brand), category features (sales share, statistics, cross‑features), and marketing features (promotion type, price, timing, intensity, channel attributes).
(4) Model Building addresses two business scenarios (daily forecasting, promotion pre‑heating) and four product types (regular, long‑tail, new, seasonal). Models include conventional time‑series, machine‑learning, and deep‑learning models (e.g., DeepAR). Prediction modes are point estimates, interval forecasts, and probabilistic forecasts, evaluated by WMAPE, MIN/MAX accuracy, and bias attribution.
(5) Prediction Adjustment handles short‑term trend capture, forecast validation, traffic trend integration, and post‑promotion long‑term adjustments.
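Of the evaluation metrics named in stage (4), WMAPE is the most common in retail forecasting. A minimal sketch under the usual definition (total absolute error divided by total actual sales, so high‑volume SKUs dominate the score):

```python
def wmape(actual, predicted):
    """Weighted MAPE: sum of absolute errors over the sum of actuals."""
    total = sum(actual)
    if total == 0:
        raise ValueError("WMAPE is undefined when actual sales sum to zero")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / total
```

Unlike plain MAPE, WMAPE does not blow up on near‑zero actuals, which matters for sparse long‑tail SKUs.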
2.3 Change‑Point Detection
When a product experiences sudden marketing or lifecycle changes, the data distribution shifts. The solution cuts sequences at detected change points during training and uses the nearest change point as a feature during inference. Detection relies on statistical property changes (mean, variance, correlation, spectral density) and involves loss function choices, search methods (optimal like PELT or approximate sliding window), and constraints (fixed or variable number of change points).
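The approximate sliding‑window search mentioned above can be sketched in a few lines: compare the means of adjacent windows and flag points where the gap exceeds a threshold. The window width and threshold here are illustrative; an optimal search such as PELT would minimize a penalized cost over all segmentations instead:

```python
def sliding_window_changepoints(series, width=5, threshold=3.0):
    """Flag indices where the mean shifts between adjacent windows.

    Returns all candidate indices; a real detector would merge consecutive
    candidates (e.g., keep the local maximum of the gap) into one change point.
    """
    points = []
    for t in range(width, len(series) - width + 1):
        left = series[t - width:t]
        right = series[t:t + width]
        gap = abs(sum(left) / width - sum(right) / width)
        if gap > threshold:
            points.append(t)
    return points
```

The same statistic family (mean, variance, correlation) can be swapped into the `gap` computation to detect other kinds of distribution shift.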
2.4 Handling Imbalanced Regression Data
Sales data are heavily skewed, making standard L1/L2 losses biased. Two families of solutions are presented:
Data‑based: Log transform, Box‑Cox, label distribution smoothing (LDS).
Model‑based: Poisson regression, Tweedie regression, feature distribution smoothing (FDS).
Key methods highlighted are Tweedie regression (captures zero‑inflated sales), LDS (weights samples via kernel‑smoothed label distribution), and FDS (smooths feature statistics with a calibration layer).
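A minimal pure‑Python sketch of the LDS idea: estimate the label distribution with a histogram, smooth it with a Gaussian kernel, and weight each sample by the inverse of the smoothed density so that rare high‑sales labels are not drowned out by the skewed majority. Bin count and kernel width are illustrative choices, not values from the talk:

```python
import math

def lds_weights(labels, num_bins=10, sigma=1.0):
    """Label distribution smoothing: inverse kernel-smoothed density as sample weight."""
    lo, hi = min(labels), max(labels)
    width = (hi - lo) / num_bins or 1.0
    # empirical label histogram
    hist = [0.0] * num_bins
    idx = [min(int((y - lo) / width), num_bins - 1) for y in labels]
    for i in idx:
        hist[i] += 1
    # convolve the histogram with a Gaussian kernel
    radius = int(3 * sigma)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    smooth = []
    for b in range(num_bins):
        v = sum(kernel[k + radius] * hist[b + k]
                for k in range(-radius, radius + 1) if 0 <= b + k < num_bins)
        smooth.append(v)
    # rare labels fall in low-density bins and receive larger weights
    return [1.0 / smooth[i] for i in idx]
```

These weights then multiply the per‑sample L1/L2 loss during training; FDS applies the same smoothing trick to feature statistics rather than labels.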
2.5 Multi‑Model Fusion via Time‑Series Clustering
Different scenarios call for different models. The approach extracts features from each time series, clusters the series on those features, evaluates per‑cluster residuals, and selects or stacks models accordingly.
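The per‑cluster selection step can be sketched as a simple backtest: hold out the last point of each series in a cluster and keep whichever candidate model accumulates the lowest absolute error. The two toy forecasters here are illustrative stand‑ins for the real model zoo:

```python
def naive_last(history):
    """Persistence forecast: repeat the last observation."""
    return history[-1]

def moving_average(history, k=3):
    """Forecast the mean of the last k observations."""
    return sum(history[-k:]) / k

def select_model(cluster_series, models):
    """Backtest each model on the held-out last point of every series in the cluster."""
    scores = {name: 0.0 for name in models}
    for s in cluster_series:
        history, actual = s[:-1], s[-1]
        for name, forecaster in models.items():
            scores[name] += abs(forecaster(history) - actual)
    return min(scores, key=scores.get)
```

Stacking would instead fit a meta‑model on the per‑cluster residuals rather than picking a single winner.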
03 Exploration of Specific Scenario Forecasts
3.1 Seasonal Forecasting
Some medicines exhibit clear seasonal patterns (e.g., herbal remedies in summer, cold medicines in winter). The goal is to capture these patterns early to reduce stock‑outs and over‑stock.
Key steps:
Determine seasonality using domain knowledge, time‑domain analysis, and time‑frequency methods (DFT + autocorrelation).
Detect period length via spectral analysis and autocorrelation testing.
Build models using STL decomposition and incorporate covariates such as temperature.
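The autocorrelation test for period length can be sketched in pure Python: compute the autocorrelation at each candidate lag and take the lag where it peaks (starting from lag 2 to skip the trivial lag‑1 correlation). In practice this is cross‑checked against the spectral peak from the DFT:

```python
def detect_period(series, max_lag=None):
    """Return the lag with the highest autocorrelation (candidate seasonal period)."""
    n = len(series)
    max_lag = max_lag or n // 2
    mean = sum(series) / n
    var = sum((v - mean) ** 2 for v in series)
    best_lag, best_acf = 0, float("-inf")
    for lag in range(2, max_lag + 1):
        acf = sum((series[t] - mean) * (series[t - lag] - mean)
                  for t in range(lag, n)) / var
        if acf > best_acf:
            best_lag, best_acf = lag, acf
    return best_lag
```

Once the period is confirmed, it feeds directly into the STL decomposition as the seasonal window length.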
3.2 Long‑Sequence Forecasting (Seq2Seq)
For scenarios requiring extended horizons, three solutions are compared:
Seq2Seq + Attention: LSTM encoder‑decoder with attention weighting historical sales.
Informer: ProbSparse self‑attention with O(L log L) complexity, plus self‑attention distilling that halves the sequence length between layers to accelerate long‑horizon inference.
MedFac (in‑house): Combines LSTM for long‑term patterns, CNN for recent features, SENet for channel attention, and fuses Seq2Seq and CNN outputs.
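The attention step shared by these architectures reduces to scaled dot‑product attention: score each encoded historical step against the decoder query, softmax the scores into weights, and take a weighted sum of the history. A minimal dependency‑free sketch (vectors are plain lists here; real implementations batch this in a tensor library):

```python
import math

def attention(query, keys, values):
    """Scaled dot-product attention over historical steps.

    query:  decoder state, length d
    keys:   one encoded vector (length d) per historical step
    values: one value vector per historical step
    Returns the softmax weights and the weighted context vector.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
    return weights, context
```

Informer's ProbSparse variant keeps only the queries whose score distribution deviates most from uniform, which is where the O(L log L) saving comes from.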
The presentation concludes with a thank‑you and a link to the replay.