
Machine Learning‑Based Time Series Forecasting and Anomaly Detection System at JD Search

The article describes JD Search's machine‑learning alert system, which combines offline and real‑time training, FFT‑based periodicity detection, Prophet forecasting, and DBSCAN anomaly clustering. It covers the architectural design, data preprocessing, model optimization, and distributed deployment choices that improve alert accuracy and response speed.

DataFunSummit

Traditional rule‑based alerts that rely on static thresholds often miss patterned fluctuations and generate high false‑positive rates; to address this, JD Search's data science team built a machine‑learning alert system that uses time‑series forecasting and anomaly detection to improve accuracy, timeliness, and causal explainability.

The overall architecture includes offline training tasks that load data from HDFS, perform feature engineering, and store model information and parameters in a parameter server, as well as real‑time training tasks that ingest samples from Kafka, pull the latest parameters, optionally predict, and push updated models back to the server.
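The pull/push cycle between trainers and the parameter server can be sketched as follows. This is a hypothetical in-memory illustration of the flow described above, not JD's actual implementation, which uses HDFS, Kafka, and a dedicated parameter-server service; all names here (`ParameterServer`, `prophet_qps`) are invented for the example.

```python
import threading

# Hypothetical in-memory stand-in for the parameter server: offline jobs
# push trained parameters, real-time tasks pull the latest version,
# optionally predict, and push updated models back.
class ParameterServer:
    def __init__(self):
        self._lock = threading.Lock()
        self._params = {}    # model name -> parameter dict
        self._versions = {}  # model name -> version counter

    def push(self, model, params):
        """Offline or real-time trainers push updated parameters."""
        with self._lock:
            self._params[model] = dict(params)
            self._versions[model] = self._versions.get(model, 0) + 1

    def pull(self, model):
        """Tasks pull the latest parameters before training or predicting."""
        with self._lock:
            return (dict(self._params.get(model, {})),
                    self._versions.get(model, 0))

ps = ParameterServer()
ps.push("prophet_qps", {"changepoint_prior_scale": 0.05})  # offline training
params, version = ps.pull("prophet_qps")                   # real-time task pulls
params["changepoint_prior_scale"] = 0.1                    # updated online
ps.push("prophet_qps", params)                             # pushed back
```

The version counter lets a real-time task detect whether a fresher offline model has landed since its last pull.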

Key components of the pipeline are:

FFT – Periodicity Detection: the fast Fourier transform converts the time‑domain signal to the frequency domain to identify weekly cycles; moving‑average filtering and linear regression are first applied to remove spikes and long‑term trends, and abnormal segments are replaced with a learned pattern.
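A minimal numpy sketch of this step: detrend a synthetic hourly series with a linear fit, then read the dominant period off the FFT spectrum. The hourly granularity, 8-week window, and noise levels are illustrative assumptions, not details from the article.

```python
import numpy as np

# Synthetic hourly series with a weekly cycle (period = 168 hours) plus noise.
rng = np.random.default_rng(0)
n_hours = 8 * 7 * 24
t = np.arange(n_hours)
series = 100 + 20 * np.sin(2 * np.pi * t / 168) + rng.normal(0, 2, n_hours)

# Remove the long-term trend with a linear regression, as the pipeline
# describes, so low-frequency drift does not dominate the spectrum.
trend = np.polyval(np.polyfit(t, series, 1), t)
detrended = series - trend

# Real-valued FFT; skip the DC bin, then find the dominant frequency.
spectrum = np.abs(np.fft.rfft(detrended))
freqs = np.fft.rfftfreq(n_hours, d=1.0)  # cycles per hour
peak = np.argmax(spectrum[1:]) + 1
period_hours = 1.0 / freqs[peak]

print(round(period_hours))  # dominant period in hours
```

In production the same spectrum check would confirm (or reject) an assumed weekly cycle before the forecasting stage relies on it.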

Prophet – Time‑Series Forecasting: Facebook’s Prophet model learns long‑term trends, seasonality, and holidays, handling missing values and supporting multi‑step forecasts; hyper‑parameters such as changepoint_prior_scale and seasonality_prior_scale are tuned offline.
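Prophet's additive model is essentially a trend component plus Fourier-series seasonality. The sketch below fits that decomposition directly with ordinary least squares to show the idea; it is not the Prophet API, and the data, the 14-day horizon, and the choice of K=3 Fourier pairs are illustrative assumptions. In practice one would call `prophet.Prophet` with the tuned `changepoint_prior_scale` and `seasonality_prior_scale` the article mentions.

```python
import numpy as np

# Synthetic daily series: linear trend + weekly seasonality + noise.
rng = np.random.default_rng(1)
n_days = 56
t = np.arange(n_days, dtype=float)
y = 50 + 0.5 * t + 10 * np.sin(2 * np.pi * t / 7) + rng.normal(0, 1, n_days)

def design(ts, K=3, period=7.0):
    """Design matrix: intercept, linear trend, K weekly Fourier pairs."""
    cols = [np.ones_like(ts), ts]
    for k in range(1, K + 1):
        cols.append(np.sin(2 * np.pi * k * ts / period))
        cols.append(np.cos(2 * np.pi * k * ts / period))
    return np.column_stack(cols)

# Fit trend + seasonality by least squares (the core of Prophet's model).
X = design(t)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Multi-step forecast: evaluate the fitted model 14 days ahead.
t_future = np.arange(n_days, n_days + 14, dtype=float)
forecast = design(t_future) @ beta
residuals = y - X @ beta  # these feed the DBSCAN stage below
```

The residuals from this fit are exactly what the next stage clusters to flag anomalies.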

DBSCAN – Anomaly Point Detection: Density‑based clustering on original values and Prophet residuals identifies outliers (label = ‑1) while handling arbitrary cluster shapes.
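A hedged scikit-learn sketch of this stage: cluster standardized (value, residual) pairs and treat DBSCAN's noise label (-1) as an anomaly. The toy data, the injected outliers, and the `eps`/`min_samples` values are illustrative assumptions, not tuned production settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Toy data: observed values plus forecast residuals for 200 time steps.
rng = np.random.default_rng(2)
values = rng.normal(100, 3, 200)
residuals = rng.normal(0, 1, 200)
# Inject two anomalies with large forecast residuals.
values[50], residuals[50] = 140.0, 35.0
values[120], residuals[120] = 30.0, -60.0

# Standardize both features so eps is comparable across dimensions,
# then cluster; density-based clustering handles arbitrary cluster shapes.
X = StandardScaler().fit_transform(np.column_stack([values, residuals]))
labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)

# DBSCAN marks low-density points with label -1: these are the anomalies.
anomaly_idx = np.where(labels == -1)[0]
print(sorted(anomaly_idx.tolist()))
```

Clustering on both the raw value and the Prophet residual means a point is flagged only when it is unusual relative to the forecast, not merely large in absolute terms.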

Optimization steps include using Alink to invoke Python methods in a distributed fashion, and switching Flink's partitioning from keyBy (which can skew load across subtasks when keys are unevenly distributed) to rebalance (round‑robin redistribution), keeping end‑to‑end latency under five minutes.

The system also incorporates causal inference ideas to attribute detected anomalies to root causes across multiple data streams (clicks, adds‑to‑cart, exposures), enabling faster fault isolation.

Overall, the solution demonstrates a practical integration of AI techniques—FFT, Prophet, DBSCAN—and distributed engineering to build a robust, low‑latency anomaly detection platform for large‑scale e‑commerce search data.

Tags: machine learning, anomaly detection, distributed computing, time series, DBSCAN, Prophet, FFT
Written by

DataFunSummit

Official account of the DataFun community, dedicated to sharing big data and AI industry summit news and speaker talks, with regular downloadable resource packs.
