
Kalman Filtering Attention for User Behavior Modeling in CTR Prediction

This article introduces a Kalman Filtering Attention (KFAtt) framework that enhances click‑through‑rate (CTR) prediction by modeling user behavior with a Kalman‑filter‑based attention mechanism and a frequency‑capped variant, addressing new‑interest coverage and frequency bias in e‑commerce scenarios.

JD Retail Technology

Task Background

Click‑through‑rate (CTR) prediction is a core problem in advertising, directly affecting user experience and advertiser revenue. In e‑commerce, abundant user actions (browsing, clicking, purchasing) provide rich signals, but traditional attention‑based behavior models struggle with two issues: (1) they assume the user's current interest is covered by historical actions, which fails when users show interest in new items, and (2) they treat all behaviors equally, causing high‑frequency actions to dominate the attention weights.

Existing Methods

Most user‑behavior modeling modules use classic attention mechanisms (e.g., DIN, DIEN, Transformer) that compute a weighted sum of historical behavior embeddings. These methods assign larger weights to behaviors related to the current query, but they ignore both uncovered new interests and frequency imbalance.
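As a point of reference, the weighted sum these baselines compute can be sketched as follows. This is a generic illustration of attention pooling over a behavior history, not the exact DIN/DIEN architecture; the function and argument names are hypothetical.

```python
import numpy as np

def attention_pool(query, keys, values):
    """Classic attention pooling over a user's behavior history (a sketch).

    query:  (d,)   embedding of the current search term
    keys:   (n, d) embeddings used to score each historical behavior
    values: (n, d) embeddings of the historical behaviors
    Returns a softmax-weighted sum of the values, where each weight
    reflects how related a behavior is to the current query.
    """
    scores = keys @ query            # (n,) query-key similarities
    scores -= scores.max()           # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()         # softmax normalization
    return weights @ values          # (d,) pooled interest vector
```

Note that the weights always sum to one over the *observed* history, which is exactly why this form cannot express an interest absent from the history, and why many repeats of one behavior accumulate unbounded total weight.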

Our Algorithm

We propose Kalman Filtering Attention (KFAtt), which treats each historical behavior as an independent sensor observing the user's latent interest for the current search term. Assuming the interest follows a Gaussian prior, we apply the Kalman‑filter equations to fuse the prior with the noisy observations, yielding a posterior estimate that incorporates global prior information and accounts for each sensor's (behavior's) uncertainty. When the prior term is dropped, KFAtt reduces to traditional attention, demonstrating its compatibility with existing models.
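The fusion step above can be sketched as a precision‑weighted average of the prior mean and the observations, which is the closed‑form MAP estimate for a Gaussian prior with independent Gaussian observation noise. This is a minimal sketch of the idea, assuming isotropic variances; the paper's actual parameterization of the per‑behavior uncertainties is not reproduced here, and all names are hypothetical.

```python
import numpy as np

def kfatt_pool(mu_q, sigma_q, values, sigmas):
    """Kalman-filter-style fusion of a Gaussian prior with noisy
    behavior 'sensors' (a sketch of the KFAtt principle).

    mu_q:    (d,) prior mean of the user's interest for this query
    sigma_q: scalar prior standard deviation
    values:  (n, d) behavior embeddings, treated as noisy observations
    sigmas:  (n,) per-behavior observation standard deviations
    Returns the posterior (MAP) mean: a precision-weighted average
    of the prior and all observations.
    """
    prior_prec = 1.0 / sigma_q**2
    obs_prec = 1.0 / sigmas**2                    # (n,) per-sensor precisions
    num = prior_prec * mu_q + obs_prec @ values   # (d,)
    den = prior_prec + obs_prec.sum()
    return num / den
```

As `sigma_q` grows (an uninformative prior), the result approaches a normalized weighted sum of the observations alone, mirroring how KFAtt degenerates to a standard attention pooling when the prior is removed.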

Frequency‑Capped Variant (KFAtt‑freq)

To handle severe frequency imbalance across product categories, we de‑duplicate historical search terms and aggregate behaviors per category, treating repeated observations of the same category as multiple measurements from the same sensor. By separating the system error (the sensor's deviation from the true interest) from the measurement error, the MAP solution imposes an upper bound on the total contribution of high‑frequency behaviors, mitigating frequency bias.
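The capping effect can be seen in a small sketch: averaging m repeated measurements of one sensor shrinks only the measurement variance, leaving the system variance intact, so the effective variance is sigma_m²/m + sigma_s² and a category's precision can never exceed 1/sigma_s² however often it occurs. This is an illustrative decomposition under the two‑error assumption, not the paper's exact formulation; names are hypothetical.

```python
import numpy as np

def kfatt_freq_pool(mu_q, sigma_q, cat_means, counts, sigma_m, sigma_s):
    """Frequency-capped fusion (a sketch of the KFAtt-freq idea).

    Behaviors are grouped by category beforehand: cat_means (k, d) holds
    each category's mean behavior embedding and counts (k,) how often it
    occurred. The m repeats of one category act as m measurements of one
    sensor, so the category's effective variance is
        sigma_m**2 / m + sigma_s**2,
    which bounds its precision by 1 / sigma_s**2 regardless of frequency.
    """
    prior_prec = 1.0 / sigma_q**2
    eff_var = sigma_m**2 / counts + sigma_s**2    # (k,) per-category variance
    cat_prec = 1.0 / eff_var                      # capped by 1 / sigma_s**2
    num = prior_prec * mu_q + cat_prec @ cat_means
    den = prior_prec + cat_prec.sum()
    return num / den
```

With sigma_s = 0 (no system error), the cap disappears and a sufficiently frequent category would again dominate the pooled estimate, which is exactly the bias the variant is designed to avoid.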

Experimental Results

We evaluated KFAtt and KFAtt‑freq on the Amazon product recommendation dataset, including two challenging subsets: New (queries from unseen categories) and Infreq (extremely low‑frequency categories). Both variants achieved higher AUC than state‑of‑the‑art baselines (DIN, DIEN, Transformer) across all test sets, with especially large gains on New and Infreq. Additional experiments on JD.com's large‑scale search traffic confirmed significant online CTR improvements, with inference latency comparable to the most efficient baselines.

Conclusion

The Kalman Filtering Attention framework effectively incorporates global prior knowledge and controls frequency bias, leading to more accurate and less biased user interest extraction and consequently better CTR prediction in industrial advertising systems.

Tags: Machine Learning · CTR prediction · user behavior modeling · attention mechanism · Kalman filter
Written by

JD Retail Technology

Official platform of JD Retail Technology, delivering insightful R&D news and a deep look into the lives and work of technologists.
