
Deep Spatio‑Temporal Neural Networks and Memory‑Augmented DNN for Click‑Through Rate Prediction

This article presents the design, challenges, and experimental evaluation of DSTN (with pooling, self‑attention, and interactive‑attention variants) and MA‑DNN models for CTR prediction, highlighting how temporal and contextual ad information improves accuracy and yields significant online gains in large‑scale advertising systems.

DataFunTalk

Background: Click‑through rate (CTR) estimation is a core technology for cost‑per‑click (CPC) and optimized CPC (OCPC) advertising; accurate predictions reduce acquisition costs for advertisers and increase platform revenue. Traditional models consider only the target ad, ignoring temporal and spatial context.

Problem Formulation: CTR prediction is modeled as estimating the probability that a user clicks an ad given user, query, ad, and context features. Incorporating auxiliary ads, namely previously shown ads (both clicked and unclicked) and contextual ads displayed alongside the target, can provide valuable signals.

Proposed Models:

DSTN‑P (Pooling): Embeds the target ad and each type of auxiliary ad, aggregates each auxiliary set with sum‑pooling, and projects all embeddings into a unified semantic space before a feed‑forward network.
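A minimal numpy sketch of the pooling idea described above. All names, dimensions, and the random weight matrices are illustrative assumptions, not taken from the paper's code:

```python
import numpy as np

EMB_DIM = 8

def sum_pool(aux_embeddings):
    """Aggregate a variable-size set of auxiliary-ad embeddings into one vector."""
    if len(aux_embeddings) == 0:
        return np.zeros(EMB_DIM)  # an empty auxiliary set contributes a zero vector
    return np.sum(aux_embeddings, axis=0)

rng = np.random.default_rng(0)
target = rng.normal(size=EMB_DIM)
clicked = rng.normal(size=(3, EMB_DIM))     # previously clicked ads
unclicked = rng.normal(size=(5, EMB_DIM))   # previously shown but unclicked ads
contextual = rng.normal(size=(2, EMB_DIM))  # ads shown alongside the target

# Project each pooled set into a shared semantic space with its own (here random,
# illustrative) weight matrix, then concatenate with the target embedding
# before the feed-forward network.
W_c, W_u, W_x = (rng.normal(size=(EMB_DIM, EMB_DIM)) for _ in range(3))
fused = np.concatenate([
    target,
    sum_pool(clicked) @ W_c,
    sum_pool(unclicked) @ W_u,
    sum_pool(contextual) @ W_x,
])
print(fused.shape)  # (32,)
```

Sum‑pooling makes the model robust to the variable number of auxiliary ads per impression, at the cost of treating every ad in a set as equally informative.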

DSTN‑S (Self‑Attention): Replaces sum‑pooling with a self‑attention module that weights auxiliary ads within the same type, improving signal extraction.
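A hedged sketch of self‑attention pooling over one auxiliary set, assuming a small tanh scoring network with softmax normalization within the set (the exact scoring function is an assumption):

```python
import numpy as np

def self_attention_pool(aux, W, b, h):
    """Score each auxiliary ad with a small learned network, softmax-normalize
    the scores within the set, and return the weighted sum."""
    scores = np.tanh(aux @ W + b) @ h       # one scalar score per auxiliary ad
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                # softmax over ads of the same type
    return weights @ aux                    # attention-weighted aggregate

rng = np.random.default_rng(1)
EMB_DIM, HID = 8, 4
aux = rng.normal(size=(5, EMB_DIM))         # e.g. five unclicked ads
W = rng.normal(size=(EMB_DIM, HID))
b = rng.normal(size=HID)
h = rng.normal(size=HID)

pooled = self_attention_pool(aux, W, b, h)
print(pooled.shape)  # (8,)
```

Because the weights sum to one, an informative ad can dominate the aggregate, but the softmax also forces weight onto the set even when nothing in it is relevant, which motivates the interactive variant below.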

DSTN‑I (Interactive‑Attention): Computes attention between each auxiliary ad and the target ad, avoiding softmax normalization across unrelated ads and dynamically selecting useful information.
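The interactive variant can be sketched as follows. Each auxiliary ad is scored jointly with the target ad, and each weight is an independent gate rather than a softmax share, so irrelevant ads can all receive near‑zero weight; the specific gating network here is an illustrative assumption:

```python
import numpy as np

def interactive_attention_pool(target, aux, W, b, h):
    """Score each auxiliary ad jointly with the target ad. Weights are NOT
    softmax-normalized across ads, so a set of irrelevant ads is not forced
    to absorb attention weight."""
    weighted = []
    for a in aux:
        pair = np.concatenate([target, a])  # joint target/auxiliary input
        score = np.tanh(pair @ W + b) @ h
        alpha = 1.0 / (1.0 + np.exp(-score))  # independent per-ad gate in (0, 1)
        weighted.append(alpha * a)
    return np.sum(weighted, axis=0)

rng = np.random.default_rng(2)
EMB_DIM, HID = 8, 4
target = rng.normal(size=EMB_DIM)
aux = rng.normal(size=(3, EMB_DIM))
W = rng.normal(size=(2 * EMB_DIM, HID))
b = rng.normal(size=HID)
h = rng.normal(size=HID)

pooled = interactive_attention_pool(target, aux, W, b, h)
print(pooled.shape)  # (8,)
```

Conditioning the weight on the target ad is what lets the same user history contribute differently depending on which ad is being scored.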

MA‑DNN (Memory‑Augmented DNN): Introduces two per‑user memory vectors, m_u1 (interested) and m_u0 (uninterested), that are updated via a combined cross‑entropy and mean‑squared‑error loss, allowing the model to capture long‑term user preferences at DNN‑level inference cost.
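A sketch of the combined objective under stated assumptions: the squared‑error term pulls the label‑matching memory vector toward the network's top‑layer representation of the current example, and `lam` is an illustrative trade‑off weight, not a value from the paper:

```python
import numpy as np

def ma_dnn_loss(p_click, y, mem, z, lam=0.1):
    """Combined objective: log loss on the predicted CTR plus a squared-error
    term tying the label-matching memory vector to the current high-level
    representation z."""
    ce = -(y * np.log(p_click) + (1 - y) * np.log(1 - p_click))
    mse = np.mean((mem[y] - z) ** 2)  # target memory: m_u1 if clicked, m_u0 if not
    return ce + lam * mse

rng = np.random.default_rng(3)
Z_DIM = 8
mem = {0: rng.normal(size=Z_DIM),  # m_u0: what the user tends to ignore
       1: rng.normal(size=Z_DIM)}  # m_u1: what the user tends to click
z = rng.normal(size=Z_DIM)         # top-layer representation of current sample

loss = ma_dnn_loss(p_click=0.3, y=1, mem=mem, z=z)
print(loss > 0)  # True: log loss is positive for p in (0, 1), MSE is non-negative
```

At serving time only the two cached memory vectors are read, which is why inference stays close to plain-DNN cost while still reflecting long‑term behavior.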

Model Details: All models embed single‑value, multi‑value, and numeric features (via direct lookup, sum‑pooling over the looked‑up values, or discretization followed by lookup, respectively). The fused embedding vector is fed into a multi‑layer perceptron with ReLU activations; a final sigmoid outputs the CTR estimate.
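The three embedding paths above can be sketched in a few lines of numpy; the table size, bucket boundaries, and feature ids are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
EMB_DIM = 4
vocab = rng.normal(size=(100, EMB_DIM))  # shared embedding table (illustrative size)

def embed_single(idx):
    """Single-valued feature (e.g. ad category id): direct lookup."""
    return vocab[idx]

def embed_multi(indices):
    """Multi-valued feature (e.g. query terms): look up each id, then sum-pool."""
    return vocab[indices].sum(axis=0)

def embed_numeric(value, boundaries):
    """Numeric feature (e.g. a historical CTR): discretize into a bucket id,
    then look that id up like a categorical feature."""
    bucket = int(np.digitize(value, boundaries))
    return vocab[bucket]

x = np.concatenate([
    embed_single(7),
    embed_multi(np.array([3, 12, 55])),
    embed_numeric(0.42, boundaries=np.array([0.1, 0.3, 0.5, 0.7])),
])
print(x.shape)  # (12,)
```

Each feature thus contributes a fixed-width slice, so the concatenated vector has a constant length regardless of how many values a multi-valued feature carries.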

Experimental Setup: Offline experiments on Avito, Search, and Feed datasets compare DSTN variants, MA‑DNN, and baseline models (LR, FM, DNN, Wide&Deep, DeepFM, CRF, GRU). Metrics include AUC and LogLoss. Online A/B tests evaluate real‑world impact.
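For reference, both offline metrics are simple to compute. AUC is the probability that a random clicked impression is scored above a random unclicked one, and LogLoss is the average negative log-likelihood of the labels (a minimal sketch; ties in AUC are ignored for brevity):

```python
import numpy as np

def log_loss(y, p):
    """Average negative log-likelihood of binary click labels."""
    return float(np.mean(-(y * np.log(p) + (1 - y) * np.log(1 - p))))

def auc(y, p):
    """Fraction of (positive, negative) pairs where the positive is ranked higher."""
    pos, neg = p[y == 1], p[y == 0]
    return float(np.mean(pos[:, None] > neg[None, :]))

y = np.array([1, 0, 1, 0, 0])
p = np.array([0.8, 0.2, 0.6, 0.5, 0.1])
print(round(auc(y, p), 3), round(log_loss(y, p), 3))  # 1.0 0.351
```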

Results:

All DSTN variants outperform baseline models; DSTN‑I achieves the highest gains by effectively leveraging auxiliary ads.

MA‑DNN matches DSTN performance with lower inference cost, achieving a +2.5% CTR lift online.

Analysis shows that contextual ads, clicked ads, and unclicked ads contribute differently across datasets; interactive attention best distinguishes useful from noisy signals.

System Architecture: Offline training produces models served by a Model Server; a real‑time session (RTS) component streams user behavior to the server for online inference.

Conclusions: Incorporating temporal and spatial auxiliary ad information via DSTN or a memory‑augmented DNN significantly improves CTR prediction accuracy. DSTN offers strong performance at higher complexity, while MA‑DNN provides a balanced trade‑off between effectiveness and deployment efficiency.

Tags: advertising, machine learning, deep learning, CTR prediction, KDD, memory network, spatio-temporal network
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
