HetMatch: Heterogeneous Graph Neural Network for Keyword Recommendation in Search Advertising
HetMatch is a heterogeneous graph neural network for keyword recommendation in search advertising. It tackles cold-start and large-scale challenges by hierarchically fusing node- and subgraph-level features, denoising graph convolutions, fusing metapath semantics with self-attention, matching ads and keywords as twin meta-nodes, and learning across multiple relation views. The model delivers notable recall gains offline and measurable online improvements for Alibaba's advertising tools.
Abstract
Recent years have seen a surge of interest in online advertising optimization. In search advertising, keyword recommendation is a core service for advertisers. This article introduces HetMatch, a heterogeneous-graph-learning based keyword recall model developed by Alibaba's Customer Growth team, presented at CIKM 2021. The model addresses challenges in large-scale keyword recommendation and has been deployed across multiple Alibaba advertising tools.
Background
Search ads rely on advertisers bidding on keywords to obtain exposure. Millions of advertisers manually add tens of millions of keywords daily, yet many lack the expertise to select effective keywords, leading to low exposure rates: fewer than 10% of self-selected keywords receive impressions the next day. Existing recall methods based on text matching, collaborative filtering, or topic clustering ignore rich heterogeneous behaviors and suffer from cold-start issues.
Problem Definition
A heterogeneous information network (HIN) is constructed from ads (ad), items (item), and queries (query), with multiple relation types between nodes (click, co-click, etc.). The goal is to maximize the overall top-K recall rate for each ad while restricting recalled keywords to a candidate set that shares the ad's predicted category.
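To make the evaluation target concrete, here is a minimal sketch of per-ad recall@K; the helper name and the example keywords are hypothetical, and the category-restricted candidate set is represented simply as an ordered recommendation list:

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of an ad's relevant keywords found in its top-K recommendations."""
    if not relevant:
        return 0.0
    top_k = set(recommended[:k])
    return len(top_k & set(relevant)) / len(relevant)

# Hypothetical example: 2 of this ad's 3 relevant keywords appear in the top 4.
recs = ["running shoes", "sneakers", "sandals", "trail shoes", "boots"]
rel = ["sneakers", "trail shoes", "hiking boots"]
score = recall_at_k(recs, rel, k=4)  # 2 of 3 relevant keywords recovered
```

The overall metric in the paper averages this quantity over all ads for a fixed K.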
Method
HetMatch follows a hierarchical information-fusion pipeline:
1. Node-level feature fusion: Encode the discrete and continuous features of each node into fixed-dimensional vectors, converting continuous features via quantile binning and applying type-specific neural networks to obtain node embeddings.
2. Subgraph-level feature fusion: Define two groups of metapaths, one based on purchase relationships and another on item-bridge relationships, to capture high-order semantic neighbors. Metapaths model competition among ads for the same keyword and co-click patterns between ads and items.
3. Denoising graph convolution: Extend GraphSAGE with an autoencoder-based aggregation that compresses neighbor information, reducing noise from random clicks and under-trained node representations. Top-K sampling based on actual click behavior further mitigates noisy edges.
4. Semantic fusion layer: Aggregate the embeddings produced under different metapaths with a self-attention layer (as in HAN) to yield a unified representation.
5. Twin matching: Recast ad-keyword matching as meta-node matching, aligning each ad's embedding with the mean embedding of its top-K related keywords (and vice versa) so that both sides live in homogeneous representation spaces.
6. Multi-view learning: Model multiple ad-keyword relation views (click, purchase, item-bridge) with separate view-specific networks on the ad side while sharing a single keyword embedding. Optimization uses a sampled softmax loss to maximize the scores of positive pairs and suppress irrelevant ones.
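Step 1 above can be sketched in NumPy. The bin count, embedding sizes, and the single-layer ReLU "type-specific network" are all illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def quantile_bin(values, n_bins=4):
    """Map a continuous feature to discrete bin ids via empirical quantiles."""
    edges = np.quantile(values, np.linspace(0, 1, n_bins + 1)[1:-1])
    return np.digitize(values, edges)  # ids in 0..n_bins-1

rng = np.random.default_rng(0)
ctr = rng.random(1000)                      # e.g. a continuous CTR feature
bins = quantile_bin(ctr, n_bins=4)

# Each bin id indexes an embedding table; a type-specific network then fuses
# the (concatenated) feature embeddings into one node vector.
emb_table = rng.normal(size=(4, 8))         # 4 bins -> 8-dim embeddings
node_feat = emb_table[bins]                 # (1000, 8)
W = rng.normal(size=(8, 16)) / np.sqrt(8)   # hypothetical per-type projection
node_emb = np.maximum(node_feat @ W, 0.0)   # ReLU layer for this node type
```

Quantile binning makes the discretization robust to skewed feature distributions, since each bin receives roughly the same number of nodes.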
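Step 3 (denoising graph convolution) can be illustrated as follows: keep only the K neighbors with the most observed clicks, then pass their vectors through an autoencoder bottleneck before pooling. This is a simplified sketch under assumed shapes, not HetMatch's exact aggregator; in training, the reconstruction would feed an auxiliary loss:

```python
import numpy as np

rng = np.random.default_rng(1)

def topk_neighbors(neigh_embs, click_counts, k):
    """Keep only the K neighbors with the most click interactions (edge denoising)."""
    idx = np.argsort(click_counts)[::-1][:k]
    return neigh_embs[idx]

def denoising_aggregate(neigh_embs, W_enc, W_dec):
    """Compress neighbor vectors through a low-dimensional bottleneck, then
    mean-pool; the decoder reconstruction is used for a training-time loss."""
    z = np.tanh(neigh_embs @ W_enc)   # encode to bottleneck
    recon = z @ W_dec                 # decode (for the reconstruction objective)
    return z.mean(axis=0), recon

d, bottleneck = 16, 4
neigh = rng.normal(size=(20, d))            # 20 candidate neighbors
clicks = rng.integers(0, 50, size=20)       # click counts per edge
W_enc = rng.normal(size=(d, bottleneck)) / np.sqrt(d)
W_dec = rng.normal(size=(bottleneck, d)) / np.sqrt(bottleneck)

kept = topk_neighbors(neigh, clicks, k=5)
agg, recon = denoising_aggregate(kept, W_enc, W_dec)
```

The bottleneck forces the aggregator to retain only the dominant neighbor signal, which is the intuition behind discarding random-click noise.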
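Step 4's HAN-style semantic fusion reduces, per node, to a softmax-weighted sum over metapath-specific embeddings. A minimal sketch, with a single hypothetical query vector standing in for the learned attention parameters:

```python
import numpy as np

def semantic_attention(metapath_embs, q):
    """Score each metapath-specific embedding against a learnable query vector,
    softmax the scores over metapaths, and return the weighted sum."""
    scores = metapath_embs @ q                        # (num_metapaths,)
    scores = scores - scores.max()                    # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()   # softmax over metapaths
    fused = weights @ metapath_embs                   # (dim,)
    return fused, weights

rng = np.random.default_rng(2)
embs = rng.normal(size=(3, 16))   # embeddings of one ad under 3 metapaths
q = rng.normal(size=16)           # hypothetical semantic attention query
fused, w = semantic_attention(embs, q)
```

The attention weights are shared across nodes in HAN, letting the model learn which metapath (e.g. purchase vs. item-bridge) matters most globally.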
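Steps 5 and 6 combine at training time: a view-specific ad embedding is scored against the mean embedding of its top-K related keywords (the "twin" meta-node) under a sampled softmax. The sketch below uses invented shapes, ids, and a plain dot-product scorer to show the shape of the objective, not the production loss:

```python
import numpy as np

def sampled_softmax_loss(ad_emb, pos_kw, neg_kws):
    """Negative log-likelihood of the positive pair against sampled negatives."""
    logits = np.concatenate([[ad_emb @ pos_kw], neg_kws @ ad_emb])
    logits = logits - logits.max()
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[0]

rng = np.random.default_rng(3)
dim = 16
kw_emb = rng.normal(size=(100, dim))   # shared keyword embedding table
ad_click_view = rng.normal(size=dim)   # view-specific ad embedding (click view)

# Twin matching: align the ad with the MEAN of its top-K related keywords,
# so the two sides of the match live in a homogeneous space.
topk_ids = [3, 17, 42]                 # hypothetical related-keyword ids
meta_kw = kw_emb[topk_ids].mean(axis=0)

negs = kw_emb[rng.choice(100, size=10, replace=False)]
loss = sampled_softmax_loss(ad_click_view, meta_kw, negs)
```

Sharing one keyword table across views keeps keyword representations consistent while each view-specific ad network captures a different relation (click, purchase, item-bridge).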
Offline Experiments
HetMatch was evaluated on a large-scale search-ad production dataset against baselines including term-match, DSSM, HAN, and IntentGC. It consistently improved recall across recall depths and showed notable gains in cold-start scenarios. Ablation studies confirmed the contribution of each module.
Online Experiments
A/B tests on Alibaba's 直通车 (Zhitongche) keyword recommendation tools demonstrated a 4.19% increase in adoption rate for the keyword-suggestion tool, a 5.35% rise in click volume for adopted keywords, and a 10.89% boost in spend for the smart-buy tool compared with the previous GraphSAGE-based model.
Conclusion
HetMatch leverages a massive heterogeneous graph of ads, items, and queries to enhance keyword recall. Future work includes scaling to even larger graphs with richer node types, and integrating transformer-based language models with GNNs to improve textual modeling.
Alimama Tech
Official Alimama tech channel, showcasing all of Alimama's technical innovations.