Intelligent Copy Generation for Taobao Push: Design, Implementation, and Evaluation
The 2021 Taobao Push project introduced an AI‑driven copy‑generation platform that combines template extraction with a fine‑tuned UniLM model decoded by diverse beam search. The platform produces varied, high‑quality push messages, cuts manual cost, and delivered a roughly 10 % click‑through lift along with higher material adoption.
This document presents the work carried out in 2021 on the intelligent copy‑generation project for Taobao Push.
Business background: Push notifications are a key activation channel on Taobao. Existing copy is written by hand, which is costly and produces homogeneous messages, and click‑through rates are sub‑optimal as a result.
Problem definition & goals: Build an AI‑driven system that can automatically generate and recommend high‑quality copy, increase diversity, reduce creation cost, and improve Push click‑through.
System architecture (V1.0): The platform exposes a unified TPP service with two generation modes: template‑based generation and keyword‑based generation. The template module extracts attribute placeholders (e.g., ${style}, ${audience}) from historical copy; the keyword module fine‑tunes a large‑scale pretrained model (UniLM) on Push data.
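The template mode can be illustrated with a minimal sketch. It assumes that the attribute values of a historical copy are already known (for example from product metadata), which the source does not specify; the function names `extract_template` and `fill_template` are hypothetical and are not part of the project's code.

```python
from string import Template


def extract_template(copy, attributes):
    """Replace each known attribute value in the copy with a ${name} placeholder."""
    template = copy
    # Replace longer values first so a shorter value cannot break up
    # a longer one that contains it.
    for name, value in sorted(attributes.items(), key=lambda kv: -len(kv[1])):
        if value:
            template = template.replace(value, "${" + name + "}")
    return template


def fill_template(template, attributes):
    """Fill ${name} placeholders; unknown placeholders are left untouched."""
    return Template(template).safe_substitute(attributes)


# Hypothetical historical copy and its attribute values.
historical = "New vintage dresses picked for office workers"
tpl = extract_template(historical, {"style": "vintage", "audience": "office workers"})

# Reuse the extracted template for a different style and audience.
new_copy = fill_template(tpl, {"style": "minimalist", "audience": "students"})
print(tpl)
print(new_copy)
```

Plain string replacement is used here only for illustration; a production extractor would more likely rely on attribute tagging than on exact string matches.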
Key algorithms: UniLM was chosen over BERT and GPT for better generation quality. Training techniques include combining character‑ and word‑level inputs to reduce exposure bias and improve fluency.
Evaluation metrics: Generated copy is scored on relevance (a BLEU‑like measure), fluency (human rating), and novelty (n‑gram distinctness). Experiments show improvements in distinct‑1, distinct‑2, and distinct‑3 across model variants.
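The novelty metric can be sketched as follows, assuming the common definition of distinct‑n as the number of unique n‑grams divided by the total number of n‑grams over all generated copies. The source does not give the exact formula used in the project, and the sample copies below are made up.

```python
def distinct_n(texts, n):
    """Ratio of unique n-grams to total n-grams across tokenized texts."""
    total = 0
    unique = set()
    for tokens in texts:
        # Slide a window of size n over the token list.
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    # An empty corpus (or texts shorter than n) yields 0.0 by convention here.
    return len(unique) / total if total else 0.0


# Hypothetical, already tokenized generated copies.
copies = [
    ["big", "sale", "today"],
    ["big", "sale", "tomorrow"],
]
d1 = distinct_n(copies, 1)  # 4 unique unigrams out of 6
d2 = distinct_n(copies, 2)  # 3 unique bigrams out of 4
print(d1, d2)
```

A higher value means less repetition across the generated set; comparing distinct‑1/2/3 between model variants is what the experiments above report.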
Optimization & iteration: The team implemented Diverse Beam Search with a diversity penalty and de‑duplicated the generated copy with SimHash. Batched beam search cut offline decoding time from 3 s to 2.3 s and made it possible to generate more than 225 copies per request.
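The de‑duplication step can be sketched with a small SimHash implementation. This is an illustration under stated assumptions rather than the project's code: tokens are produced by whitespace splitting (Chinese copy would need a word segmenter or character n‑grams), token hashes come from MD5, and the Hamming threshold of 3 bits is a hypothetical choice.

```python
import hashlib

BITS = 64  # fingerprint width; matches the 8 bytes taken from each digest


def simhash(tokens):
    """64-bit SimHash fingerprint of a token list (all tokens weighted equally)."""
    weights = [0] * BITS
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        # Each token votes +1 or -1 on every bit position.
        for i in range(BITS):
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def deduplicate(copies, max_distance=3):
    """Keep a copy only if it is not within max_distance bits of a kept one."""
    kept, fingerprints = [], []
    for copy in copies:
        fp = simhash(copy.split())
        if all(hamming(fp, seen) > max_distance for seen in fingerprints):
            kept.append(copy)
            fingerprints.append(fp)
    return kept


candidates = ["big sale today", "big sale today", "fresh arrivals for spring picnic season"]
print(deduplicate(candidates))
```

The pairwise comparison is quadratic in the number of kept copies, which is acceptable for a few hundred candidates per request; larger sets would normally bucket fingerprints by bit blocks first.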
V2.0 – Push Copy Layer: The generation module was integrated into the Push recommendation pipeline as a final stage (recall → coarse ranking → fine ranking → copy layer). A large material library was built from multiple sources, both human‑written and model‑generated, and a copy‑ranking model was introduced that predicts click probability from product, copy, and user features.
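How a copy layer might pick among candidate copies can be sketched as follows. The source does not describe the ranking model's architecture, so a plain logistic scorer stands in for it here; the feature layout, weights, and function names are all hypothetical.

```python
import math


def click_probability(features, weights, bias=0.0):
    """Logistic model: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def rank_copies(candidates, product_features, user_features, weights, bias=0.0):
    """Score each (text, copy_features) pair and sort by predicted click probability."""
    scored = []
    for text, copy_features in candidates:
        # Concatenate product, copy, and user features into one vector.
        features = product_features + copy_features + user_features
        scored.append((click_probability(features, weights, bias), text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical feature vectors: 1 product feature, 2 copy features, 1 user feature.
weights = [0.5, 1.0, -1.0, 0.2]
candidates = [
    ("copy A", [1.0, 0.0]),
    ("copy B", [0.0, 1.0]),
]
ranked = rank_copies(candidates, product_features=[1.0], user_features=[1.0], weights=weights)
print(ranked[0][1])
```

The copy with the highest predicted probability would be attached to the item that survived fine ranking; in practice the scorer would be a trained model rather than hand‑set weights.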
Results: Online A/B tests show a 10.16 % lift in Push click‑through, a 17 % increase in average copy count per material, and a rise in adoption rate from 85 % to over 90 %.
Conclusion: The intelligent copy system successfully addresses copy diversity and cost issues, establishes a unified copy repository, and improves Push performance. Future work includes online reinforcement learning, real‑time sensitive‑word monitoring, and better decoding control.
DaTaobao Tech
Official account of DaTaobao Technology