
How AIGC Boosts Ad Creative Quality: Trustworthy Image Generation & Selection

In 2024, the advertising team made major progress on AI-generated ad creatives: it introduced a multimodal reliable feedback network to improve image usability, released a large human-annotated dataset, and used multimodal large language models for richer representations and more effective online and offline creative selection.

JD Cloud Developers

1. Introduction

High-quality ad creatives improve information delivery and raise click-through and conversion rates. In 2023 the ad team used AIGC to increase creative diversity, but low-quality assets limited coverage. In 2024 the team achieved breakthroughs in both creative generation and creative selection, enabling automatic creation of high-quality ads and personalized recommendation.

2. Trustworthy Creative Generation

Generating attractive ad images is costly when done manually. Existing diffusion models often produce unusable images, requiring extensive human review. To address this, the team proposed a Reliable Feedback Network (RFNet) that simulates human auditors by integrating multiple auxiliary modalities.

RFNet evaluates the usability of each generated image and feeds that signal back to the diffusion model, raising the proportion of usable images while preserving visual appeal.

The training loss is computed from a one-hot vector y_d marking the ground-truth class and the model output o_i. Gradients are back-propagated only through the ControlNet part, while the Stable Diffusion weights stay frozen.
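The training rule above can be illustrated with a toy sketch. The article does not give the loss formula, so the cross-entropy between the one-hot label and a softmax over the outputs is an assumption here. The names `params`, `train_step`, and the scalar "weights" are hypothetical stand-ins for the frozen Stable Diffusion weights and the trainable ControlNet branch; this is not RFNet itself.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(y_onehot, probs):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(y * math.log(p) for y, p in zip(y_onehot, probs) if y > 0)

def train_step(params, x, y_onehot, lr=0.5):
    """One gradient step that updates only the 'control' weights.

    Toy model (assumption): logit_k = (frozen_k + control_k) * x.
    The 'frozen' weights play the role of the frozen backbone and are never touched.
    """
    logits = [(f + c) * x for f, c in zip(params["frozen"], params["control"])]
    probs = softmax(logits)
    loss = cross_entropy(y_onehot, probs)
    # d(loss)/d(logit_k) = p_k - y_k, and d(logit_k)/d(control_k) = x
    grads = [(p - y) * x for p, y in zip(probs, y_onehot)]
    params["control"] = [c - lr * g for c, g in zip(params["control"], grads)]
    return loss

# Hypothetical two-class setup: index 0 = "usable", index 1 = "unusable".
params = {"frozen": [0.2, -0.1], "control": [0.0, 0.0]}
losses = [train_step(params, 1.0, [1.0, 0.0]) for _ in range(20)]
```

In a real framework the same effect is usually obtained by disabling gradients on the frozen module and passing only the trainable parameters to the optimizer.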

3. Offline Representation Construction and Integration

Using Multimodal Large Language Model (MLLM) techniques, the team extracts explicit and implicit features from creative images and copy, enriching the representation system and aligning it with the selection model to improve discrimination and cold‑start performance.

Explicit features: NER, background color, faces/brand logos.

Implicit features: promotion status, target user group.
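One way to picture the explicit and implicit features listed above is as a single record per creative. The sketch below is only an illustration: the class names, field names, and the flattening into dotted keys are assumptions, not the team's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import List

@dataclass
class ExplicitFeatures:
    entities: List[str] = field(default_factory=list)  # NER results
    background_color: str = ""
    has_face: bool = False
    has_brand_logo: bool = False

@dataclass
class ImplicitFeatures:
    promotion_status: str = ""
    target_user_group: str = ""

@dataclass
class CreativeRepresentation:
    creative_id: str
    sku_id: str
    explicit: ExplicitFeatures
    implicit: ImplicitFeatures

    def to_flat(self) -> dict:
        """Flatten nested features into one dict with dotted keys."""
        flat = {"creative_id": self.creative_id, "sku_id": self.sku_id}
        for prefix, part in (("explicit", self.explicit), ("implicit", self.implicit)):
            for key, value in asdict(part).items():
                flat[f"{prefix}.{key}"] = value
        return flat

# Hypothetical values, for illustration only.
rep = CreativeRepresentation(
    creative_id="c1",
    sku_id="sku1",
    explicit=ExplicitFeatures(entities=["brand"], background_color="white", has_brand_logo=True),
    implicit=ImplicitFeatures(promotion_status="on_sale", target_user_group="students"),
)
flat = rep.to_flat()
```

A flat dictionary like this is a convenient input shape for a downstream selection model, though the real feature pipeline may look quite different.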

Contrastive learning (MoCo v3) provides multi‑modal representations, using other creatives of the same SKU as negative samples to increase inter‑creative distinction.
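The negative-sampling idea can be sketched with an InfoNCE-style loss, the contrastive objective used in the MoCo family. This is a minimal pure-Python illustration, not the team's training code: the `creatives` dictionary layout, the function names, the temperature value, and the choice of positive (assumed to be a second view of the same creative) are all assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity; assumes both vectors are non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(query, positive, negatives, temperature=0.2):
    """InfoNCE loss: -log softmax of the positive among positive + negatives."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[0]

def same_sku_negatives(creatives, anchor_id):
    """Embeddings of all other creatives that share the anchor's SKU."""
    sku = creatives[anchor_id]["sku"]
    return [c["emb"] for cid, c in creatives.items()
            if cid != anchor_id and c["sku"] == sku]

# Hypothetical embeddings, for illustration only.
creatives = {
    "a": {"sku": "s1", "emb": [1.0, 0.0]},
    "b": {"sku": "s1", "emb": [0.0, 1.0]},
    "c": {"sku": "s2", "emb": [1.0, 1.0]},
}
negatives = same_sku_negatives(creatives, "a")
loss = info_nce(creatives["a"]["emb"], [1.0, 0.0], negatives)
```

Because the negatives come from the same SKU, the loss is small only when the model separates creatives that advertise the same product, which is the distinction the selection stage needs.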

4. Online Selection Architecture Optimization

The online ranking originally ignored interactions among candidate creatives, leading to incomplete information and a mismatch between offline CTR estimation and online list‑wise ranking. The upgraded model introduces candidate‑creative feature ingestion and a list‑wise objective that jointly optimizes exposure and click prediction.
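To make the list-wise idea concrete, the sketch below scores a whole candidate list at once with a ListNet-style softmax cross-entropy. The article does not state the exact objective, and this sketch covers only the click term, not the joint exposure and click formulation; the function names and the sample scores are assumptions.

```python
import math

def log_softmax(scores):
    """Numerically stable log-softmax over the candidate list."""
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum_exp for s in scores]

def listwise_softmax_loss(scores, clicks):
    """Cross-entropy between the normalized click distribution and softmax(scores).

    Returns 0.0 for a list with no clicks, since it carries no ranking signal here.
    """
    total = sum(clicks)
    if total == 0:
        return 0.0
    log_probs = log_softmax(scores)
    return -sum((c / total) * lp for c, lp in zip(clicks, log_probs))

# Hypothetical list of three candidate creatives; the second one was clicked.
clicks = [0, 1, 0]
loss_good = listwise_softmax_loss([0.1, 2.0, 0.3], clicks)  # clicked creative ranked first
loss_bad = listwise_softmax_loss([2.0, 0.1, 0.3], clicks)   # clicked creative ranked last
```

Unlike a point-wise CTR loss, each candidate's probability here depends on the scores of the other candidates in the same list, so the training signal matches the online ranking setting.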

A joint training paradigm splits the problem into an estimation stage followed by a creative-ranking stage, which reduces online serving pressure while handling the combinatorial explosion.

5. Summary and Outlook

The proposed solutions address bad cases in generated ad assets and the matching of massive creatives to users. Future work will focus on better multimodal fusion and personalized creative generation for diverse user groups.

Tags: machine learning, multimodal, image generation, AIGC, ad optimization