CTR-Driven Advertising Image Generation Using Multimodal Large Language Models
The paper presents CAIG, a CTR‑driven advertising image generation pipeline that pre‑trains a multimodal LLM on e‑commerce data, trains a reward model on CTR‑labeled image pairs, and fine‑tunes generation via product‑centric preference optimization, achieving state‑of‑the‑art online and offline performance.