Baidu Commercial Advertising Platform Multi-Agent Architecture: From Theory to Practice
Baidu's Commercial Advertising Platform employs a multi-agent architecture that combines large- and small-model collaboration, SOP-driven task decomposition, and long-term memory to transform natural-language queries into precise, personalized advertising actions. Reported results include 98.5% multi-slot instruction parsing accuracy and a drop in single-anomaly localization time from over 30 minutes to under one minute.
In the AI Native era, advertising and marketing platforms have been fundamentally transformed. This article by Baidu's Commercial Advertising Platform R&D team explores how AI agents serve as the primary vehicle for delivering commercial value to customers, with generative AI as the core technology that lets users "speak freely, use simply, and get everything done."
Core Capabilities of AI Agents: The agent system has four essential capabilities: 1) understanding natural-language queries and accurately extracting slot information; 2) planning actively with large models that draw on long-term memory and domain knowledge; 3) executing reliably across numerous integrated business systems; 4) responding in personalized natural language.
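The four capabilities can be read as stages of one pipeline: parse, plan, execute, respond. The sketch below is a minimal illustration of that flow, not Baidu's implementation. Every name in it (`parse_query`, `plan`, `execute`, `respond`, the `update_budget` intent, the `budget` and `campaign` slots) is hypothetical, and simple regular expressions stand in for the model-based parser the article describes.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedQuery:
    intent: str
    slots: dict = field(default_factory=dict)


def parse_query(text: str) -> ParsedQuery:
    """Stage 1 (hypothetical): extract slots. Regexes stand in for a model."""
    slots = {}
    budget = re.search(r"budget (?:to|of) (\d+)", text)
    if budget:
        slots["budget"] = int(budget.group(1))
    campaign = re.search(r"for campaign (\w+)", text)
    if campaign:
        slots["campaign"] = campaign.group(1)
    intent = "update_budget" if "budget" in slots else "unknown"
    return ParsedQuery(intent, slots)


def plan(parsed: ParsedQuery, memory: dict) -> list:
    """Stage 2: build ordered steps, filling missing slots from memory."""
    slots = {**memory, **parsed.slots}  # slots stated in the query win
    if parsed.intent == "update_budget" and "campaign" in slots:
        return [("set_budget", slots)]
    return []


def execute(steps: list) -> list:
    """Stage 3: pretend to call business systems and return structured results."""
    return [{"action": name, "status": "ok", **args} for name, args in steps]


def respond(results: list) -> str:
    """Stage 4: render structured results back into natural language."""
    if not results:
        return "Sorry, I could not work out what to do."
    r = results[0]
    return f"Set the budget of campaign {r['campaign']} to {r['budget']}."


# The user never names the campaign; it is recalled from (toy) long-term memory.
memory = {"campaign": "spring_sale"}
steps = plan(parse_query("Set the budget to 500"), memory)
print(respond(execute(steps)))  # Set the budget of campaign spring_sale to 500.
```

Keeping the stages as separate functions makes each one replaceable: the regex parser could be swapped for a model call without touching planning or execution.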
Technical Challenges: The implementation faces four major challenges: LLM hallucinations causing inconsistent results; low multi-step reasoning accuracy with high latency; difficulty in understanding complex business system APIs (5,000+ APIs, 360+ data tables, 5,000+ fields); and translating structured business outputs back to natural language.
Multi-Agent Architecture: Baidu's approach rests on three core technologies: 1) large-small model collaboration, in which queries leverage long-term memory for efficiency; 2) domain SOP (standard operating procedure)-based multi-agent collaboration, which solves complex problems through task decomposition; 3) long-term memory and self-learning strategies for continuous improvement. The five-layer architecture comprises: Application Layer (SOP-assembled Vertical Agents), Agent Layer (Framework infrastructure + Vertical Agents + Multi-Agent collaboration), Model Layer (large/small models + tools), Memory Layer (BaikalDB for vector and long-term storage), and Data Tools Layer (evaluation, testing, annotation tools).
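One plausible reading of large-small model collaboration is a router: when long-term memory already holds a plan for a query, a cheap small model reuses it; otherwise the large model plans from scratch and the result is remembered. The sketch below illustrates only that idea. The class `LongTermMemory`, the function `route`, and the two stub models are assumptions made for illustration, and an in-process dictionary with exact matching replaces the BaikalDB vector storage named in the article.

```python
from typing import Callable, Dict, Optional, Tuple


class LongTermMemory:
    """Toy memory keyed by normalized query text (exact match, no vectors)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Lowercase and collapse whitespace so trivial variants share a key.
        return " ".join(query.lower().split())

    def recall(self, query: str) -> Optional[str]:
        return self._store.get(self._key(query))

    def remember(self, query: str, plan: str) -> None:
        self._store[self._key(query)] = plan


def route(
    query: str,
    memory: LongTermMemory,
    small_model: Callable[[str, str], str],
    large_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (which model answered, resulting plan)."""
    remembered = memory.recall(query)
    if remembered is not None:
        # Memory hit: the small model only adapts a known plan.
        return "small", small_model(query, remembered)
    # Memory miss: the large model plans, and the plan is stored for next time.
    plan = large_model(query)
    memory.remember(query, plan)
    return "large", plan


# Stub models; real ones would be model calls.
def stub_large(query: str) -> str:
    return f"plan for: {query.lower()}"


def stub_small(query: str, remembered: str) -> str:
    return remembered


mem = LongTermMemory()
print(route("Show CTR by campaign", mem, stub_small, stub_large)[0])  # large
print(route("show ctr by campaign", mem, stub_small, stub_large)[0])  # small
```

A production system would match on semantic similarity rather than exact text, which is presumably why the article places vector storage in the Memory Layer.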
Application Cases: The Light Vessel GBI Agent lets users analyze advertising data in natural language, including complex calculations. JarvisBot automates fault diagnosis through multi-agent collaboration, reducing the time to localize a single anomaly from 30+ minutes to under 1 minute.
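The article does not describe JarvisBot's internals, so the following is only a guess at how SOP-driven task decomposition could apply to fault diagnosis: an SOP is an ordered list of checks, each check is owned by one agent, and a coordinator walks the list until an agent reports an anomaly. All step names, metrics, and thresholds here are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Finding:
    step: str
    anomalous: bool
    detail: str


Agent = Callable[[dict], Finding]


def traffic_agent(metrics: dict) -> Finding:
    drop = metrics.get("impressions_drop_pct", 0)
    return Finding("check_traffic", drop > 30, f"impressions down {drop}%")


def budget_agent(metrics: dict) -> Finding:
    left = metrics.get("budget_left", 1)
    return Finding("check_budget", left <= 0, f"budget left: {left}")


def bidding_agent(metrics: dict) -> Finding:
    bid = metrics.get("bid", 1.0)
    floor = metrics.get("bid_floor", 0.0)
    return Finding("check_bidding", bid < floor, f"bid {bid} vs floor {floor}")


# Hypothetical SOP: the order encodes which causes to rule out first.
SOP: List[str] = ["check_traffic", "check_budget", "check_bidding"]
AGENTS: Dict[str, Agent] = {
    "check_traffic": traffic_agent,
    "check_budget": budget_agent,
    "check_bidding": bidding_agent,
}


def run_sop(sop: List[str], agents: Dict[str, Agent], metrics: dict) -> Optional[Finding]:
    """Run each SOP step in order and return the first anomaly, if any."""
    for step in sop:
        finding = agents[step](metrics)
        if finding.anomalous:
            return finding
    return None


result = run_sop(SOP, AGENTS, {"impressions_drop_pct": 5, "budget_left": 0})
print(result.step, "-", result.detail)  # check_budget - budget left: 0
```

Because the SOP is data rather than code, domain experts could reorder or extend the checks without changing the coordinator.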
Performance Results: Multi-slot instruction parsing accuracy reached 98.5%, with a response time of 1.5s (95th percentile: 3.3s). LUI recognition accuracy improved from 85% to 96%, and regression environment setup time fell from 7 person-days to under 1 hour.
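The gap between the typical response time and the 95th percentile is easier to see with a small calculation. The sketch below uses the nearest-rank definition of a percentile on synthetic latencies; the samples are made up to mirror the shape of the reported figures and are not Baidu's measurements. The source does not say whether 1.5s is a mean or a median, so the median is shown as one possibility.

```python
import math
import statistics


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in seconds: mostly fast, with one slow outlier.
latencies = [1.1, 1.2, 1.3, 1.4, 1.5, 1.5, 1.6, 1.7, 2.0, 3.3]

print(statistics.median(latencies))  # 1.5
print(percentile(latencies, 95))     # 3.3
```

A tail more than twice the typical latency, as in this toy data, usually points to a minority of slow requests rather than uniformly slow service.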
Baidu Geek Talk