
Dynamic Compute Allocation for Alibaba Advertising Engine: Design, Implementation, and Experiments

This article describes a system, motivated by green computing, for Alimama's display ad engine that dynamically reallocates CPU cores among services. It relies on offline analysis of response-time, concurrency, and RPM data, and reports RPM gains of up to 3% while reducing operational cost.

Alimama Tech

This article presents a systematic study of dynamic compute allocation in Alimama's display advertising engine. Motivated by green computing, the goal is to improve efficiency and business revenue by intelligently distributing CPU cores among services.

New Exploration: Starting from a fixed response-time (RT) allocation, the authors investigate how to reallocate machine resources among services while keeping total resources constant, aiming to increase RPM (revenue per thousand impressions) and reduce operational cost.

Core Idea: For a given level (service tier), collect RPM data under different machine-resource combinations, select the combination with the highest RPM, and use it as the target allocation. The approach relies on offline analysis of logs (RT, concurrency, level) to infer the relationship between machine resources and RPM.
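The selection step can be illustrated with a minimal sketch. It assumes a hypothetical offline table keyed by the cores given to two services (here called App1 and App2); the table layout, the function name `best_allocation`, and all numbers are invented for illustration and are not taken from the article.

```python
from typing import Dict, Tuple

# Hypothetical offline table: (cores for App1, cores for App2) -> observed RPM.
# All values are made up for illustration.
RpmTable = Dict[Tuple[int, int], float]


def best_allocation(table: RpmTable, total_cores: int) -> Tuple[Tuple[int, int], float]:
    """Return the allocation with the highest RPM among those using exactly total_cores."""
    candidates = {alloc: rpm for alloc, rpm in table.items() if sum(alloc) == total_cores}
    if not candidates:
        raise ValueError("no allocation matches the core budget")
    best = max(candidates, key=candidates.get)
    return best, candidates[best]


table = {
    (40, 60): 10.00,
    (50, 50): 10.07,
    (60, 40): 10.02,
    (60, 60): 10.30,  # uses 120 cores, so it is excluded at a budget of 100
}
allocation, rpm = best_allocation(table, total_cores=100)
```

Keeping the total fixed is what makes this a reallocation rather than a scale-up: the row with the highest RPM overall is ignored because it exceeds the budget.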

Key Challenges:
1. Obtaining the resource-to-RPM relationship for each service without costly online scaling experiments.
2. Handling data sparsity when multiple services are combined, as the joint level space grows exponentially.

Solutions:
- Approximate the mapping between CPU cores and level using observed RT and concurrency.
- Estimate multi-service RPM by combining pairwise level-RPM data with conditional probability formulas, then correct the estimate with a limited number of joint measurements.
- Build an offline data pipeline to generate resource-level-RPM tables.
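The article does not give its conditional probability formulas, so the sketch below substitutes a simpler stand-in to show the general shape of the idea: fill a sparse joint table from per-service tables, then correct it with the few joint measurements available. It assumes each service's level scales RPM by an independent factor. That independence assumption, the function name `estimate_joint_rpm`, and all numbers are illustrative and not the authors' method.

```python
from typing import Dict, Tuple


def estimate_joint_rpm(
    base_rpm: float,
    single_a: Dict[int, float],
    single_b: Dict[int, float],
    joint_samples: Dict[Tuple[int, int], float],
) -> Dict[Tuple[int, int], float]:
    """Fill a (level_a, level_b) -> RPM table from per-service level -> RPM tables.

    Assumption (not from the article): the two services affect RPM independently,
    so rpm(a, b) ~= base_rpm * (rpm_a / base_rpm) * (rpm_b / base_rpm).
    """
    def raw(a: int, b: int) -> float:
        return single_a[a] * single_b[b] / base_rpm

    # Average ratio of measured to predicted RPM over the sparse joint samples.
    ratios = [rpm / raw(a, b) for (a, b), rpm in joint_samples.items()]
    correction = sum(ratios) / len(ratios) if ratios else 1.0

    table = {(a, b): raw(a, b) * correction for a in single_a for b in single_b}
    table.update(joint_samples)  # measured values override estimates
    return table


# Made-up per-service tables: level -> RPM with the other service at level 100.
single_a = {100: 100.0, 80: 102.0}
single_b = {100: 100.0, 80: 99.0}

uncorrected = estimate_joint_rpm(100.0, single_a, single_b, {})
corrected = estimate_joint_rpm(100.0, single_a, single_b, {(80, 80): 101.0})
```

With two per-service tables of n levels each, this needs on the order of 2n measurements plus a handful of joint samples instead of n² joint measurements, which is the sparsity problem the text describes.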

Design & Implementation: A framework (ASController) consumes the offline tables and operational requirements (e.g., target CPU reduction) to compute optimal machine-resource combinations. Ops APIs are used to query and update service quotas, enabling both manual and fully automated scaling.
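A rough sketch of what such a controller's planning step might look like follows. The `Row` type, the function names, the service names, and the numbers are hypothetical; the article does not describe ASController's interfaces, and the ops API call itself is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Row:
    """One row of a hypothetical offline resource-level-RPM table."""
    cores: Tuple[int, ...]  # cores per service, in a fixed service order
    rpm: float              # estimated RPM for this combination


def plan_allocation(rows: Sequence[Row], current: Sequence[int],
                    reduction_pct: int = 0) -> Optional[Row]:
    """Pick the highest-RPM row whose total cores fit the reduced budget."""
    budget = sum(current) * (100 - reduction_pct) / 100
    feasible = [r for r in rows if sum(r.cores) <= budget]
    return max(feasible, key=lambda r: r.rpm) if feasible else None


def quota_deltas(services: Sequence[str], current: Sequence[int],
                 target: Row) -> Dict[str, int]:
    """Per-service core change that an ops API call would need to apply."""
    return {s: new - old
            for s, old, new in zip(services, current, target.cores)
            if new != old}


services = ("ranking", "retrieval")
current = (60, 40)
rows = [
    Row((60, 40), 10.00),
    Row((50, 40), 9.98),
    Row((45, 45), 10.01),
    Row((70, 40), 10.20),  # over budget, never selected here
]
target = plan_allocation(rows, current, reduction_pct=10)
deltas = quota_deltas(services, current, target)
```

Separating planning from applying the deltas keeps both modes open: an operator can review `deltas` before a manual change, or an automated loop can pass them straight to the quota-update call.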

Experiments:
1. Single-service level experiment (App1 at level 50) showed a +0.73% RPM improvement, matching theoretical predictions.
2. Multi-service level experiment (App1 80 / App2 100) achieved a +3% RPM gain, close to the expected +3%.

Conclusion & Outlook: The proposed method reliably approximates the optimal machine allocation at low cost. It has been deployed to core services (ranking, retrieval, policy) and will be extended to more services and finer-grained simulations to further improve stability and efficiency.

performance optimization, resource allocation, advertising engine, dynamic compute, machine scaling