
Dynamic Compute Allocation for Alibaba Advertising Engine: Design, Implementation, and Experiments

The article describes a system for Alimama's display advertising engine, motivated by green computing, that dynamically reallocates CPU cores among services. It relies on offline analysis of response time (RT), concurrency, and RPM data, and reports RPM gains of up to 3% while reducing operational cost.

Alimama Tech

Oct 12, 2022

This article presents a systematic study of dynamic compute allocation in Alimama's display advertising engine. Motivated by green computing, the goal is to improve both efficiency and business revenue by distributing CPU cores among services more intelligently.

New Exploration: Starting from an allocation based on a fixed response time (RT), the authors investigate how to reallocate machine resources among services while keeping total resources constant, with the aim of increasing RPM (revenue per thousand impressions) and reducing operational cost.

Core Idea: For a given level (service tier), collect RPM data under different machine-resource combinations, select the combination with the highest RPM, and use it as the target allocation. Because running every combination online is costly, the approach infers the relationship between machine resources and RPM from offline logs (RT, concurrency, level).
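The selection step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Allocation` type, the `best_allocation` function, the service names, and all core counts and RPM values are hypothetical and chosen only to show the idea of picking the highest-RPM combination under a fixed total.

```python
from dataclasses import dataclass


@dataclass
class Allocation:
    """One machine-resource combination and the RPM observed for it (hypothetical type)."""
    cores: dict   # service name -> CPU cores
    rpm: float    # RPM associated with this combination


def best_allocation(candidates, total_cores):
    """Return the candidate with the highest RPM whose cores sum to total_cores."""
    feasible = [c for c in candidates if sum(c.cores.values()) == total_cores]
    if not feasible:
        raise ValueError("no candidate matches the fixed resource total")
    return max(feasible, key=lambda c: c.rpm)


# Illustrative numbers only; they are not taken from the article.
CANDIDATES = [
    Allocation({"App1": 600, "App2": 400}, 100.0),
    Allocation({"App1": 500, "App2": 500}, 101.2),
    Allocation({"App1": 400, "App2": 600}, 100.5),
    Allocation({"App1": 700, "App2": 400}, 102.0),  # exceeds the budget, so it is ignored
]

best = best_allocation(CANDIDATES, total_cores=1000)
print(best.cores, best.rpm)
```

Keeping the total fixed is what makes this a reallocation rather than a scale-up: the last candidate has the highest RPM but is excluded because it uses more cores than are available.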

Key Challenges:
1. Obtaining the resource-to-RPM relationship for each service without costly online scaling experiments.
2. Handling data sparsity when multiple services are combined, since the joint level space grows exponentially with the number of services.

Solutions:
- Approximate the mapping between CPU cores and level using observed RT and concurrency.
- Estimate multi-service RPM by combining per-service level-RPM data with conditional probability formulas, then correct the estimate with a limited number of joint measurements.
- Build an offline data pipeline that generates resource-level-RPM tables.
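The multi-service estimate can be illustrated with a small sketch. The article does not give its conditional probability formulas, so the code below substitutes a simpler stand-in and should be read as an assumption: it treats the two services as affecting RPM independently (relative changes multiply) and then rescales the result with one global factor derived from the few joint measurements available. All function names and numbers are hypothetical.

```python
from itertools import product
from statistics import mean


def independent_estimate(rpm_a, rpm_b, baseline_rpm):
    """Predict joint RPM for every (level_a, level_b) pair.

    ASSUMPTION (not from the article): the two services affect RPM
    independently, so their relative RPM changes multiply.
    rpm_a / rpm_b map a level to the RPM seen when only that service changes.
    """
    return {
        (level_a, level_b): baseline_rpm * (r_a / baseline_rpm) * (r_b / baseline_rpm)
        for (level_a, r_a), (level_b, r_b) in product(rpm_a.items(), rpm_b.items())
    }


def corrected_estimate(rpm_a, rpm_b, baseline_rpm, joint_measured):
    """Rescale the independent estimate using the sparse joint measurements."""
    estimate = independent_estimate(rpm_a, rpm_b, baseline_rpm)
    # Ratio of measured to predicted RPM at each jointly measured point.
    ratios = [joint_measured[k] / estimate[k] for k in joint_measured if k in estimate]
    factor = mean(ratios) if ratios else 1.0  # no joint data: leave the estimate as is
    return {k: v * factor for k, v in estimate.items()}


# Illustrative numbers only; they are not taken from the article.
RPM_APP1 = {100: 100.0, 80: 99.0}
RPM_APP2 = {100: 100.0, 80: 98.0}
JOINT = {(80, 80): 98.0}  # a single joint measurement

table = corrected_estimate(RPM_APP1, RPM_APP2, 100.0, JOINT)
for levels, rpm in sorted(table.items()):
    print(levels, round(rpm, 2))
```

The point of the sketch is the data requirement: per-service tables grow linearly with the number of levels, while the joint table is filled in by estimation rather than by measuring every combination. A single global correction factor is crude (it also shifts points that were never measured jointly), so a real pipeline would likely need a more local correction.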

Design & Implementation: A framework, ASController, takes the offline tables and operational requirements (for example, a target CPU reduction) as input and computes the optimal machine-resource combination. It queries and updates service quotas through Ops APIs, which supports both manual and fully automated scaling.
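A control loop of this kind might look like the sketch below. The article does not document ASController's interface or the Ops APIs, so everything here is hypothetical: the `OpsClient` protocol with `get_quota` and `set_quota`, the `InMemoryOps` stand-in, the `plan_allocation` and `reconcile` functions, and the table contents. The `dry_run` flag models the manual mode (print the plan for an operator) versus the automated mode (apply it).

```python
from typing import Protocol


class OpsClient(Protocol):
    """Hypothetical quota API; the real Ops APIs are not described in the article."""
    def get_quota(self, service: str) -> int: ...
    def set_quota(self, service: str, cores: int) -> None: ...


class InMemoryOps:
    """Stand-in client so the sketch runs without any real infrastructure."""
    def __init__(self, quotas):
        self.quotas = dict(quotas)

    def get_quota(self, service):
        return self.quotas[service]

    def set_quota(self, service, cores):
        self.quotas[service] = cores


def plan_allocation(table, current, cpu_reduction=0.0):
    """Pick the highest-RPM row of the offline table that fits the reduced budget."""
    budget = sum(current.values()) * (1.0 - cpu_reduction) + 1e-9  # float tolerance
    feasible = [(alloc, rpm) for alloc, rpm in table if sum(alloc.values()) <= budget]
    if not feasible:
        raise ValueError("no allocation in the table satisfies the CPU target")
    return max(feasible, key=lambda row: row[1])[0]


def reconcile(ops, table, services, cpu_reduction=0.0, dry_run=True):
    """Return {service: (old, new)} and apply the changes unless dry_run is set."""
    current = {s: ops.get_quota(s) for s in services}
    target = plan_allocation(table, current, cpu_reduction)  # rows must cover all services
    changes = {s: (current[s], target[s]) for s in services if target[s] != current[s]}
    if not dry_run:
        for service, (_, new_cores) in changes.items():
            ops.set_quota(service, new_cores)
    return changes


# Illustrative table: (allocation, predicted RPM). Numbers are not from the article.
TABLE = [
    ({"ranking": 600, "retrieval": 400}, 100.0),
    ({"ranking": 550, "retrieval": 350}, 99.8),
    ({"ranking": 500, "retrieval": 400}, 99.5),
    ({"ranking": 650, "retrieval": 350}, 100.6),
]

ops = InMemoryOps({"ranking": 600, "retrieval": 400})
plan = reconcile(ops, TABLE, ["ranking", "retrieval"], cpu_reduction=0.1, dry_run=True)
print(plan)
```

With a 10% CPU reduction target, only rows totalling at most 900 cores are eligible, and the loop chooses the one with the smallest predicted RPM loss. Calling `reconcile` again with `dry_run=False` would push the same plan through the client.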

Experiments:
1. Single-service level experiment (App1 at level 50): RPM improved by +0.73%, matching the theoretical prediction.
2. Multi-service level experiment (App1 at level 80, App2 at level 100): RPM improved by +3%, in line with the expected +3%.

Conclusion & Outlook: The proposed method approximates the optimal machine allocation reliably and at low cost, and it has been deployed to core services (ranking, retrieval, policy). Future work will extend it to more services and to finer-grained simulations to further improve stability and efficiency.
