How Baidu Feed Achieved Serverless Scaling with Multi‑Dimensional Service Profiles
This article explains how Baidu's Feed recommendation backend adopted a serverless approach. By building elastic, traffic, and capacity profiles for each micro‑service, the team enabled predictive, load‑feedback, and timed scaling, reducing resource waste and operational costs in a cloud‑native environment.
Background
In Baidu's cloud‑native environment, the Feed recommendation service consists of many compute‑heavy micro‑services that run 24/7 at fixed capacity, leading to resource waste whenever traffic fluctuates.
Goal
Build multi‑dimensional, personalized service profiles (elastic, traffic, capacity) and use them to drive automatic elastic scaling and reduce cost.
Elastic Profile Construction
Services are classified into high, medium and low elasticity based on instance deployment time, resource quota, statefulness and external dependencies.
High elasticity: stateless, fast to scale.
Medium elasticity: some state, moderate scaling cost.
Low elasticity: stateful, costly to scale.
Improvements include migrating services to standard containers and separating storage from compute to raise their elasticity tier.
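The tiering above can be sketched as a simple rule‑based classifier. The `Service` fields mirror the criteria the article names; the thresholds and example services are illustrative assumptions, not Baidu's actual values.

```python
from dataclasses import dataclass

@dataclass
class Service:
    name: str
    startup_seconds: int   # instance deployment time
    cpu_quota: int         # resource quota (cores)
    stateful: bool
    external_deps: int     # number of external dependencies

def elasticity_tier(svc: Service) -> str:
    """Classify a service into a high/medium/low elasticity tier.

    Thresholds here are illustrative placeholders.
    """
    if not svc.stateful and svc.startup_seconds <= 60:
        return "high"
    if svc.stateful and (svc.startup_seconds > 300 or svc.external_deps > 2):
        return "low"
    return "medium"

# Hypothetical services for illustration.
recall = Service("recall", startup_seconds=30, cpu_quota=8, stateful=False, external_deps=0)
ranker = Service("ranker", startup_seconds=600, cpu_quota=32, stateful=True, external_deps=3)
print(elasticity_tier(recall))  # high
print(elasticity_tier(ranker))  # low
```

In practice such a classifier would read deployment metadata rather than hand‑written records, but the decision structure is the same.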
Traffic Profile
Traffic is modeled using CPU usage as a proxy for QPS and divided into configurable time slices (e.g., hourly). Historical CPU data are smoothed, and the top K windows per slice are used for prediction.
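A minimal sketch of this prediction step, assuming a moving average for smoothing and the mean of the top‑K smoothed windows as the slice's predicted peak (the smoothing method, window size, and K are assumptions):

```python
import statistics

def predict_slice_load(samples: list[float], k: int = 3, window: int = 5) -> float:
    """Predict a time slice's peak CPU load from its historical samples.

    Smooth with a moving average, then average the top-K windows.
    """
    if len(samples) < window:
        return max(samples)
    smoothed = [
        statistics.mean(samples[i:i + window])
        for i in range(len(samples) - window + 1)
    ]
    top_k = sorted(smoothed, reverse=True)[:k]
    return statistics.mean(top_k)

# Hypothetical per-minute CPU% samples for one hourly slice.
hourly_cpu = [40, 42, 80, 41, 43, 44, 90, 45, 46, 44]
print(round(predict_slice_load(hourly_cpu), 1))
```

Smoothing before taking the maxima keeps one‑off spikes from dominating the forecast, while the top‑K average still biases the prediction toward peak rather than mean load.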
Capacity Profile
Peak CPU utilization defines the required CPU buffer; machine‑learning models map QPS and resource usage to latency to determine safe capacity limits.
Elastic Strategies
Three strategies are applied:
Predictive elasticity: forecast traffic for the next slice and pre‑scale.
Load‑feedback elasticity: adjust instances in near‑real‑time based on current load.
Timed elasticity: expand before known peak periods and shrink afterwards.
Priorities: timed > predictive > load‑feedback. Load‑feedback elasticity only expands; shrinking is handled by the timed and predictive strategies.
Stability Assurance
Periodic inspections (elastic, capacity, status) and one‑click interventions ensure service reliability during rapid scaling.
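A minimal sketch of the inspection loop, assuming each dimension (elastic, capacity, status) exposes a boolean health check and that failing checks trigger an intervention hook; all check names and the intervention behavior are hypothetical.

```python
# Hypothetical inspection checks; each returns True when healthy.
CHECKS = {
    "elastic": lambda: True,    # e.g. scaling actions completing in time
    "capacity": lambda: True,   # e.g. CPU buffer above the safe limit
    "status": lambda: True,     # e.g. instance health and error rates
}

def inspect_once() -> list[str]:
    """Run all inspections and return the names of failing checks."""
    return [name for name, check in CHECKS.items() if not check()]

def intervene(failed: list[str]) -> None:
    """One-click intervention placeholder: freeze scaling, page on-call, etc."""
    print(f"intervention triggered for: {failed}")

failed = inspect_once()
if failed:
    intervene(failed)
```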
Serverless has been deployed across Baidu Feed, covering more than 100,000 service instances and significantly lowering operating costs. Future work will focus on hotspot capacity guarantees and ML‑enhanced traffic prediction.