Tag

auto scaling

2 views collected around this technical thread.

Baidu Tech Salon
Jun 17, 2025 · Operations

How Baidu Scaled Its Vertical Search: Elastic Scheduling and Data Management Secrets

This article explains how Baidu's vertical search platform tackled massive data growth and scaling challenges by redesigning its data management system, introducing elastic scheduling, decoupling ETCD access, implementing auto‑scaling, and advancing shard expansion to improve performance, stability, and cost efficiency.

ETCD · Sharding · auto scaling
18 min read
Qunar Tech Salon
Mar 27, 2025 · Operations

Automated Capacity Planning and Auto‑Scaling for Hotel Services During Traffic Peaks

This document describes a comprehensive capacity‑planning solution that predicts traffic‑peak impacts for hotel services, automatically estimates required CPU resources, creates timed scaling tasks, and evaluates performance using detailed metrics, thereby improving operational efficiency and reducing manual effort during events such as exam‑ticket printing and holiday travel surges.

Cloud Computing · algorithm · auto scaling
12 min read
Alibaba Cloud Infrastructure
Feb 10, 2025 · Artificial Intelligence

Hybrid Cloud Elastic LLM Inference Solution with ACK Edge and KServe

This article presents a hybrid‑cloud solution that uses ACK Edge and KServe to dynamically allocate on‑premise and cloud GPU resources for large‑language‑model inference, addressing tidal traffic patterns, reducing costs, and ensuring high availability through elastic scaling and custom scheduling policies.

ACK Edge · Elastic Inference · KServe
13 min read
IT Architects Alliance
Jan 7, 2025 · Cloud Computing

Elastic Architecture: Auto Scaling and Failover for Resilient Systems

The article explains how elastic architecture, through auto‑scaling and failover mechanisms, dynamically adjusts resources and ensures continuous service during traffic spikes and component failures, improving cost efficiency, reliability, and operational stability for modern cloud‑based applications.

Cloud Computing · Elastic Architecture · Failover
16 min read
HelloTech
Aug 1, 2023 · Cloud Native

Elastic Scaling Practices in Cloud‑Native Kubernetes Environments

To overcome native HPA limits and business‑specific constraints in a fully containerized, cloud‑native Kubernetes environment, we implemented a dual‑threshold water‑level and scheduled scaling engine, hybrid‑cloud ClusterAutoScale, mixed‑deployment resource prioritization, and comprehensive Prometheus‑based observability, achieving higher utilization, lower costs, and a roadmap toward deeper optimization and AIOps.

Kubernetes · auto scaling · cloud native
10 min read
DaTaobao Tech
Jul 5, 2023 · Cloud Native

Cloud‑Native Multi‑Tenant Architecture and Network Isolation in Taobao Open Platform

The Taobao Open Platform adopts a cloud‑native, multi‑tenant architecture that abstracts infrastructure, isolates tenants via independent or shared switch‑plus‑security‑group schemes with dual ENI pod networking, and leverages Kubernetes auto‑scaling to simplify onboarding, cut operational costs, and enable future low‑code and FaaS extensions.

Kubernetes · auto scaling · cloud native
14 min read
ByteDance Cloud Native
Jun 1, 2023 · Cloud Native

How to Deploy and Scale ByConity’s Cloud‑Native Data Warehouse on Kubernetes

ByConity is a cloud‑native, storage‑compute separated data warehouse engine that supports multi‑tenant isolation, high performance, and elastic scaling; this guide explains its three‑layer architecture, hardware requirements, Helm‑based Kubernetes deployment, dynamic scaling, and practical SQL testing steps.

ByConity · Helm · Kubernetes
11 min read
Efficient Ops
Mar 21, 2023 · Operations

How Hupu Scaled to Millions: Inside the Flex Auto‑Scaling Platform

This article details Hupu's massive sports‑traffic environment, the design and implementation of the Flex auto‑scaling platform, its architecture, core functions such as resource statistics, node and pod scaling, scenario scheduling, and the performance optimizations that enable rapid, cost‑effective scaling across multi‑cloud Kubernetes clusters.

Cloud Native Operations · Kubernetes · Multi-Cloud
15 min read
Alimama Tech
Nov 2, 2022 · Artificial Intelligence

Optimizing GPU Utilization for Multimedia AI Services with high_service

The article presents high_service, a high‑performance inference framework that boosts GPU utilization in multimedia AI services by separating CPU‑heavy preprocessing from GPU inference, employing priority‑based auto‑scaling, multi‑tenant sharing, and TensorRT‑accelerated models to eliminate GIL bottlenecks, reduce waste, and adapt to fluctuating traffic, with future work targeting automated bottleneck detection and further CPU‑GPU offloading.

GPU utilization · High Performance Computing · TensorRT
19 min read
Tencent Cloud Developer
Sep 29, 2022 · Cloud Native

Improving Kubernetes Cluster Utilization: Practices and Optimization Strategies

The session detailed how Tencent’s container experts boost Kubernetes cluster utilization by correcting pod resource requests, employing two‑level auto‑scaling, dynamic over‑commit, adaptive scheduling and eviction, and using HPA/EHPA/VPA, achieving up to 38.7% node usage and roughly 60% cost savings in real‑world workloads.

Cluster Utilization · Kubernetes · Pod Scheduling
11 min read
Tencent Cloud Developer
Jul 26, 2022 · Cloud Native

Understanding Knative: A Cloud-Native Serverless Framework

Knative is a CNCF‑incubated, cloud‑native serverless framework on Kubernetes that combines Build, Eventing, and Serving components—featuring a Knative Pod Autoscaler that can scale pods to zero—offering improved resource utilization, rapid traffic response, and developer productivity despite modest performance overhead.

CNCF · Function Computing · Knative
16 min read
360 Smart Cloud
Jul 14, 2022 · Cloud Computing

Auto Scaling (AS) in Cloud Services: Architecture, Use Cases, and Optimization Strategies

This article explains the concept of elastic auto scaling in cloud services, describes typical scenarios such as high‑elastic web apps and compute‑intensive workloads, details the four‑layer architecture and workflow, and outlines functional features, stability improvements, and future optimization directions.

Cloud Computing · High Availability · Load Balancing
12 min read
DataFunSummit
Jul 1, 2022 · Big Data

Exploring and Implementing Elastic Scheduling for Xiaomi Hadoop YARN

Shilong Fei from Xiaomi Data Platform presents an in‑depth exploration of elastic scheduling for Hadoop YARN, covering background, design of resource pools, auto‑scaling architecture, challenges such as job stability and user transparency, achieved cost reductions, and future plans for further optimization.

Hadoop · YARN · auto scaling
20 min read
Shopee Tech Team
May 26, 2022 · Cloud Computing

Shopee's Green Computing Practices: Optimizing Resource Utilization in Data Centers

Shopee reduces data‑center carbon emissions by over 40,000 tons annually through three 2021 green‑computing technologies—Overcommit resource oversubscription, mixed‑model Colocation of latency‑sensitive and batch workloads, and enhanced Auto Scaling that leverages global metrics to cut machine usage and improve resource efficiency.

Cloud Computing · Green computing · Kubernetes
15 min read
HomeTech
Dec 7, 2021 · Big Data

Flink Task Auto-scaling Design and Implementation

This article presents the design and implementation of Flink task auto‑scaling, covering background, manual and automatic scaling mechanisms, architecture with RescaleCoordinator, persistence via Zookeeper and HDFS, scaling policies for parallelism, CPU and memory, and future plans for fine‑grained and time‑based resource adjustments.

HDFS · Zookeeper · auto scaling
4 min read
Liulishuo Tech Team
Oct 29, 2021 · Cloud Computing

Automating Cloud Infrastructure at Liulishuo: Deployment, Management, and Governance Practices

The article describes Liulishuo's Cloud Infra team's end‑to‑end automation of cloud resource provisioning, scaling, and cost governance using Terraform, a custom Luban platform, GitLab CI/CD, and chat‑bot integrations, highlighting the architectural design, implementation steps, and measurable benefits for both operations and business teams.

Infrastructure as Code · Terraform · auto scaling
10 min read
Tencent Cloud Developer
Jul 7, 2021 · Cloud Native

Design and Practice of Tencent Cloud Native Database TDSQL-C Serverless Architecture

TDSQL‑C Serverless separates compute and storage, delivers instant elastic scaling for MySQL and PostgreSQL, charges per‑second usage, pauses and stops billing when idle, and supports low‑frequency, archival, development, and micro‑service workloads with a ~2‑second cold‑start.

Serverless · TDSQL-C · Usage Billing
13 min read
IT Architects Alliance
Jun 10, 2021 · Cloud Native

Designing High‑Availability Stateless Services: Load Balancing, Scaling, and Deployment Strategies

This article explains how to achieve high availability for stateless services by employing redundancy, vertical and horizontal scaling, various load‑balancing algorithms (random, round‑robin, weighted, least‑connections, source‑hash), and automatic scaling techniques in cloud‑native environments, while also covering performance monitoring and CDN/OSS usage.

High Availability · Load Balancing · auto scaling
10 min read
Efficient Ops
Apr 20, 2021 · Operations

How Dada’s Intelligent Elastic Scaling Cuts Costs and Boosts Delivery Performance

This article details Dada Group’s implementation of an intelligent elastic scaling architecture that automatically adjusts capacity during peak promotions and low‑traffic periods, improving delivery reliability, reducing costs, and supporting multi‑cloud and multi‑runtime environments through sophisticated monitoring and auto‑scaler mechanisms.

Monitoring · auto scaling · capacity-management
17 min read