Tag

autoscaling


Linux Ops Smart Journey
Oct 11, 2024 · Cloud Native

Master Kubernetes HPA: Auto-Scale Pods Efficiently with Real-World Examples

This guide explains what the Kubernetes Horizontal Pod Autoscaler (HPA) is, how it works, and what its key features are, then walks through step‑by‑step configuration, verification, and scaling policy details with practical code examples for cloud‑native applications.

DevOps · HPA · K8s
0 likes · 10 min read
Efficient Ops
Oct 9, 2024 · Cloud Computing

How One Engineer Runs a Full SaaS on Kubernetes with Minimal Effort

This article details how a solo engineer built and operated a SaaS platform on AWS using Kubernetes, covering infrastructure overview, automatic DNS, TLS, load balancing, CI/CD rollouts, autoscaling, caching, secret management, monitoring, logging, error tracking, and cost‑effective operations.

AWS · CI/CD · Infrastructure as Code
0 likes · 21 min read
Alibaba Cloud Infrastructure
Sep 5, 2024 · Artificial Intelligence

Deploying NVIDIA NIM on Alibaba Cloud ACK with Cloud‑Native AI Suite: A Step‑by‑Step Guide

This guide explains how to quickly build a high‑performance, observable, and elastically scalable LLM inference service by deploying NVIDIA NIM on an Alibaba Cloud ACK cluster using the Cloud‑Native AI Suite, KServe, Prometheus, Grafana, and custom autoscaling based on request‑queue metrics.

Alibaba Cloud ACK · Grafana · KServe
0 likes · 15 min read
Alibaba Cloud Infrastructure
May 31, 2024 · Cloud Native

Best Practices for Deploying AI Model Inference on Knative

This guide explains how to efficiently deploy AI model inference services on Knative by externalizing model data, using Fluid for accelerated loading, configuring secrets, ImageCache, graceful shutdown, probes, autoscaling parameters, mixed ECS/ECI resources, shared GPU scheduling, and observability features to achieve fast scaling, low cost, and high elasticity.

AI Model Inference · Best Practices · GPU
0 likes · 19 min read
DeWu Technology
Dec 27, 2023 · Cloud Native

DeWu's Cloud-Native Container Management Practices

Since August 2021, DeWu App has built a cloud‑native, multi‑cluster Kubernetes platform that uses an OAM‑style CloneSet model, Helm‑generated resources, Karmada‑based federation, custom scheduler plugins for reservation and node‑balance, offline mixing for Flink, a unified KubeAutoScaler, and a self‑built KubeAI stack, achieving significant cost cuts and improved stability while planning further middleware containerization and multi‑cloud expansion.

AI · Cost Management · autoscaling
0 likes · 22 min read
DevOps Cloud Academy
Aug 29, 2023 · Cloud Native

Achieving Zero‑Downtime Applications with Kubernetes

This article explains why and how to use Kubernetes features such as multiple pod replicas, PodDisruptionBudgets, deployment strategies, health probes, graceful termination, anti‑affinity, resource limits, and autoscaling to build zero‑downtime, highly available applications.

Deployment Strategies · Health Probes · Pod Disruption Budget
0 likes · 12 min read
AntTech
Jul 14, 2023 · Cloud Native

KapacityStack: Open‑Source Cloud‑Native Intelligent Capacity Management and IHPA

KapacityStack is an open‑source, cloud‑native capacity platform from Ant Group that introduces the Intelligent Horizontal Pod Autoscaler (IHPA) to provide predictive, multi‑level, and stable autoscaling, reducing resource waste, carbon emissions, and operational costs while supporting extensible, modular integration with Kubernetes workloads.

autoscaling · capacity-management · cloud-native
0 likes · 11 min read
Java Architect Essentials
Jun 13, 2023 · Cloud Native

Zero‑Downtime Deployment with K8s and SpringBoot: Health Checks, Rolling Updates, Graceful Shutdown, Autoscaling, Prometheus Integration, and Config Separation

This article demonstrates how to achieve zero‑downtime releases for SpringBoot applications on Kubernetes by configuring readiness/liveness probes, rolling update strategies, graceful shutdown hooks, horizontal pod autoscaling, Prometheus monitoring, and externalized configuration via ConfigMaps.

ConfigMap · HealthCheck · RollingUpdate
0 likes · 13 min read
Cloud Native Technology Community
Feb 7, 2023 · Cloud Native

Machine Learning‑Based Optimization of Kubernetes Resources

This article explains how machine learning can be applied to automatically optimize CPU and memory settings in Kubernetes clusters, covering both experiment‑driven and observation‑driven approaches, step‑by‑step procedures, best‑practice recommendations, and the benefits of combining both methods for efficient, scalable cloud‑native operations.

Performance · autoscaling · cloud-native
0 likes · 11 min read
HelloTech
Dec 23, 2022 · Cloud Native

Design Principles and Implementation Details of Kubernetes Horizontal Pod Autoscaler and Custom Water Pod Autoscaler

The article explains Kubernetes’ built‑in Horizontal Pod Autoscaler, then details the custom Water Pod Autoscaler (WPA) that extends HPA with dual‑signal (load and SOA registration) detection, dual‑threshold scaling, noise filtering, configurable cooldown, frequency limits, tolerance buffers, and integrated alerting for reliable elastic scaling.

HPA · Scaling Algorithms · WPA
0 likes · 13 min read
Tencent Cloud Developer
Nov 24, 2022 · Cloud Native

Large‑Scale Cost Optimization for Kubernetes/TKE: Data Collection, Measures, and Implementation

The article details a Tencent‑led, end‑to‑end cost‑optimization project for large‑scale Kubernetes/TKE clusters that collected extensive workload metrics, applied VPA/HPA enhancements, custom scheduling and node‑downscaling via the open‑source Crane platform, ultimately delivering up to 70% CPU and 50% memory savings with zero‑fault deployments.

Cost Optimization · HPA · Kubernetes
0 likes · 29 min read
AntTech
Nov 10, 2022 · Cloud Computing

DeepScaling: An Automated Capacity Evaluation System for Stable CPU Utilization in Large‑Scale Cloud Services

DeepScaling is a deep‑learning‑driven autoscaling framework that predicts workload, estimates CPU usage, and makes reinforcement‑learning‑based scaling decisions to keep microservice CPU utilization at a target level, thereby reducing resource waste while meeting SLOs in large‑scale cloud environments.

Resource Management · autoscaling · cloud computing
0 likes · 21 min read
Efficient Ops
Nov 2, 2022 · Cloud Native

Why Your HPA Isn’t Scaling: 3 Common Misconceptions and How to Fix Them

This article explains three frequent misunderstandings about Kubernetes Horizontal Pod Autoscaler—dead zones, misuse of utilization calculations, and perceived lag in scaling—while detailing HPA’s inner workings, metric sources, calculation methods, and behavior configuration to help you avoid scaling pitfalls.

HPA · Performance · autoscaling
0 likes · 12 min read
Practical DevOps Architecture
Sep 20, 2022 · Cloud Native

Kubernetes Advantages, Use Cases, Features, Drawbacks, and Core Concepts

This article outlines Kubernetes' main advantages such as container orchestration, lightweight design, open‑source nature, elastic scaling and load balancing, describes typical deployment scenarios, highlights its portability, extensibility and automation, lists current drawbacks, and explains fundamental components like master, node, pod, labels, controllers, services, volumes, and namespaces.

Container Orchestration · Deployment · autoscaling
0 likes · 5 min read
Xingsheng Youxuan Technology Community
Aug 18, 2022 · Cloud Native

Unlocking 800% Node Overselling: Xingdou Cloud’s Smart Resource Strategies

This article details how Xingdou Cloud leverages cloud‑native techniques such as massive node overselling, custom HPA (SophonHPA), priority‑based QoS, intelligent cleanup, and quota management to achieve dramatic cost reduction and efficiency gains across its multi‑cloud platform.

Cost Optimization · Resource Management · autoscaling
0 likes · 18 min read
AntTech
Jun 22, 2022 · Cloud Computing

Meta Reinforcement Learning Framework for Predictive Autoscaling in Cloud Environments

This article presents a cloud-native, end‑to‑end autoscaling solution that integrates traffic forecasting, CPU utilization meta‑prediction, and a reinforcement‑learning‑based scaling decision module into a fully differentiable system, achieving higher resource utilization and cost efficiency as demonstrated by ACM SIGKDD 2022 research.

autoscaling · capacity-management · cloud computing
0 likes · 10 min read
DataFunTalk
May 21, 2022 · Big Data

Exploring and Implementing Elastic Scheduling for Xiaomi Hadoop YARN

This talk presents Xiaomi's design and deployment of an elastic scheduling system for Hadoop YARN, covering background analysis, resource‑pool strategy, auto‑scaling architecture, stability challenges, label‑based resource isolation, Spark shuffle handling, cost‑saving results and future plans.

Big Data · Hadoop · Resource Management
0 likes · 16 min read
HomeTech
Mar 16, 2022 · Cloud Native

Understanding Kubernetes Horizontal Pod Autoscaler (HPA): Mechanism, Core Source Code, and Practical Insights

This article explains how Kubernetes Horizontal Pod Autoscaler (HPA) balances resource demand and workload by automatically scaling pod replicas, describes the different metric types it supports, walks through the core controller code (Run, worker, reconcile, and replica calculation), highlights current limitations, and shares practical observations from real‑world usage.

Go · Horizontal Pod Autoscaler · autoscaling
0 likes · 11 min read
Ctrip Technology
Dec 30, 2021 · Cloud Computing

Ctrip’s Practice of Using AWS Spot Instances for Cost Reduction and High Availability

This article details Ctrip’s large‑scale use of AWS Spot instances on Kubernetes, explaining the cost benefits, the challenges of spot interruptions, and the architectural and operational strategies—including multi‑AZ deployment, scheduling policies, autoscaling group design, and observability—that enable a 50% reduction in container costs while maintaining system stability and reliability.

AWS Spot · Cost Optimization · High Availability
0 likes · 13 min read