Tag

Capacity Planning


Old Zhao – Management Systems Only
May 13, 2025 · Operations

Mastering Supply Chain Planning: Balancing Demand, Capacity, and Inventory with ERP

This article explains why many companies struggle with inaccurate plans, defines supply chain planning as the dynamic coordination of demand, capacity, and inventory, and provides a step‑by‑step ERP‑based framework—including demand forecasting, capacity analysis, inventory control, and execution—to achieve reliable, data‑driven operations.

Capacity Planning · ERP · demand planning
0 likes · 10 min read
Qunar Tech Salon
Mar 27, 2025 · Operations

Automated Capacity Planning and Auto‑Scaling for Hotel Services During Traffic Peaks

This document describes a comprehensive capacity‑planning solution that predicts traffic‑peak impacts for hotel services, automatically estimates required CPU resources, creates timed scaling tasks, and evaluates performance using detailed metrics, thereby improving operational efficiency and reducing manual effort during events such as exam‑ticket printing and holiday travel surges.

Capacity Planning · Resource Management · algorithm
0 likes · 12 min read
Bilibili Tech
Mar 18, 2025 · Operations

Technical Practices for Ensuring Stability of Bilibili’s 2025 Spring Festival Gala Live Stream

Bilibili’s engineering team built a scenario‑metadata and one‑click fault‑drill platform and implemented multi‑tier degradation, dynamic capacity planning, and extensive automated fault‑injection testing to guarantee zero‑severity incidents during the high‑traffic 2025 Spring Festival Gala live stream.

Capacity Planning · Fault Injection · Live Streaming
0 likes · 16 min read
High Availability Architecture
Jan 13, 2025 · Operations

Comprehensive Guide to High‑Availability System Architecture and Practices

This article provides a systematic overview of high‑availability system design, covering availability metrics, fault prevention, detection, recovery, capacity planning, service tiering, data layer resilience, monitoring, and the responsibilities of architects, SREs, and developers to ensure reliable, scalable services.

Capacity Planning · High Availability · System Architecture
0 likes · 30 min read
Efficient Ops
Nov 19, 2024 · Operations

Mastering System Stability: Proven SRE Practices for Reliable, High‑Availability Services

This article explains how system stability depends on architecture and code details, defines SLA and the “nines” metric, outlines Google’s SRE hierarchy, and provides practical governance steps—including development and release processes, high‑availability design, capacity planning, monitoring, incident response, and team culture—to achieve reliable, high‑availability services.

Capacity Planning · High Availability · SRE
0 likes · 34 min read
Bilibili Tech
Oct 25, 2024 · Operations

Bilibili Data Center Migration: Planning, Execution, and Lessons Learned

Bilibili’s 18‑month, multi‑regional data‑center migration moved tens of thousands of servers using a high‑frequency rolling strategy, combining meticulous planning, cross‑team coordination, automated rack placement and rigorous checklists to achieve significant cost savings, higher utilization, improved stability and greener operations.

Automation · Capacity Planning · Infrastructure
0 likes · 21 min read
Efficient Ops
Jul 31, 2024 · Operations

How HuoLala Achieved Zero‑Fault Peaks: A Blueprint for High‑Load System Reliability

This article details HuoLala's three‑year journey of systematic business‑peak assurance, covering goal definition, project‑management practices, technical risk mitigation, cloud‑provider coordination, and post‑event reviews that together delivered zero‑fault high‑traffic periods and continuously improving system stability.

Capacity Planning · operations · peak load management
0 likes · 20 min read
Efficient Ops
May 21, 2024 · Operations

What Is an SRE? Roles, Skills, and Best Practices Explained

This article demystifies Site Reliability Engineering (SRE) by explaining its origins, core responsibilities, essential skill sets, and key practices such as observability, incident response, testing, capacity planning, automation, user support, on‑call duties, and the definition of SLI/SLO/SLA, providing a comprehensive guide for modern operations teams.

Automation · Capacity Planning · Incident Response
0 likes · 29 min read
iQIYI Technical Product Team
May 10, 2024 · Operations

Full‑Link Load Testing of iQIYI Playback Service: Process, Tools, and Outcomes

iQIYI implemented full‑link load testing of its playback service using LoadMaker for traffic generation and Rover for link control. The team mapped the service topology, created weighted user scenarios, and safely applied load to production‑like environments. The tests validated capacity at several times the historical peak, uncovered bottlenecks, and enabled several performance and disaster‑recovery improvements without impacting real users.

Capacity Planning · iQIYI · load testing
0 likes · 10 min read
Efficient Ops
Apr 14, 2024 · Operations

How to Ensure System Stability and High Availability: An SRE Playbook

This article defines stability and high availability, clarifies how the two relate, outlines key performance indicators, and provides a comprehensive framework (covering fault prevention, detection, and recovery, as well as design, coding, testing, monitoring, and emergency response practices) to help teams build reliable, highly available systems.

Capacity Planning · High Availability · SRE
0 likes · 10 min read
Efficient Ops
Jan 31, 2024 · Operations

How ICBC Boosted System Stability with Advanced Performance Capacity Testing

This article details ICBC Software Development Center's comprehensive approach to performance capacity testing, covering background challenges, a structured quality practice plan, enhanced test scope evaluation, result analysis, tool support, implementation outcomes, and future directions for ensuring system stability and scalability.

Automation · Capacity Planning · operations
0 likes · 9 min read
DevOps
Jan 18, 2024 · R&D Management

Understanding Story Points and Agile Team Capacity Planning

This article explains the concept of story points as a relative estimation unit, why agile teams use them, how they are applied across Scrum ceremonies, and answers common questions about their relationship to effort, value, and managerial decision‑making.

Capacity Planning · R&D management · Scrum
0 likes · 8 min read
JD Tech
Nov 16, 2023 · Operations

Preparing JD's CDP Platform for Double 11: Challenges, Capacity Planning, and Lessons Learned

This article recounts the author's experience preparing JD's Customer Data Platform (CDP) for the Double 11 shopping festival, detailing the platform's capabilities, business scenarios, capacity planning, stability and performance challenges, disaster‑recovery measures, and personal reflections on the intensive technical effort involved.

Big Data · CDP · Capacity Planning
0 likes · 12 min read
AntTech
Nov 8, 2023 · Artificial Intelligence

Kapacity V0.2 Release: AI‑Driven Traffic‑Based Replica Prediction for Cloud‑Native Autoscaling

Kapacity V0.2 introduces an AI‑powered, traffic‑driven replica prediction algorithm for cloud‑native autoscaling, featuring a Linear‑Residual model, a lightweight Swish Net time‑series forecaster, custom metric support, and open‑source tools, aiming to improve resource efficiency and reduce operational risk.

AI · Capacity Planning · Predictive Autoscaling
0 likes · 9 min read
DataFunTalk
Aug 26, 2023 · Big Data

Ensuring Doris Stability in HuoLala's Big Data Platform: Practices and Lessons

This article presents HuoLala's practical approach to guaranteeing the stability of the Doris OLAP engine within its large‑scale big data platform, covering background, challenges, case studies, capability building, process standards, and future planning.

Automation · Big Data · Capacity Planning
0 likes · 12 min read
Code Ape Tech Column
Jul 26, 2023 · Operations

Service Governance: Monitoring, Fault Management, Release and Capacity Planning

This article explains how to achieve 24/7 service availability through comprehensive monitoring, fault handling, release management, and capacity planning, covering alarm types, batch processing, traffic and resource metrics, fault causes and mitigation, deployment strategies, scaling commands, and service degradation techniques.

Capacity Planning · fault management · monitoring
0 likes · 20 min read
Tencent Cloud Developer
Mar 13, 2023 · Cloud Computing

Design Principles for High‑Availability System Architecture

The article outlines a comprehensive high‑availability architecture framework across six layers—development standards, application services, storage, product fallback, operations deployment, and emergency response—detailing design principles such as stateless services, elastic scaling, redundant storage, robust monitoring, gray releases, and chaos engineering to ensure resilient, continuously available systems.

Capacity Planning · Deployment · High Availability
0 likes · 25 min read
Weimob Technology Center
Feb 3, 2023 · Operations

How Full‑Link Load Testing Became the Secret Weapon for E‑Commerce Mega‑Sales

This article explains how micro‑enterprise SaaS leader Weimob built a full‑link load‑testing platform to simulate real‑world traffic for major shopping festivals, detailing the challenges, architecture, capabilities, results, and future plans for ensuring system stability and performance at scale.

Capacity Planning · JMeter · e-commerce
0 likes · 16 min read
HelloTech
Jan 31, 2023 · Operations

Stability Assurance Practices for Large‑Scale Promotional Events

The article outlines a comprehensive stability‑assurance framework for large‑scale promotional events—detailing planning, capacity and pressure‑test rehearsals, strict change‑freeze, internal gray releases, coordinated on‑call response, thorough link and capacity analysis, monitoring, emergency procedures, cross‑team collaboration, external partner coordination, and post‑event review to ensure resilient system performance.

Capacity Planning · Stability · change control
0 likes · 17 min read
Architecture Digest
Dec 21, 2022 · Operations

Designing High‑Availability Systems: Principles and Practices Across Six Layers

This article systematically explores high‑availability system design from development standards, capacity planning, application services, storage, product strategies, operations deployment, to incident response, presenting key concepts, architectural patterns, and practical guidelines for building resilient services.

Capacity Planning · Deployment · High Availability
0 likes · 27 min read