Tagged articles
3287 articles
Tencent Cloud Developer
Sep 27, 2020 · Operations

Elasticsearch Cluster Capacity Planning, Index Configuration, and Performance Optimization

This guide outlines practical capacity‑planning, index‑design, and write‑performance tuning for Tencent Cloud Elasticsearch clusters, covering compute and storage sizing, optimal shard counts, rollover strategies, bulk API settings, health monitoring, and common troubleshooting steps to ensure stable, high‑throughput search services.

Cluster Planning · Elasticsearch · Operations
0 likes · 19 min read
MaGe Linux Operations
Sep 25, 2020 · Operations

Discover Spug: A Lightweight, Agentless Automation Platform for Small Teams

Spug is an open‑source, agent‑less automation operations platform designed for small‑to‑medium enterprises, offering host management, batch command execution, online terminals, file transfer, application deployment, task scheduling, configuration, monitoring and alerting, with easy Docker installation and a rich web UI.

Docker · Operations · Spug
0 likes · 6 min read
DevOps Cloud Academy
Sep 25, 2020 · Operations

Understanding DevOps, SecOps, and DevSecOps: Definitions, Benefits, and Choosing the Right Approach

This guide explains the concepts of DevOps, SecOps, and DevSecOps, outlines their respective benefits, and helps organizations decide which security‑focused operational model best fits their needs by comparing their focus on integration, automation, and collaboration across development, operations, and security teams.

Collaboration · DevOps · DevSecOps
0 likes · 6 min read
Alibaba Cloud Native
Sep 24, 2020 · Cloud Native

Tackling Ultra‑Large‑Scale Service Mesh Deployment: Lessons from Alibaba

This article details Alibaba's practical experience deploying Service Mesh at massive scale, covering architectural evolution, key challenges, traffic interception, hot‑upgrade mechanisms, performance optimizations, and operational tooling that together enable reliable, low‑overhead service communication in a cloud‑native environment.

Envoy · Istio · Operations
0 likes · 22 min read
Programmer DD
Sep 24, 2020 · Operations

Why 58% of IT Professionals Say Windows 10 Updates Are Useless

A recent Computerworld survey reveals that a majority of IT staff find Windows 10's twice‑yearly updates either useless or of little value, with many preferring older Windows versions and criticizing forced update policies.

Operations · Patch management · Windows
0 likes · 3 min read
JD.com Experience Design Center
Sep 23, 2020 · Operations

Boost B2B Operations Efficiency with Template‑Based Design

Business‑facing (B‑end) operational activities often involve frequent, short‑term, high‑pressure tasks that drain design resources. This article explains how generic design templates and collaborative online tools can streamline these demands, freeing up manpower and improving overall operational efficiency.

B2B · Operations · design templates
0 likes · 2 min read
Laravel Tech Community
Sep 22, 2020 · Databases

Common Redis Latency Issues and How to Diagnose Them

This article explains why Redis latency can suddenly increase—covering high‑complexity commands, large keys, concentrated expirations, memory limits, fork overhead, CPU binding, AOF settings, swap usage, and network saturation—and provides practical diagnostic steps and mitigation techniques.

Database · Latency · Operations
0 likes · 17 min read
58UXD
Sep 22, 2020 · Operations

How Flexible Staffing and Digital Transformation Can Revive Post‑Pandemic SMEs

The article explores how small and medium‑sized enterprises can recover from pandemic setbacks by adopting flexible employment models, leveraging digital tools for management and customer insight, and shifting to stronger online promotion while controlling costs and improving resilience.

Digital Transformation · Operations · SME
0 likes · 9 min read
Alibaba Cloud Native
Sep 21, 2020 · Operations

Why Chaos Engineering Is Essential for Cloud‑Native High Availability

This article explains the need for chaos engineering in modern distributed and cloud‑native systems, outlines the challenges faced by architects, developers, testers and product teams, and provides step‑by‑step guidance on using ChaosBlade and Alibaba's AHAS platform for effective fault‑injection experiments.

Operations · chaos engineering · cloud-native
0 likes · 9 min read
High Availability Architecture
Sep 21, 2020 · Operations

Full‑Link Load Testing Practices for iQIYI Payment System

This article describes the iQIYI payment team's approach to full‑link load testing, covering background challenges, systematic problem exploration, preparation of test environments, traffic modeling, execution safeguards, practical results, and future plans to improve capacity verification and system reliability.

Operations · capacity planning · full‑link testing
0 likes · 10 min read
MaGe Linux Operations
Sep 18, 2020 · Operations

Essential Linux Operations Metrics for Effective Monitoring

This guide enumerates the key Linux system metrics—covering CPU, memory, disk, I/O, network, kernel parameters, RAID, SMART, NTP, and process information—that open-falcon agents collect every minute to enable comprehensive operations monitoring and timely issue detection.

Metrics · Open-Falcon · Operations
0 likes · 12 min read
Efficient Ops
Sep 13, 2020 · Operations

Master Nginx: Reverse Proxy, Load Balancing, and High‑Availability Essentials

This guide explains Nginx’s core concepts—including reverse proxy, load balancing, static‑dynamic separation, common commands, configuration blocks, and high‑availability setup with Keepalived—providing step‑by‑step examples and practical diagrams for reliable web service deployment.

Operations · configuration · high availability
0 likes · 11 min read
TAL Education Technology
Sep 10, 2020 · Cloud Native

Accelerating Project Deployment with a Container Platform and Domain Convergence

This article describes how the infrastructure team reduced new project deployment time to under an hour by combining a container platform with domain convergence, detailing the processes, automation pipelines, Kubernetes-based deployment, autoscaling, logging, and security considerations for efficient, cloud‑native operations.

Deployment Automation · Operations · cloud-native
0 likes · 17 min read
Efficient Ops
Sep 9, 2020 · Operations

Mastering Incident Management: Core Principles and Practical Methods

This guide outlines essential incident management principles—prioritizing business restoration and timely escalation—followed by detailed methodologies such as restart, isolation, and degradation, and explains role responsibilities, user impact handling, and post‑incident summarization for continuous improvement.

Incident Management · Operations · fault handling
0 likes · 10 min read
Efficient Ops
Sep 8, 2020 · Operations

From Firefighting to Arson: Mastering Ops Availability in Three Stages

The article outlines a three‑stage ops maturity model—firefighting, fire prevention, and arson—explains how proactive fault‑injection drills, continuous availability improvements, and aligning technical metrics with business value can transform operations from reactive responders into strategic value creators.

Availability · Fault Injection · Incident Management
0 likes · 8 min read
58UXD
Sep 7, 2020 · Operations

Designing a High‑Impact Brand‑Driven Operation Campaign on a Tight Timeline

This article details how, despite limited resources and time, a product team designed and executed the “Part‑time Gold Rush” operation—defining goals, targeting young users, building a memorable brand, applying 5W1H strategy, leveraging AARRR growth tactics, and achieving revenue and traffic targets.

AARRR · Operations · brand design
0 likes · 9 min read
dbaplus Community
Sep 6, 2020 · Operations

Building a High‑Performance Monitoring Alert System with Akka, Dubbo, and Ignite

The article outlines G Bank’s transition from a single‑threaded commercial monitoring solution to a self‑developed, open‑source based alert system that leverages Akka for parallel collection, Apache Dubbo for distributed processing, and Apache Ignite for in‑memory storage, achieving million‑level alert capacity, sub‑100 ms latency, and linear scalability.

Akka · Apache Dubbo · Apache Ignite
0 likes · 17 min read
Efficient Ops
Sep 3, 2020 · Operations

What Recent Cloud and Data Center Incidents Reveal About Industry Risks?

A roundup of recent tech news covering a Cisco sabotage case, a London data‑center fire, Linux's 29th anniversary, Gartner's China ICT trends, major cloud investments, Windows 95 milestones, Didi's GPU server launch, Hainan's DNS project, Dell’Oro's market report, executive share reductions, and an upcoming global operations conference.

Cloud Computing · Data Center · GPU
0 likes · 10 min read
Efficient Ops
Sep 2, 2020 · Operations

Why Consistent Shell Script Standards Matter: A Practical Guide

This guide explains the importance of shell script coding standards, outlines core principles such as correctness, readability, maintainability, and consistency, and provides detailed recommendations on file naming, encoding, line length, indentation, comments, testing, and safe use of commands to improve script quality and reduce maintenance costs.

Bash · Operations · coding standards
0 likes · 26 min read
MaGe Linux Operations
Aug 30, 2020 · Operations

How to Seamlessly Upgrade Nginx from 1.16 to 1.18 with Zero Downtime

This guide walks through verifying the existing Nginx 1.16.1 process, compiling and configuring Nginx 1.18.0 with identical options, performing a zero‑downtime binary replacement, and handling rollback procedures using signals and process management commands on a Linux server.

Operations · Server Administration · Upgrade
0 likes · 14 min read
Architecture Digest
Aug 30, 2020 · Cloud Native

Migrating Docker Images, Containers, and Volumes: Practical Techniques

This article explains how to migrate Docker images, containers, and data volumes using save/load, export/import, and backup/restore commands, offering practical steps for offline environments, complex production services, and volume handling while highlighting the limitations of conventional approaches.

Container Migration · Operations · Volume Backup
0 likes · 7 min read
Tencent Cloud Developer
Aug 28, 2020 · Databases

Automating Data Balancing for ClickHouse Clusters on Tencent Cloud

Tencent Cloud’s managed ClickHouse service now includes an automated data‑balancing feature that, after user authorization and bandwidth configuration, creates migration plans to redistribute tables across new or decommissioned nodes, eliminating manual rebalancing, reducing operational overhead, and ensuring balanced storage during elastic scaling.

ClickHouse · Database · Operations
0 likes · 8 min read
Laravel Tech Community
Aug 25, 2020 · Operations

NetBox 2.9.1 Release Highlights and New Features

NetBox 2.9.1, an IP address and data center infrastructure management tool built on Django and PostgreSQL, introduces several enhancements including SLAAC address status, nested LAG support, version details on error pages, and a backward‑compatible remote authentication backend parameter.

DCIM · Django · IPAM
0 likes · 2 min read
Efficient Ops
Aug 25, 2020 · Operations

How to Build an Enterprise‑Grade Observability System and Master Incident Response

This article explains how enterprises adopting SRE can design a comprehensive observability platform—covering metrics, logs, and tracing—while also detailing effective incident response, post‑mortem practices, testing, capacity planning, automation tool development, and user‑experience focus to improve overall operational reliability.

Incident Response · Observability · Operations
0 likes · 17 min read
DevOps Cloud Academy
Aug 25, 2020 · Operations

A Simple Four‑Step Process for Prioritizing DevOps Work

This article outlines a practical four‑step process—Define, Scope, Experiment, Analyze—to help DevOps engineers prioritize automation tasks, assess pain points, and align improvements with business value, offering actionable guidance for effective pipeline and workflow optimization.

DevOps · Operations · automation
0 likes · 6 min read
Ops Development Stories
Aug 25, 2020 · Operations

ESrally Guide: Install, Configure, and Benchmark Elasticsearch Performance

ESrally is the official Elasticsearch benchmarking tool; this guide walks through its installation prerequisites, step‑by‑step setup of Python, JDK, and Git, configuration of tracks, cars, pipelines, and challenges, and demonstrates real‑world performance comparisons across Elasticsearch versions and hardware platforms.

Benchmarking · ESrally · Elasticsearch
0 likes · 16 min read
DevOps
Aug 25, 2020 · Operations

IDCF Phase 5 DevOps Case Study: Traditional Banking Practice and Lessons Learned

This article details a month‑long DevOps case study conducted by the IDCF team on traditional banking, describing the four guiding principles, the six‑stage workflow from team formation to retrospection, the research findings across major Chinese banks, and the resulting best‑case award and future digital‑transformation discussions.

DevOps · FinTech · Operations
0 likes · 7 min read
Aikesheng Open Source Community
Aug 24, 2020 · Operations

Prometheus Data Query Basics and Practical Usage Guide

This article introduces Prometheus' query language PromQL, explains instant and range vector selectors, label matching, offset handling, storage design, common functions and aggregation operators, and provides practical advice for efficient querying and avoiding performance issues.

Operations · PromQL · Prometheus
0 likes · 13 min read
DevOps Cloud Academy
Aug 22, 2020 · Operations

Common Mistakes in DevOps Implementation and How to Avoid Them

The article outlines ten frequent pitfalls that organizations encounter when adopting DevOps—such as out‑of‑order delivery, misunderstandings of DevOps roles, lack of flexibility, speed over quality, isolated teams, unautomated databases, insufficient incident handling, limited expertise, security neglect, and team fatigue—and provides practical guidance to prevent these errors for more successful DevOps outcomes.

Continuous Delivery · DevOps · Operations
0 likes · 11 min read
DevOps Cloud Academy
Aug 20, 2020 · Operations

How DevOps Can Reduce Technical Debt During Cloud Migration

This article explains what technical debt is, why it accumulates in both development and operations, and outlines four DevOps‑driven strategies—including building cross‑functional teams, automation, containerization, and API‑centric design—to identify, track, and repay technical debt while improving cloud migration outcomes.

Containers · DevOps · Operations
0 likes · 10 min read
Efficient Ops
Aug 19, 2020 · Operations

How End-State‑Oriented Monitoring Transforms Operations and AIOps

This article explains the concept of end‑state‑oriented monitoring, its significance for modern operations, the shortcomings of existing solutions, and a layered design approach that leverages real‑time data, service catalogs, and AI to achieve secure, stable, efficient, and low‑cost operations.

DevOps · Operations · aiops
0 likes · 13 min read
Senior Brother's Insights
Aug 19, 2020 · Operations

Essential Ops Lessons: Avoid Disasters with Backups, Monitoring, and Secure Practices

This guide shares hard‑earned lessons from real‑world server administration, emphasizing careful testing, confirming commands before execution, limiting simultaneous operators, always backing up configurations, protecting data, tightening SSH and firewall security, implementing comprehensive monitoring, and applying disciplined performance‑tuning practices to maintain stable, reliable services.

Operations · Performance tuning · System Administration
0 likes · 12 min read
dbaplus Community
Aug 17, 2020 · Operations

Master Server Troubleshooting: Diagnose, Optimize, and Keep Your Backend Stable

This article shares practical experience on backend troubleshooting, outlining common failure types, a step‑by‑step diagnosis workflow, essential tools, and systematic optimization techniques for performance, stability and maintainability, helping engineers quickly stop losses, pinpoint root causes, and implement robust fixes.

Backend · Operations · maintainability
0 likes · 21 min read
Open Source Linux
Aug 17, 2020 · Operations

Step-by-Step Guide to Install and Configure Zabbix on CentOS 7

This tutorial walks you through installing Zabbix on CentOS 7: disabling SELinux and the firewall as a prerequisite, adding repositories, installing the server, web, and database components, editing the configuration files, securing MariaDB, starting services, and completing the web‑based setup with language customization.

CentOS · Installation · Operations
0 likes · 7 min read
FunTester
Aug 15, 2020 · Operations

Why Quality Management Is Critical for Project Success

This article explains the importance of quality management in projects, outlines its two main dimensions—process quality and product quality—details the multiple benefits of systematic quality control, and provides an eight‑step framework for creating an effective quality management plan.

Operations · Process Improvement · QA
0 likes · 5 min read
DevOps Cloud Academy
Aug 13, 2020 · Operations

Integrating DevOps Toolchains for Enterprise‑Scale End‑to‑End Communication and Collaboration

The article explains how integrating DevOps toolchains can achieve enterprise‑scale end‑to‑end communication and collaboration without forcing teams to change their workflows, discusses common bottlenecks, presents unified versus loosely‑coupled integration approaches, and offers practical recommendations for building an inclusive, interconnected DevOps ecosystem.

Collaboration · DevOps · Operations
0 likes · 10 min read
DevOps Cloud Academy
Aug 12, 2020 · Operations

10 International Companies That Successfully Transformed to DevOps in 2020

This article reviews ten well‑known enterprises—including Adidas, Capital One, Verizon, Disney, and Starbucks—that have undertaken large‑scale DevOps and cloud‑native transformations, detailing the challenges they faced, the cultural and technical changes implemented, and the measurable business benefits achieved.

DevOps · Digital Transformation · Operations
0 likes · 13 min read
Efficient Ops
Aug 11, 2020 · Operations

How Multi‑Cloud Disaster Recovery Boosts Site Availability: Lessons from Real‑World DR Drills

This article shares a detailed case study of building multi‑cloud site disaster‑recovery and fault‑drill practices at Kaixin Network, covering high‑availability concepts, architectural redesign, pain points, automated one‑click switching, and future self‑healing with chaos engineering to improve reliability.

Operations · disaster recovery · fault drills
0 likes · 15 min read
Java Architect Essentials
Aug 11, 2020 · Operations

Four Essential Linux Monitoring Tools for Operations Engineers

This article introduces four widely used Linux monitoring tools—iotop, htop, IPTraf, and Monit—explaining their features, usage scenarios, and how they help operations engineers diagnose performance issues without a GUI, including real‑time I/O tracking, visual CPU/memory graphs, network traffic analysis, and flexible alerting.

IPTraf · Monit · Operations
0 likes · 7 min read
IT Architects Alliance
Aug 6, 2020 · Operations

Eight Essential Steps for Successful Disaster Recovery Drills

This guide outlines eight practical steps—including defining scope, forming a planning team, setting clear objectives, designing realistic scenarios, creating evaluation checklists, assigning roles, conducting pre‑drill briefings, and performing post‑drill reviews—to help organizations execute effective, repeatable disaster recovery exercises that strengthen business continuity.

Best Practices · Operations · Planning
0 likes · 9 min read
StarRing Big Data Open Lab
Jul 28, 2020 · Operations

How DevOps and SRE Transform Modern Software Delivery and Operations

This article explains the evolution from traditional C/S to B/S architectures, compares DevOps and SRE principles, discusses their roles in the container and cloud eras, and showcases StarRing's TDC platform that integrates automated pipelines, monitoring, and deployment for efficient software delivery.

Cloud Computing · Containerization · DevOps
0 likes · 14 min read
Xianyu Technology
Jul 28, 2020 · Operations

ShenTan: Automated Fault Localization System for Online Services

ShenTan is an automated fault‑localization platform for online services that quickly (under five seconds) pinpoints server‑side issues with developer‑level accuracy by aggregating real‑time metrics, applying a decision‑tree model enriched by expert knowledge and dynamic thresholds, and presenting results through an integrated alert and visualization system, while planning broader endpoint coverage and multi‑tenant support.

Big Data · Fault Localization · Operations
0 likes · 12 min read
IT Architects Alliance
Jul 27, 2020 · Operations

Why Tape Backup Is Failing and How Disk Backup Can Save Your Data

The article analyzes the growing limitations of tape backup, outlines a step‑by‑step migration to disk‑based backup using deduplication, compression and modern storage technologies, and explains how this transition improves reliability, cost efficiency and recovery speed for enterprises.

Data Protection · Deduplication · Operations
0 likes · 11 min read
Zhongtong Tech
Jul 25, 2020 · Operations

How ZTO Express Leveraged Technology to Become China’s Logistics Leader

This presentation details ZTO Express’s rapid rise from a modest startup to the world’s largest courier by exploring its technology‑driven business model, crowd‑funded expansion, electronic waybills, smart routing, AI customer service, employee equity schemes, and future digital logistics strategies.

AI · Business Model · Logistics
0 likes · 27 min read
Open Source Linux
Jul 23, 2020 · Operations

5 Essential Steps to Become a Successful DevOps Engineer

This article outlines the five key practices—adopting a developer mindset, mastering system engineering, gaining cloud experience, learning containers, and developing soft skills—required to become an effective DevOps engineer in today’s rapidly evolving tech landscape.

Cloud Computing · Containers · DevOps
0 likes · 6 min read
dbaplus Community
Jul 20, 2020 · Operations

How to Build Reliable Monitoring for Low‑Frequency Financial Services

After two years transitioning from e‑commerce to finance, the team shares practical monitoring strategies for low‑frequency financial services, contrasting e‑commerce traffic‑based methods with finance‑specific challenges, and detailing point‑based metrics, hourly success‑rate alerts, aspect‑oriented exception handling, white‑list filtering, and Sentinel‑based circuit breaking.

Aspect Oriented Programming · Circuit Breaking · Financial Services
0 likes · 16 min read
Full-Stack DevOps & Kubernetes
Jul 20, 2020 · Operations

Master Linux Network Monitoring with iftop and nethogs: Installation, Commands, and Tips

This guide explains how to install and use the Linux command‑line tools iftop and nethogs for real‑time network traffic monitoring, covering installation commands, interface selection, output interpretation, shortcut keys, and advanced options to help troubleshoot slow or blocked network connections.

CLI · Network Monitoring · Operations
0 likes · 10 min read
Swan Home Tech Team
Jul 20, 2020 · Backend Development

Design and Evolution of a Reconciliation Center: From Version 1.0 to 3.0

This article introduces the concept, core capabilities, and architectural evolution of a reconciliation center—from its initial 1.0 design through 2.0 and 3.0 upgrades—highlighting problem statements, solution approaches, and the applicable scenarios that make it essential for large‑scale data consistency in modern micro‑service systems.

Backend · Operations · Reconciliation
0 likes · 14 min read
DevOps
Jul 20, 2020 · Operations

Bank 4.0 DevOps Case Study: Practices, Challenges, and Solutions in Traditional Banking

This case study analyzes the Bank 4.0 transformation of traditional Chinese banks, detailing industry characteristics, historical challenges, open‑banking drivers, the ABCDII technology framework, DevOps tooling, metric systems, and a future ecosystem vision to guide digital and operational improvement.

AI · CloudComputing · DevOps
0 likes · 18 min read
Qunhe Technology Quality Tech
Jul 17, 2020 · Operations

How We Built a Robust Monitoring System for Construction Drawing Production

This article describes how our team designed and implemented a comprehensive online monitoring system for construction drawing generation, covering business background, technical architecture analysis, metric definition, monitoring methods, and the resulting dashboards that improve quality, stability, and rapid issue resolution.

Metrics · Operations · construction drawing
0 likes · 10 min read
DevOps
Jul 17, 2020 · Operations

Agile vs DevOps: Understanding Their Overlap, Differences, and Evolution

This article explores the relationship between Agile and DevOps, explaining their origins, narrow and broad definitions, how they address gaps between business, development, and operations, and presenting a capability growth model that highlights continuous delivery and lean principles as shared goals.

Continuous Delivery · DevOps · Lean
0 likes · 9 min read
Youku Technology
Jul 16, 2020 · Operations

How Alibaba Entertainment Automates Capacity Management and Elastic Scaling

Alibaba Entertainment transformed its capacity management from manual, experience‑based decisions to a fully automated system that continuously evaluates single‑machine performance, identifies performance and success‑rate breakpoints, and drives elastic scaling, dramatically improving resource utilization, availability, and development efficiency across all its applications.

Operations · automation · capacity management
0 likes · 10 min read
MaGe Linux Operations
Jul 14, 2020 · Operations

How Keepalived Enables High-Availability Load Balancing with VRRP

Keepalived, originally designed for LVS load balancing, provides VRRP-based high‑availability by managing LVS nodes, performing health checks, and offering failover for services like Nginx, HAProxy, and MySQL, while also addressing split‑brain scenarios and non‑preemptive configurations.

Operations · VRRP · failover
0 likes · 10 min read
Efficient Ops
Jul 13, 2020 · Operations

What 13,966 Ops Job Listings Reveal About Salary, Skills, and Hot Cities

This article analyzes 13,966 Chinese operations‑engineer job postings scraped from 51job, cleaning the data with Python and Pandas, then visualizing industry demand, city concentration, salary ranges, education requirements, company size distribution, and keyword trends to guide job seekers and recruiters.

Data cleaning · Data visualization · Operations
0 likes · 14 min read
Architects Research Society
Jul 13, 2020 · Operations

A Digital Transformation Framework for Asset Management: Integrating Business, Culture, and Technology

The article presents a Digital Transformation Framework (DTF) that helps asset‑management firms model, evaluate, and implement disruptive digital strategies across front, middle, and back‑office functions, emphasizing composable enterprises, cultural change, BPaaS, API‑driven architectures, and value‑based prioritization to achieve sustainable competitive advantage.

API · BPaaS · Operations
0 likes · 14 min read
Efficient Ops
Jul 12, 2020 · Operations

How Full-Path Packet Loss Monitoring Transforms Network Reliability

This article explains the concept of full‑path packet loss monitoring, its importance for banking networks, the causes of packet loss, and detailed technical implementations—including traffic splitting, collection, automatic analysis engines, TCP retransmission detection, and algorithms for pinpointing loss locations—to dramatically reduce troubleshooting time.

Network Monitoring · Operations · Packet Loss
0 likes · 11 min read
iQIYI Technical Product Team
Jul 10, 2020 · Operations

iQIYI IPv6 Large‑Scale Deployment: Technical Challenges, Solutions, and Management Practices

iQIYI’s IPv6 rollout, responding to the national deployment plan, coordinated multiple technical teams to redesign its network and introduced the “iQIYI IPv6 Cloud Control” scheme, which manages IPv4/IPv6 switching and fallback. The rollout reached more than 200 million active IPv6 users and 800 GB traffic peaks. It was guided by long‑term strategic value, clear milestones, and engineers’ curiosity, with the aim of expanding IPv6‑driven service quality and cost savings.

IPv6 · Operations · cloud control
0 likes · 12 min read
转转QA
Jul 9, 2020 · Operations

Testing Scenario Extraction and Tool Selection for Business Operations

The article explains how to isolate testing scenarios and choose appropriate testing methods for various business contexts—storefront changes, order processing, and cross‑platform integrations—by establishing baseline data, comparing results, and leveraging tools like YApi to improve quality and efficiency.

Operations · QA · Test Strategy
0 likes · 7 min read
AntTech
Jul 2, 2020 · Operations

Innovative Design and Implementation of the Barad‑Dur Custom Monitoring Dashboard

This article introduces the Barad‑Dur custom monitoring dashboard of Ant Monitoring, detailing its WYSIWYG editor, advanced interaction features, controller concept, extensible data‑source architecture, unified time‑series format, scene‑graph inspired layout engine, and future roadmap for cloud‑native observability.

DataSource · Operations · Scene Graph
0 likes · 12 min read
DevOps Cloud Academy
Jul 2, 2020 · Operations

Design and Extension of DevOps Platform Tasks Based on Jenkins Pipeline

This article explains how the PuYuan DevOps platform extends Jenkins pipeline tasks by categorizing atomic tasks, designing flexible database schemas for task templates and attributes, and implementing container-based environment isolation to support scalable, secure continuous integration and deployment across diverse enterprise environments.

Containerization · DevOps · Jenkins
0 likes · 10 min read
Big Data Technology & Architecture
Jun 29, 2020 · Operations

Meizu's Automation Journey and Continuous Delivery Platform Evolution

The article outlines Meizu's transition from a music‑player company to a mobile and internet service provider, detailing the operational challenges faced across three internet eras, the development of a comprehensive automation and continuous delivery platform, and the role of big‑data‑driven insights in improving quality, efficiency, cost, and security.

Continuous Delivery · DevOps · Operations
0 likes · 14 min read
DevOps Coach
Jun 29, 2020 · Operations

How China’s DevOps Community Chose the Best SaaS Platform: GitLab vs Jira vs CODING

The Chinese DevOps community evaluated three SaaS platforms—GitLab (free), Jira Cloud (free), and CODING (Tencent Cloud DevOps)—against requirements such as private repository collaboration, OKR management, Scrum planning, CI/CD pipelines, artifact storage, and cloud deployment, ultimately concluding that CODING offers the most suitable integrated solution.

DevOps · GitLab · Operations
0 likes · 15 min read
Dual-Track Product Journal
Jun 27, 2020 · Operations

How Modern Procurement Management Systems Streamline Supply Chains

This article explains what procurement is, its strategic importance, the components and architecture of a procurement management system, detailed workflow steps, and key functions such as item management, order handling, pricing maintenance, and supplier returns, highlighting how effective procurement reduces costs and boosts competitiveness.

Operations · Supply Chain · inventory
0 likes · 13 min read
DevOps Cloud Academy
Jun 27, 2020 · Operations

Linux Service and Process Management with Nginx

This guide explains how to install Nginx on a Linux server, manage it with systemctl commands, verify its operation using netstat, and control related processes via ps and kill utilities, providing practical command examples for each step.

Operations · Service · linux
0 likes · 3 min read
DevOps Cloud Academy
Jun 26, 2020 · Operations

Linux System User and Group Management Tutorial

This tutorial explains Linux user and group management, covering login prompts, user information commands, adding, modifying, and deleting users, switching users, password handling, file permission changes, and group administration with practical command examples and code snippets.

Operations · Shell Commands · System Administration
0 likes · 7 min read
DevOps Cloud Academy
Jun 26, 2020 · Operations

Linux File and Directory Permission Management Tutorial

This tutorial explains Linux file and directory permission management: the permission categories, how to view, add, revoke, and recursively apply permissions with commands such as ls and chmod, and how permission notation works, with examples.

Operations · System Administration · chmod
0 likes · 3 min read
Qunar Tech Salon
Jun 23, 2020 · Operations

A Simple Gray Release Solution for High‑Concurrency Flight Ticket Systems

This article presents a lightweight gray release approach for complex flight ticket services, comparing traditional hardware and soft‑routing isolation methods, describing the authors' traffic‑based gray identification, business‑focused monitoring, implementation details, and automated safeguards to enable safe incremental deployments.

Backend · Gray Release · Operations
0 likes · 8 min read
Suning Technology
Jun 22, 2020 · Operations

How Suning Moved 26,888 Servers in 75 Days – Key Takeaways

Suning’s data center team completed a record-breaking migration of 26,888 servers in 75 days. The article details the planning, tight time windows, intensive communication, cross‑team coordination, risk management, and efficiency gains that enabled zero‑downtime migration and significant cost savings for future operations.

Cloud Computing · Data Center · Migration
0 likes · 7 min read