Tagged articles
Efficient Ops
May 8, 2023 · Operations

How Intelligent Ops Transforms Container Cloud Management at Scale

This article summarizes a speaker’s insights from GOPS 2023 on the challenges of large‑scale container cloud operations and presents a comprehensive intelligent‑ops framework—including health scoring, automated pod anomaly detection, smart scaling, and multi‑center disaster recovery—to improve visibility, efficiency, and reliability in Kubernetes environments.

CloudNative · IntelligentOps · automation
0 likes · 18 min read
Tencent Cloud Developer
May 8, 2023 · Cloud Native

Troubleshooting Common Kubernetes Networking Issues: Cross-VPC NodePort Timeouts, LB Pressure Test CPS Low, DNS Delays, and More

This guide walks through eight frequent Kubernetes networking problems in Tencent Cloud Kubernetes Service—such as cross‑VPC NodePort timeouts, low load‑balancer CPS, DNS resolution delays, apiserver access lag, mis‑configured resolv.conf, liveness‑probe failures, and externalTrafficPolicy = Local timeouts—explaining their root causes and providing concrete kernel, iptables, DNS, and configuration fixes.

DNS · LB · TKE
0 likes · 29 min read
Liangxu Linux
May 7, 2023 · Cloud Native

Unlock Hidden kubectl Tricks: Advanced Commands for Kubernetes Mastery

This article presents a collection of advanced kubectl techniques—including API inspection, status‑based pod filtering and deletion, node‑specific pod listing, distribution counting, and proxy usage—to help experienced Kubernetes users solve ad‑hoc tasks more efficiently.

CLI · Operations · commands
0 likes · 7 min read
Architects Research Society
May 7, 2023 · Cloud Computing

Running Kubernetes Across Multiple Zones: Design Principles and Operational Practices

This article explains how Kubernetes can be deployed across multiple failure zones and regions, covering control‑plane replication, node labeling, pod topology constraints, storage zone awareness, network considerations, and fault‑recovery strategies to achieve high availability and resilience.

Control Plane · cloud architecture · kubernetes
0 likes · 8 min read
Architects Research Society
May 2, 2023 · Cloud Computing

What Is Apache OpenWhisk? An Overview of the Open‑Source Serverless Platform

Apache OpenWhisk is an open‑source, distributed serverless platform that runs functions in Docker containers, supports multiple programming languages, can be deployed on various cloud or on‑premise environments such as Kubernetes, and offers seamless integration with popular services, scalable execution, and resource‑efficient operation.

Apache OpenWhisk · Docker · Functions
0 likes · 7 min read
MaGe Linux Operations
May 1, 2023 · Cloud Native

Unlock Hidden kubectl Tricks: Boost Your Kubernetes Workflow

This article shares a collection of practical kubectl commands and tips—including API debugging, pod filtering and deletion, node‑wise pod statistics, and proxy usage—to help Kubernetes users work more efficiently and avoid writing custom client code.

Operations · Tips · cloud-native
0 likes · 8 min read
Alibaba Cloud Native
May 1, 2023 · Cloud Native

Deploy FastChat on Alibaba Cloud ASK: A Serverless AI Model Tutorial

This guide shows how to quickly deploy the open‑source FastChat AI assistant on Alibaba Cloud ASK's serverless Kubernetes platform, covering prerequisites, YAML configuration, GPU handling, verification steps, and three usage scenarios including web UI, API calls, and a VSCode extension.

AI · ASK · FastChat
0 likes · 12 min read
ITPUB
Apr 26, 2023 · Information Security

Detecting CDK Attacks with Kubernetes Audit Logs: Practical Rules and Pitfalls

This article explains how to enable Kubernetes audit logging, analyzes CDK‑based attack behaviors captured in audit logs, provides concrete detection rules for information collection, exploitation, and privilege escalation, and shares practical lessons learned when deploying audit‑driven security in cloud‑native environments.

CDK · Container · Threat Detection
0 likes · 18 min read
Alibaba Cloud Native
Apr 25, 2023 · Cloud Native

How KubeVela Scales: Load‑Testing Results and Performance Optimizations for v1.8

This guide details KubeVela's three‑year evolution, presents a comprehensive load‑testing history, explains step‑by‑step configuration for high‑performance and robust control planes, describes various optimization techniques such as state‑persistence parallelism, AppKey indexing, informer cache reduction, direct cluster‑gateway connections and controller sharding, and summarizes extensive single‑shard, multi‑shard, multi‑cluster and large‑scale experiments that demonstrate v1.8's superior scalability and stability.

Controller Sharding · KubeVela · Scalability
0 likes · 32 min read
dbaplus Community
Apr 22, 2023 · Cloud Native

How Vivo Scales Multi‑Data‑Center Kubernetes with a Custom Operator

This article details Vivo's approach to managing thousands of Kubernetes nodes across multiple data centers by developing a declarative Kubernetes‑Operator, modular Ansible scripts, and a comprehensive CI matrix to automate deployment, scaling, upgrades, and fault recovery while reducing operational risk.

Ansible · Multi-Data Center · Operator
0 likes · 13 min read
Open Source Linux
Apr 21, 2023 · Cloud Native

Mastering Kubernetes Architecture: How Control Plane and Worker Nodes Work Together

This article explains the core components of Kubernetes architecture—including the control plane (etcd, API server, controller manager, scheduler) and worker node components (kubelet, kube-proxy, container runtimes)—detailing their roles, interactions, and best‑practice considerations for maintaining healthy, scalable clusters.

Control Plane · Scheduler · Worker Nodes
0 likes · 12 min read
Cloud Native Technology Community
Apr 20, 2023 · Cloud Native

Understanding Kubernetes kube‑scheduler Architecture, Workflow, and Plugin Development

This article explains the role of kube‑scheduler in Kubernetes, details its scheduling process, describes the plugin‑based framework with extension points such as PreEnqueue, Filter and Bind, and provides complete code examples and deployment instructions for building custom scheduler plugins.

Scheduler · Scheduling Framework · kubernetes
0 likes · 33 min read
Selected Java Interview Questions
Apr 19, 2023 · Operations

Zero‑Downtime Deployment with Kubernetes and Spring Boot: Health Checks, Rolling Updates, Graceful Shutdown, Autoscaling, Prometheus Monitoring, and Config Separation

This guide explains how to achieve zero‑downtime releases of a Spring Boot application on Kubernetes by configuring readiness/liveness probes, rolling‑update strategies, graceful shutdown, horizontal pod autoscaling, Prometheus metrics collection, and externalized configuration via ConfigMaps.

ConfigMap · Prometheus · Spring Boot
0 likes · 11 min read
Open Source Linux
Apr 19, 2023 · Cloud Native

What’s New in Kubernetes v1.27? Key Features, Upgrades & Deprecations

Kubernetes v1.27, the first 2023 release, introduces 60 enhancements across Alpha, Beta and Stable stages, freezes the old k8s.gcr.io registry in favor of registry.k8s.io, promotes several security and scheduling features to stable, and removes numerous legacy flags and feature gates.

Deprecations · cloud-native · features
0 likes · 11 min read
Architects Research Society
Apr 18, 2023 · Cloud Native

Apache Camel: An Enterprise Integration Framework Growing in Importance and Expanding to Cloud‑Native Kubernetes Deployments

The article highlights Apache Camel’s rising relevance for enterprise integration, its extensive protocol support, deployment flexibility—including native Kubernetes options with Camel K and Camel Quarkus—while noting strong community activity and endorsement from European Commission developers.

Apache Camel · enterprise integration · kubernetes
0 likes · 7 min read
Alibaba Cloud Native
Apr 18, 2023 · Artificial Intelligence

How to Deploy a CPU‑Based Stable Diffusion Service on Alibaba Cloud ACK

This guide walks you through the prerequisites, step‑by‑step console and kubectl procedures, YAML configuration, and post‑deployment verification needed to run a CPU‑only Stable Diffusion model on Alibaba Cloud Container Service (ACK) and optionally switch to a GPU‑enabled version.

ACK · AI model deployment · CPU
0 likes · 7 min read
Bilibili Tech
Apr 18, 2023 · Cloud Native

Kubernetes Audit Log Analysis for Container Security

The article explains how to enable Kubernetes audit logging and use its detailed fields—such as userAgent, responseStatus, requestURI, and object references—to detect CDK‑generated attacks and other threats like CVE‑2022‑3172, privilege escalation, and backdoor deployment, offering practical detection examples and security recommendations.

API Server · Audit Logging · CDK
0 likes · 15 min read
Efficient Ops
Apr 17, 2023 · Operations

Mastering Container Log Collection in Kubernetes: Strategies and Best Practices

This article explains how container log collection in Kubernetes differs from traditional host logging, outlines common deployment methods such as DaemonSet and Sidecar, compares log storage options, and offers practical guidance on handling stdout and file‑based logs for reliable operations.

DaemonSet · container logging · kubernetes
0 likes · 12 min read
Alibaba Cloud Native
Apr 17, 2023 · Cloud Native

OpenKruise v1.4 Highlights: Sidecar Terminator and CloneSet Enhancements

The OpenKruise v1.4 release introduces the Job Sidecar Terminator for automatic sidecar shutdown, enables several stable capabilities by default, adds CloneSet performance and lifecycle improvements, provides a force‑recreate option for containers, and enhances image pre‑pull metadata handling, all while offering clear usage examples and configuration snippets.

CloneSet · Container · Job
0 likes · 10 min read
Alibaba Cloud Big Data AI Platform
Apr 17, 2023 · Operations

How to Break Through Scale‑Out Ops Bottlenecks in the Cloud‑Native Era

This article analyzes the three main bottlenecks—stability, cost, and efficiency—encountered in large‑scale operations, presents a six‑stage pipeline and open‑source toolchain, and explains how cloud‑native technologies such as Kubernetes and AIOps can transform and automate massive infrastructure management.

Scalability · aiops · cloud-native
0 likes · 18 min read
Efficient Ops
Apr 16, 2023 · Cloud Native

Mastering Kubernetes Probes: Liveness, Readiness, and Startup Explained

This article explains why Kubernetes health probes are essential, describes the three probe types (liveness, readiness, startup), their checking methods, configuration options, provides complete YAML examples, demonstrates testing scenarios, and outlines additional mechanisms that ensure container availability in a cloud‑native environment.

Container · Probes · Startup
0 likes · 14 min read
System Architect Go
Apr 16, 2023 · Cloud Native

Understanding and Implementing Kubernetes Admission Controllers with a Sidecar Injection Example

This article explains the purpose and phases of Kubernetes Admission Controllers, outlines their security, governance, and configuration management benefits, and provides a step‑by‑step guide—including TLS certificate creation, a Go HTTPS webhook server, and MutatingWebhookConfiguration YAML—to inject a sidecar container into pods.

AdmissionController · SidecarInjection · TLS
0 likes · 11 min read
Wukong Talks Architecture
Apr 16, 2023 · Cloud Native

Bosideng’s Cloud‑Native Transformation: Containerization, Microservices, and Full‑Link Traffic Governance

The article details Bosideng’s multi‑year digital transformation, describing how the apparel company migrated its legacy systems to cloud‑native architectures using Kubernetes, containerization, unified microservices, and Alibaba Cloud MSE to achieve zero‑loss deployment, traffic governance, and accelerated business innovation.

Containerization · Digital Transformation · MSE
0 likes · 27 min read
360 Quality & Efficiency
Apr 14, 2023 · Cloud Native

Ensuring Zero‑Downtime Rolling Updates in Kubernetes: Causes and Solutions

This article analyzes why Kubernetes rolling updates can still cause service interruptions during pod startup and termination, explains the underlying mechanisms of Kubelet and Endpoint Controller, and provides practical steps such as readiness probes and preStop hooks to achieve smoother, near‑zero‑downtime deployments.

Readiness Probe · Zero Downtime · kubernetes
0 likes · 7 min read
Weimob Technology Center
Apr 13, 2023 · Backend Development

How Weimob Boosted API Performance with APISIX: A Deep Dive

This article details Weimob's migration to APISIX, covering background, performance requirements, benchmark results, architectural analysis, Kubernetes deployment, custom plugin extensions for authentication and rate limiting, remaining challenges, and overall conclusions about the gateway's impact.

APISIX · Lua · Performance Optimization
0 likes · 14 min read
Efficient Ops
Apr 12, 2023 · Operations

Building Highly Available Prometheus Monitoring with Thanos: A Practical Guide

This article explains why native Prometheus HA solutions fall short for large, multi‑region clusters and shows how to use Thanos components—including sidecar, query, store gateway, and compactor—to achieve long‑term storage, unlimited scaling, a global view, and non‑intrusive integration with existing Prometheus deployments.

Observability · Prometheus · Thanos
0 likes · 22 min read
Cloud Native Technology Community
Apr 12, 2023 · Cloud Native

Kubernetes v1.27 Release Highlights: New Features, Enhancements, and Deprecations

Kubernetes v1.27, the first 2023 release, introduces 60 enhancements—including image registry migration, SeccompDefault stabilization, Job mutable scheduling GA, DownwardAPIHugePages GA, and numerous beta-to-stable upgrades—while also deprecating several legacy features and providing links for full changelog and download.

Release Notes · kubernetes · v1.27
0 likes · 12 min read
Full-Stack DevOps & Kubernetes
Apr 11, 2023 · Cloud Native

Master Kubernetes Basics: Deploy, Scale, and Update Apps with Simple Commands

This article introduces Kubernetes as an open‑source container orchestration platform, explains its core objects like Pods, Services, ReplicaSets, and Deployments, clarifies its relationship with Docker, and provides a step‑by‑step example covering deployment, exposure, scaling, rolling updates, and rollback using kubectl commands.

DevOps · Scaling · Service
0 likes · 5 min read
Alibaba Cloud Native
Apr 10, 2023 · Cloud Native

How CNStack Enables Full Lifecycle Management of Cloud Services and Components

This article provides a detailed overview of CNStack 2.0, explaining its cloud‑service and cloud‑component model, the cn‑app‑operator lifecycle controller, Sealer‑based build/share/run workflow, and the ability‑center white‑screen management that together simplify multi‑cluster cloud‑native application delivery.

CNStack · Multi-Cluster · Operator
0 likes · 12 min read
New Oriental Technology
Apr 7, 2023 · Cloud Native

Capo Project: Cloud‑Native Network Coordination Service – Deployment, Configuration, Testing, and CI/CD Guide

This article provides a comprehensive guide to the open‑source Capo cloud‑native network coordination service, covering its architecture, three deployment methods (Helm, Kustomize, plain YAML), detailed configuration parameters, observability setup, static code analysis with golangci‑lint, extensive unit and e2e testing using Kind, Helm chart packaging, registry publishing, and a full GitHub Actions CI/CD workflow.

Go · ci/cd · cloud-native
0 likes · 26 min read
Bitu Technology
Apr 7, 2023 · Cloud Native

Managing Kubernetes Resource Manifests with Kustomize: Aggregation, Overlays, and Components

This article explains how Tubi’s engineering team uses Kustomize to simplify and scale Kubernetes Resource Manifest management by aggregating resources, applying patches, organizing bases and overlays, and leveraging reusable components to reduce duplication and improve maintainability across clusters and namespaces.

Component · Kustomize · Overlay
0 likes · 15 min read
Alibaba Cloud Native
Apr 6, 2023 · Cloud Native

How KubeVela Implements the Open Application Model for Cloud‑Native Platform Engineering

An overview of the CNCF’s platform engineering whitepaper highlights how KubeVela adopts the Open Application Model (OAM) to bridge developers and infrastructure, detailing components, traits, CUE templating, workflow, and management features, with practical examples and future directions in cloud‑native application delivery.

CUE · KubeVela · OAM
0 likes · 14 min read
Efficient Ops
Apr 3, 2023 · Cloud Native

How to Secure Multi‑Tenant Kubernetes Clusters: Best Practices & Architecture

This article explains the concept of multi‑tenant Kubernetes clusters, outlines common enterprise scenarios, and details native security mechanisms such as RBAC, NetworkPolicy, PodSecurityPolicy, OPA, resource quotas, and dedicated nodes to achieve effective isolation and protect sensitive data.

NetworkPolicy · RBAC · cloud-native
0 likes · 12 min read
System Architect Go
Apr 3, 2023 · Cloud Native

Why Cilium Beats Flannel: Real‑World Kubernetes Networking Insights

The article analyzes how Cilium’s eBPF‑based architecture, advanced network policies, cluster‑wide traffic control, and observability tools like Hubble solved performance, security, and scalability challenges that Flannel and kube‑proxy could not meet in production Kubernetes environments.

CNI · Cilium · NetworkPolicy
0 likes · 12 min read
System Architect Go
Mar 31, 2023 · Cloud Native

Understanding CPU Requests and Limits in Kubernetes

This article explains how Kubernetes uses CPU requests and limits to schedule pods, allocate CPU proportionally, calculate minimal request units, and provides practical guidelines for setting appropriate request and limit values based on workload characteristics and monitoring data.
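
As a rough illustration of the proportional allocation mentioned above, the sketch below converts CPU requests to shares and divides a fully contended node among containers. It assumes the cgroup v1 `cpu.shares` conversion (1024 shares per core, minimum 2), ignores limits, and uses hypothetical container names and node size; it is not taken from the article itself.

```python
def millicores_to_shares(request_millicores: int) -> int:
    """Convert a CPU request in millicores to cgroup v1 cpu.shares.

    One full core (1000m) maps to 1024 shares, with a floor of 2.
    """
    return max(2, request_millicores * 1024 // 1000)


def contended_cpu_split(requests: dict[str, int], node_millicores: int) -> dict[str, float]:
    """Split a node's CPU among containers in proportion to their shares.

    Assumes every container is CPU-bound at the same time and no limit applies,
    which is the worst case that requests are meant to protect against.
    """
    shares = {name: millicores_to_shares(m) for name, m in requests.items()}
    total_shares = sum(shares.values())
    return {name: node_millicores * s / total_shares for name, s in shares.items()}


# Hypothetical containers on a hypothetical 2-core (2000m) node.
split = contended_cpu_split({"api": 500, "worker": 1500}, node_millicores=2000)
print(split)
```

When the node is not contended, a container can use more than this split, up to its limit if one is set; the calculation only describes the guaranteed floor under full load.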

Limits · kubernetes · requests
0 likes · 6 min read
21CTO
Mar 31, 2023 · Backend Development

Boost Go Performance: 6 Proven Techniques for Faster, Leaner Apps

This article presents six practical Go performance optimizations—including GOMAXPROCS tuning for Kubernetes, struct field ordering, garbage‑collection limits, zero‑copy unsafe conversions, jsoniter usage, and sync.Pool pooling—that together can dramatically lower CPU, memory, and latency in production services.

Garbage Collection · Go · Memory Optimization
0 likes · 9 min read
DataFunSummit
Mar 30, 2023 · Artificial Intelligence

An Overview of ChatGPT’s Software Architecture and Technology Stack

The article examines ChatGPT’s underlying software architecture, detailing its cloud deployment on AWS and Azure, database choices like PostgreSQL and Redis, front‑end technologies such as TypeScript and React, core AI frameworks including PyTorch and Triton, as well as its container orchestration, monitoring, and programming language ecosystem.

AI architecture · ChatGPT · Databases
0 likes · 6 min read
Cloud Native Technology Community
Mar 30, 2023 · Cloud Native

Kubernetes List/Watch, Informer Mechanism, and Writing Controllers for Pods and Custom Resources

This article explains how Kubernetes uses the List/Watch API and the Informer client library to monitor resources, compares direct HTTP Watch with Informer, provides Go code examples for pod controllers, shared informers, custom CRD controllers, and introduces higher‑level frameworks such as controller‑runtime and Kubebuilder.

CloudNative · Controller · CustomResource
0 likes · 49 min read
Cloud Native Technology Community
Mar 29, 2023 · Cloud Native

Kubernetes v1.27 Deprecations, API Removals, and Feature Gate Changes

Version 1.27 of Kubernetes introduces numerous deprecations and removals, including the migration of k8s.gcr.io to registry.k8s.io, the elimination of several API versions and feature gates such as CSIStorageCapacity, seccomp annotations, and various volume expansion options, with guidance for maintainers on required updates.

API Removal · Feature Gates · deprecation
0 likes · 12 min read
Cloud Native Technology Community
Mar 28, 2023 · Cloud Native

How to Set Up Multi‑Cluster Networking with Kube‑OVN OVN‑IC

This guide explains how to enable cross‑cluster pod communication in Kubernetes using Kube‑OVN's OVN‑IC feature, covering prerequisites, single‑node and high‑availability database deployment, automatic and manual route configuration, and cleanup procedures with concrete Docker/Containerd commands and ConfigMap examples.

Kube-OVN · Multi-Cluster · OVN-IC
0 likes · 15 min read
System Architect Go
Mar 27, 2023 · Cloud Native

Understanding Kubernetes Endpoint Propagation and Graceful Pod Deletion

Deleting a pod triggers endpoint removal, but various components like kube-proxy, CoreDNS, and ingress controllers may still route traffic until the endpoint fully propagates, so you must wait or use preStop hooks to delay deletion and handle SIGTERM gracefully within the configurable shutdown period.
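
One way to picture the application side of this is the sketch below: after SIGTERM arrives, the process keeps serving for a short drain window so that requests still routed to it during endpoint propagation are handled, and only then exits. The function names and the drain duration are hypothetical, and a real deployment would typically pair this with a preStop hook and a suitable termination grace period.

```python
import signal
import threading
import time

shutdown_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record that SIGTERM arrived; the serving loop decides when to stop."""
    shutdown_requested.set()


def install_handler():
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, handle_sigterm)


def serve_until_drained(handle_request, drain_seconds, poll_seconds=0.01):
    """Serve requests; after SIGTERM keep serving for drain_seconds, then return.

    Returns how many requests were handled after the signal, i.e. traffic that
    would have failed if the process had exited immediately.
    """
    handled_after_signal = 0
    deadline = None
    while True:
        if shutdown_requested.is_set() and deadline is None:
            deadline = time.monotonic() + drain_seconds
        if deadline is not None and time.monotonic() >= deadline:
            return handled_after_signal
        handle_request()
        if deadline is not None:
            handled_after_signal += 1
        time.sleep(poll_seconds)
```

A caller would run `install_handler()` once at startup and then `serve_until_drained(...)` with a drain window shorter than the pod's termination grace period, so the process exits on its own before it is killed.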

Endpoint Propagation · Graceful Shutdown · Pod Deletion
0 likes · 5 min read
System Architect Go
Mar 23, 2023 · Cloud Native

Directly Accessing the Kubernetes API with curl and Custom Code

This article explains how to bypass kubectl and interact directly with the Kubernetes API using curl or any programming language, covering API discovery, request construction, resource listing, watching, and modifying objects, while illustrating concepts with JavaScript examples and shared informers.

API · cloud-native · curl
0 likes · 4 min read
ITPUB
Mar 23, 2023 · Cloud Native

Scaling Zhongtong Cloud: From Single‑Cluster to Multi‑Cluster Governance

Drawing from Yang Xiaofei’s SACC2022 talk, this article details Zhongtong Cloud’s two‑year journey from initial containerization to a multi‑cluster architecture, covering challenges, custom scheduler extensions, fixed‑IP handling, container crash‑site preservation, node rebalancing, application migration, cross‑cluster load balancing, and future plans for unified gateways.

Containerization · Multi-Cluster · cloud-native
0 likes · 13 min read
Huolala Tech
Mar 23, 2023 · Cloud Native

How Huolala Built a Cloud‑Native One‑Stop AI Platform on Kubernetes

Huolala’s Big Data Intelligent Platform team describes how they built a cloud‑native, one‑stop AI solution on Kubernetes, integrating Flink‑based feature engineering, a multi‑tenant Zeppelin notebook, GPU‑aware training, and a unified model‑serving platform, while addressing resource isolation, storage persistence, and cross‑cloud deployment.

AI platform · GPU scheduling · Model Serving
0 likes · 17 min read
System Architect Go
Mar 22, 2023 · Information Security

Understanding Anonymous Access in Kubernetes API Server and How to Disable It

The article explains how Kubernetes clusters can permit anonymous API access via the --anonymous-auth flag, describes the authentication‑authorization‑admission flow, shows common RBAC bindings that enable this access, discusses its prevalence, and provides practical steps to disable anonymous access in both self‑managed and managed clusters.

Anonymous Access · RBAC · kubernetes
0 likes · 7 min read
Efficient Ops
Mar 21, 2023 · Operations

How Hupu Scaled to Millions: Inside the Flex Auto‑Scaling Platform

This article details Hupu's massive sports‑traffic environment, the design and implementation of the Flex auto‑scaling platform, its architecture, core functions such as resource statistics, node and pod scaling, scenario scheduling, and the performance optimizations that enable rapid, cost‑effective scaling across multi‑cloud Kubernetes clusters.

Auto Scaling · Performance Optimization · cloud-native operations
0 likes · 15 min read
Alibaba Cloud Native
Mar 21, 2023 · Cloud Native

How OpenYurt Enables Edge Autonomy on Unstable Networks

This article explains how OpenYurt extends Kubernetes to handle edge scenarios with unreliable or disconnected networks by introducing YurtHub caching, a centralized heartbeat proxy, and node‑binding mechanisms that keep workloads running and avoid unwanted pod eviction.

OpenYurt · YurtHub · cloud-native
0 likes · 10 min read
System Architect Go
Mar 21, 2023 · Cloud Native

Understanding and Using Kubernetes Volume Snapshots

This article explains the concepts, architecture, configuration, and practical use cases of Kubernetes volume snapshots, including how to define snapshot classes, create snapshots, clone PVCs, and perform consistent backups across different storage providers and clusters.

CSI · CloudNative · VolumeSnapshot
0 likes · 11 min read
System Architect Go
Mar 20, 2023 · Cloud Native

Secure Kubernetes Secrets: Comparing Sealed Secrets, External Secrets Operator, and CSI Driver

This article explains why native Kubernetes Secrets are insufficiently protected, introduces three open‑source solutions—Sealed Secrets, External Secrets Operator, and Secrets Store CSI Driver—covers their architecture, installation steps, usage examples, advantages, drawbacks, and provides practical code snippets for managing secrets safely in Git‑backed clusters.

CSI Driver · External Secrets Operator · Sealed Secrets
0 likes · 20 min read
Architecture Digest
Mar 20, 2023 · Cloud Native

Kubernetes: What It Is and Why It’s Hard to Get Started

This article provides a concise, question‑and‑answer overview of Kubernetes, explaining its role as a distributed container‑orchestration system, the architecture of master and worker nodes, core components such as etcd, kube‑apiserver, scheduler, controllers, and how services, pods, labels, and scaling operate within a cluster.

Cluster Management · Controllers · Pods
0 likes · 8 min read
Efficient Ops
Mar 19, 2023 · Cloud Native

Master Real-Time Multi-Pod Logging in Kubernetes with Kubetail & Stern

This guide introduces two lightweight Kubernetes log‑tailing utilities, Kubetail and Stern, explaining their installation on various platforms, core command‑line options, and practical usage examples for aggregating and color‑coding logs from multiple pods and containers, offering a simpler alternative to heavyweight logging stacks.

CLI · kubernetes · kubetail
0 likes · 10 min read
ITPUB
Mar 16, 2023 · Cloud Native

How Kindling Leverages eBPF for Minute‑Level Fault Diagnosis in Cloud‑Native Environments

The interview with Kindling founder Cheng Chan explores how eBPF‑based Kindling tackles the overwhelming metrics, high expertise barrier, and lack of real‑time protocol parsing in cloud‑native observability, detailing its probe architecture, protocol analysis, and roadmap for faster, standardized root‑cause detection.

Kindling · Trace Profiling · cloud-native
0 likes · 13 min read
Alibaba Cloud Native
Mar 16, 2023 · Cloud Native

How Koordinator Supercharges ACK Container Scheduling and Resource Efficiency

Koordinator, an open‑source cloud‑native scheduler from Alibaba, enhances container performance and reduces cluster costs by introducing mixed‑workload placement, resource profiling, load‑aware scheduling, and differentiated SLO mixing, now fully integrated into Alibaba Cloud ACK with a new v1.1.1‑ack.1 release.

ACK · Koordinator · cloud-native
0 likes · 10 min read
Hulu Beijing
Mar 16, 2023 · Artificial Intelligence

Inside Hulu’s Distributed Training Platform: Architecture, Challenges, and Solutions

This article explores Hulu’s five‑year‑old machine‑learning training platform, detailing its three‑layer architecture, the shift from single‑node to distributed training, and the technical solutions—including parameter servers, Ring AllReduce, Kubernetes, Volcano, and Horovod—that enable scalable AI workloads across GPU, CPU, and storage resources.

AI Infrastructure · Hulu · Machine Learning Platform
0 likes · 13 min read
Cloud Native Technology Community
Mar 16, 2023 · Cloud Native

How Intel and F5 Enabled Dual‑Stack Support in Istio 1.17

This article details the collaborative effort between Intel and F5 to redesign and implement dual‑stack networking support in Istio 1.17, covering the background challenges, new RFC design, key Envoy changes, step‑by‑step experimental setup, listener and endpoint modifications, and ways for the community to contribute.

Dual-Stack · Envoy · Istio
0 likes · 11 min read
JD Cloud Developers
Mar 15, 2023 · Operations

Designing Seamless Offline Delivery for Private Cloud Environments

This article outlines a general, process‑focused approach to offline delivery in private or dedicated cloud environments, covering the need for internal mirrors, plug‑in architecture, dependency awareness, full automation, and best‑practice process design to reduce SRE effort and ensure consistent production.

Operations · automation · kubernetes
0 likes · 5 min read
Tencent Cloud Middleware
Mar 14, 2023 · Cloud Native

How a Logistics SaaS Company Scaled to Millions Using Cloud‑Native Microservices

This article examines how the Chinese logistics SaaS firm HaiGuanJia leveraged cloud‑native technologies—Kubernetes, service mesh, and microservice frameworks—to overcome rapid user growth, improve development efficiency, enable gray releases, and smoothly migrate legacy systems while maintaining stability and agility.

Logistics · SaaS · cloud-native
0 likes · 16 min read
DevOps Cloud Academy
Mar 14, 2023 · Cloud Native

Kustomize Tutorial: Managing Kubernetes Manifests Without Helm

This article introduces Kustomize, a Kubernetes‑native configuration tool that can be used in place of Helm, explains its declarative philosophy, and provides step‑by‑step examples for building base resources, creating overlays, applying patches, generating secrets, and updating images using simple command‑line operations.

DevOps · Kustomize · cloud-native
0 likes · 13 min read
New Oriental Technology
Mar 10, 2023 · Cloud Native

Middleware PaaS on Kubernetes: Architecture, Benefits, and IP Reservation Challenges

This article explains how the New Oriental architecture team migrated middleware services like Redis, Kafka, and RocketMQ to Kubernetes, detailing the benefits over traditional PaaS, the Capo IP reservation solution for network stability, and the resulting operational, observability, and resource utilization improvements.

Observability · PaaS · cloud-native
0 likes · 18 min read
Alibaba Cloud Native
Mar 10, 2023 · Cloud Native

Uncovering the Root Causes of ACK Cluster Network Latency: kubelet, softirq, and cgroup Insights

A detailed post‑mortem explains how excessive cgroup files, kubelet's sys‑CPU usage, soft‑interrupt scheduling delays, and a buggy page‑free routine caused intermittent network latency of hundreds of milliseconds in an Alibaba Cloud ACK cluster, and how targeted CPU binding and kernel patches resolved the issue.

Network Latency · cgroup · cloud-native
0 likes · 14 min read
Alibaba Cloud Developer
Mar 10, 2023 · Cloud Native

How KubeVela Enables Full‑Stack Declarative Observability for Cloud‑Native Apps

This article explores KubeVela’s full‑stack declarative observability framework, detailing cloud‑native monitoring challenges, the Prism Aggregated API approach, multi‑cluster configurations, and out‑of‑the‑box addons that let developers and platform engineers seamlessly integrate, customize, and scale metrics, logs, and dashboards across heterogeneous environments.

Declarative · KubeVela · Multi-Cluster
0 likes · 21 min read
Huolala Tech
Mar 9, 2023 · Cloud Native

How SHANGFU Transforms Prometheus Management for Scalable Cloud‑Native Monitoring

This article explains Prometheus fundamentals, compares long‑term storage options, details Huolala's challenges with multiple Prometheus clusters, and introduces SHANGFU—a three‑module system that streamlines configuration, collection, and query handling to boost observability, performance, and reliability in cloud‑native environments.

Prometheus · cloud-native · kubernetes
0 likes · 15 min read
Alibaba Cloud Native
Mar 8, 2023 · Cloud Native

How OpenYurt v1.2 Simplifies Edge Kubernetes Installation in Five Steps

OpenYurt v1.2.0 streamlines edge‑native Kubernetes deployment by removing the need to modify native clusters, cutting the installation process from ten steps to five, and enabling seamless Prometheus monitoring through the new Raven VPN component, while outlining future Helm‑based simplifications.

Installation · OpenYurt · Prometheus
0 likes · 6 min read
政采云技术
Mar 7, 2023 · Cloud Native

Zero‑Base Automated Deployment Using Docker, Jenkins, and GitLab CI

This tutorial walks you through building a complete automated deployment pipeline from scratch, covering project setup on GitHub, Dockerized Tomcat and Jenkins containers, GitLab CI vs Jenkins comparison, Jenkins job configuration, webhook triggers, and shell scripting for continuous integration and delivery.

DevOps · Docker · GitHub
0 likes · 11 min read
Ops Development Stories
Mar 6, 2023 · Databases

How to Deploy and Use Bytebase for Database CI/CD on Kubernetes

This guide explains why traditional DBA tasks are tedious, introduces Bytebase as a reliable database CI/CD platform, and provides step‑by‑step instructions for deploying Bytebase and PostgreSQL on Kubernetes, configuring GitLab integration, managing users, instances, projects, environments, and performing schema changes and data operations.

Bytebase · Database CI/CD · DevOps
0 likes · 11 min read
DevOps Cloud Academy
Mar 6, 2023 · Cloud Native

In‑Place Pod Vertical Scaling: Mutable Resource Requests and Limits in Kubernetes

This proposal introduces in‑place vertical scaling for Pods by making PodSpec resources mutable, extending PodStatus with ResourcesAllocated, adding ResizePolicy and Resize fields, and updating the CRI UpdateContainerResources and ContainerStatus APIs to support live CPU and memory adjustments without restarting containers.

CRI · Pod · Vertical Scaling
0 likes · 22 min read
Ops Development Stories
Mar 3, 2023 · Cloud Native

Integrating Gitee with Zadig for Seamless Microservice CI/CD

This guide walks you through adding a Gitee code source to Zadig, configuring a microservice-demo project with Vue.js frontend and Golang backend, setting up services, builds, environments, workflows, and automatic triggers to achieve end‑to‑end continuous delivery on Kubernetes.

Gitee · Zadig · ci/cd
0 likes · 8 min read
Alibaba Cloud Native
Mar 2, 2023 · Cloud Native

Master Multi‑Cluster GitOps with ACK One and ArgoCD – A Step‑by‑Step Guide

This guide walks you through using ACK One’s GitOps capabilities to manage multi‑cluster Kubernetes deployments with ArgoCD, covering prerequisites, CLI commands, console operations, application version upgrades, rollbacks, user‑permission management, ApplicationSet for multi‑cluster scaling, and Image Updater integration for end‑to‑end CI/CD automation.

ACK One · ArgoCD · GitOps
0 likes · 18 min read
dbaplus Community
Feb 28, 2023 · Operations

How Container SRE at DeWu Boosts Reliability: Practices, Metrics, and Incident Playbooks

This article details DeWu's container SRE approach, covering SRE fundamentals, on‑call response, SLO/SLA design, change management, capacity planning, kernel‑parameter monitoring, security safeguards, and a real‑world incident analysis, providing actionable insights for building resilient cloud‑native services.

CapacityPlanning · IncidentResponse · Reliability
0 likes · 24 min read
ByteDance SYS Tech
Feb 28, 2023 · Cloud Native

How ByteDance’s ARES Boosts Cloud‑Native Resilience with Chaos Engineering

This article explains ByteDance’s end‑to‑end chaos engineering practice for cloud‑native environments, covering its background, principles, comparison with traditional testing, the evolution of its internal platforms, and a detailed look at the Application Resilience Enhancement Service (ARES) and its core features.

Fault Injection · Observability · Resilience
0 likes · 17 min read