Tagged articles
4063 articles
Yum! Tech Team
Jan 19, 2024 · Cloud Native

Lossless Scaling Strategies for High‑Concurrency Microservices

This article examines lossless scaling techniques for high‑concurrency microservice architectures, detailing the challenges of expansion and contraction, early scaling approaches, and advanced optimizations such as delayed registration, readiness probes, eager‑load Ribbon, cache preloading, health‑check strategies, and asynchronous consumer handling to ensure high availability, performance, and cost efficiency.

cloud-native · kubernetes · lossless scaling
0 likes · 16 min read
JavaEdge
Jan 18, 2024 · Databases

Master RedisInsight: Visualize, Manage, and Optimize Redis Seamlessly

This guide introduces RedisInsight, a powerful GUI for Redis, outlines its key features such as cluster support, visual data browsing, built‑in CLI, stream and log analysis, and provides step‑by‑step installation instructions for Linux, Kubernetes, and macOS, plus basic usage tips.

Database Management · GUI · Redis
0 likes · 10 min read
Linux Code Review Hub
Jan 18, 2024 · Cloud Native

How to Build Unified Observability for Apache APISIX with DeepFlow

This article walks through deploying Apache APISIX and DeepFlow in a Kubernetes cluster, configuring eBPF‑based AutoTracing and OpenTelemetry integration, enabling Prometheus metrics, accessing logs and continuous profiling, and visualizing unified observability data via Grafana dashboards.

APISIX · DeepFlow · Prometheus
0 likes · 16 min read
Java Backend Technology
Jan 18, 2024 · Databases

Master RedisInsight: Install, Configure, and Use the Ultimate Redis GUI

This guide introduces RedisInsight, outlines its key features, provides step‑by‑step installation on a physical server and via Kubernetes, explains environment configuration and startup, and demonstrates basic usage for monitoring and managing Redis instances through its graphical interface.

Database Management · GUI · Installation
0 likes · 7 min read
dbaplus Community
Jan 16, 2024 · Cloud Native

How to Achieve Zero‑Downtime Service Deployment with Spring Cloud and Kubernetes

This article examines why most incidents occur during application rollout, analyzes the Kubernetes pod lifecycle for both startup and shutdown, identifies common zero‑downtime challenges, and presents concrete strategies—including active notifications, adaptive waiting, delayed registration, and readiness probes—to ensure lossless service upgrades and rollbacks.

Zero Downtime · kubernetes · service registry
0 likes · 11 min read
Alibaba Cloud Native
Jan 16, 2024 · Cloud Native

What’s New in Koordinator v1.4.0? A Deep Dive into Mixed‑Workload Scheduling and Resource Optimizations

Koordinator v1.4.0 introduces mixed K8s/YARN workloads, NUMA‑aware scheduling, CPU‑normalization, enhanced ElasticQuota with tree structures and non‑preemptible pods, cold‑memory reporting, QoS for non‑containerized applications, and a suite of bug‑fixes and performance improvements for enterprise Kubernetes clusters.

CPU normalization · ElasticQuota · Koordinator
0 likes · 24 min read
Open Source Linux
Jan 15, 2024 · Cloud Native

Automate Kubernetes Local Storage and Backup with Carina and Velero

This guide explains why local storage remains essential in the cloud‑native era, outlines a step‑by‑step plan to set up a Kubernetes cluster, deploy Carina for automated local disk management, configure a test Nginx workload, install MinIO and Velero for backup, and finally perform backup and restore operations to verify data integrity.

Carina · Cloud Native Storage · Local Disk
0 likes · 30 min read
MaGe Linux Operations
Jan 14, 2024 · Operations

Mastering DevOps Architecture: From CI/CD to Real-World Success Stories

This comprehensive guide delves into DevOps architecture, explaining core concepts like continuous integration, delivery, and deployment, showcasing essential tools such as Jenkins, Docker, Kubernetes, and GitLab CI, and illustrating best practices and real‑world case studies from Netflix and Etsy to help teams accelerate, automate, and improve software delivery.

Best Practices · CI/CD · DevOps
0 likes · 20 min read
Alibaba Cloud Native
Jan 12, 2024 · Cloud Native

Unlock Second-Scale Elastic Scheduling with ACK Virtual Nodes

This article explains how to use Alibaba Cloud Container Service (ACK) virtual nodes and Elastic Container Instances (ECI) to achieve second‑scale elasticity, covering installation, ResourcePolicy configuration, zone‑aware scheduling, high‑availability setups, and performance results with concrete YAML examples.

ECI · ResourcePolicy · elastic scheduling
0 likes · 12 min read
Beike Product & Technology
Jan 12, 2024 · Information Security

Understanding High‑Risk Kubernetes RBAC Permissions and a Graph‑Based Risk Identification System

This article examines how misconfigured Kubernetes RBAC permissions can lead to privilege escalation across clusters, presents a graph‑based model to represent users, roles, and authorities, and provides code examples and Cypher queries for detecting and visualizing high‑risk permission paths.

RBAC · graph · kubernetes
0 likes · 16 min read
360 Smart Cloud
Jan 10, 2024 · Cloud Native

Mixed Workload Scheduling (Co-location) in Kubernetes: Challenges, Core Technologies, and Koordinator Enhancements

The article analyzes low CPU utilization in pure online Kubernetes clusters, introduces mixed‑workload (online + offline) scheduling to improve resource efficiency, explains core techniques, kernel QoS features, and details Koordinator‑based implementations such as node resource reservation and scheduling adjustments.

Koordinator · Mixed Workload · QoS
0 likes · 13 min read
Code Ape Tech Column
Jan 8, 2024 · Cloud Native

Comprehensive Guide to Using Ctrip's Apollo Distributed Configuration Center with Spring Boot

This article provides an in‑depth tutorial on Apollo, Ctrip's open‑source distributed configuration center, covering its concepts, architecture, four management dimensions, client design, deployment in Kubernetes, and step‑by‑step Spring Boot integration with code examples and configuration details.

Apollo · Configuration Center · Distributed Config
0 likes · 28 min read
Java Tech Enthusiast
Jan 6, 2024 · Databases

RedisInsight: Introduction, Installation, and Basic Usage

RedisInsight is a GUI for Redis that monitors memory, connections, hit rate, and uptime, and supports clusters, SSL/TLS, and key editing. It can be installed on Linux from a downloadable package configured through environment variables and launched as a service, or deployed on Kubernetes using a NodePort service and a Deployment of the redislabs/redisinsight image. Once running, the UI provides metrics, data editing, and memory analysis.

GUI · Installation · Redis
0 likes · 5 min read
DevOps Coach
Jan 5, 2024 · Cloud Native

10 Hard‑Earned Kubernetes Lessons Every Engineer Should Know

Drawing from three years of managing Kubernetes clusters across Azure and AWS, the author shares ten practical lessons covering cloud‑native deployment, infrastructure as code, Helm chart usage, service mesh decisions, resource limits, stateless design, HPA configuration, and strategies for regular upgrades, aimed at both newcomers and seasoned practitioners.

Best Practices · DevOps · cloud-native
0 likes · 7 min read
MaGe Linux Operations
Jan 4, 2024 · Cloud Native

How to Configure Kubernetes Pods to Use an HTTP Proxy

This guide explains why and how to set up HTTP/HTTPS proxy settings for Kubernetes pods in enterprise environments, covering use cases, two configuration methods (ConfigMap and direct environment variables), parameter details, testing procedures, and best practices for reliable outbound traffic.

ConfigMap · Proxy · cloud-native
0 likes · 6 min read
ByteDance Cloud Native
Jan 4, 2024 · Cloud Native

Mastering Cloud‑Native Cost Governance: FinOps Strategies for Kubernetes

This article explains how enterprises can leverage cloud‑native architectures and FinOps practices to gain financial accountability, visualize multi‑dimensional cost data, optimize resource usage, and implement systematic cost governance across Kubernetes environments, covering cost insight, optimization, and operational stages with practical recommendations and example algorithms.

FinOps · cost optimization · kubernetes
0 likes · 14 min read
Efficient Ops
Jan 3, 2024 · Cloud Native

Master Kubernetes Basics: Understanding Pods, Nodes, and Cluster Resources

This article provides a concise, practical guide to Kubernetes fundamentals, covering pod creation, the essential compute‑network‑storage resources, cluster components, native objects like Deployments and StatefulSets, and the trade‑offs of standardization, elasticity, and extensibility.

Cluster · DevOps · Pod
0 likes · 15 min read
Architecture Development Notes
Jan 3, 2024 · Cloud Native

Build a Kubernetes Cluster with kubeadm: Step‑by‑Step Guide

This guide walks you through preparing Linux machines, configuring system settings, installing Docker and Kubernetes components with kubeadm, initializing a master node, deploying a pod network, joining worker nodes, and verifying the cluster, providing a complete step‑by‑step tutorial for building a Kubernetes cluster.

Cluster Setup · Docker · Flannel
0 likes · 7 min read
MaGe Linux Operations
Jan 1, 2024 · Cloud Native

Install Kubernetes v1.18.8 on CentOS: Ingress, Dashboard, Helm Guide

This step‑by‑step tutorial shows how to set up a Kubernetes v1.18.8 cluster on CentOS 8.5 running in Hyper‑V, configure static IPs, unique host and machine IDs, install Docker, kubeadm, kubelet and kubectl, deploy flannel networking, the Kubernetes Dashboard, Metrics Server, Helm, and an NGINX Ingress controller, and includes troubleshooting tips.

CentOS · dashboard · helm
0 likes · 25 min read
System Architect Go
Dec 30, 2023 · Cloud Native

How External HTTP/HTTPS Requests Reach Containers in a Kubernetes Cluster

This article explains the end‑to‑end path that an external HTTP or HTTPS request follows—from the client through DNS resolution, load balancer, ingress controller, service routing, and finally to the target container inside a Kubernetes pod—while also covering optional variations and the underlying network components.

HTTP · Service · ingress
0 likes · 7 min read
Cloud Native Technology Community
Dec 29, 2023 · Cloud Native

Mastering Strimzi Kafka Operator: Architecture, Deployment & Tuning on K8s

This article provides an in‑depth analysis of the Strimzi Kafka Operator, covering its core features, multi‑layer architecture, detailed installation steps on Kubernetes, Kafka cluster creation, production/consumption workflows, and the internal reconciliation mechanisms that enable automated scaling, storage tuning, and fault‑recovery.

Kafka · Operator · Reconciliation
0 likes · 11 min read
Sohu Tech Products
Dec 27, 2023 · Big Data

Practical Implementation of Data Integration with Flink on Kubernetes at Li Auto

Li Auto built a cloud‑native data‑integration platform by deploying Flink on Kubernetes, unifying batch and streaming workloads with a storage layer (JuiceFS + BOS) and Flink Operator, enabling simple source‑sink pipelines, elastic scaling, automated checkpointing, and centralized monitoring while addressing earlier fragmentation and resource inefficiencies.

Big Data · Data Integration · Flink
0 likes · 11 min read
vivo Internet Technology
Dec 27, 2023 · Cloud Native

vivo Joins the Cloud Native Computing Foundation (CNCF): Cloud Native Adoption and Practice

vivo has officially joined the CNCF to accelerate its internal cloud‑native adoption—leveraging Kubernetes, Helm, Harbor and other open‑source tools to build a robust container platform, enhance business delivery, share knowledge with the global community, contribute to CNCF projects, and continue advancing micro‑services and containerization.

CNCF · Platform Engineering · Vivo
0 likes · 6 min read
DeWu Technology
Dec 27, 2023 · Cloud Native

DeWu's Cloud-Native Container Management Practices

Since August 2021, DeWu App has built a cloud‑native, multi‑cluster Kubernetes platform. It uses an OAM‑style CloneSet model, Helm‑generated resources, Karmada‑based federation, custom scheduler plugins for reservation and node balancing, offline mixing for Flink, a unified KubeAutoScaler, and a self‑built KubeAI stack. The platform has delivered significant cost cuts and improved stability, and further middleware containerization and multi‑cloud expansion are planned.

AI · Autoscaling · Cost Management
0 likes · 22 min read
Efficient Ops
Dec 26, 2023 · Cloud Native

Master kubectl: Essential Commands for Managing Kubernetes Clusters

This comprehensive guide covers kubectl basics, autocomplete setup, context configuration, creating, viewing, updating, patching, editing, scaling, and deleting resources, as well as interacting with pods, nodes, and using the kubectl set family for resources, selectors, and images.

DevOps · cloud-native · command-line
0 likes · 16 min read
System Architect Go
Dec 23, 2023 · Cloud Native

What Happens Inside Kubernetes When You Create a Deployment?

This article walks through the complete Kubernetes workflow from a user‑submitted Deployment request to the creation and scheduling of the resulting Pod, detailing the roles of the control‑plane components, node services, admission webhooks, and the various plugins involved.

Control Plane · Pod · Scheduler
0 likes · 7 min read
Full-Stack DevOps & Kubernetes
Dec 23, 2023 · Cloud Native

Essential Kubernetes Security Practices to Safeguard Production Clusters

Learn the critical Kubernetes security measures for production environments, including RBAC access control, network policies, secret management, continuous monitoring, patch updates, API server hardening, Kubelet protection, pod security policies, and container hardening techniques, each illustrated with practical YAML examples and command snippets.

ContainerHardening · NetworkPolicy · PodSecurityPolicy
0 likes · 10 min read
MaGe Linux Operations
Dec 22, 2023 · Operations

Evolving Jenkins Pipelines for Robust Deployments and Rolling Updates

This article walks through the evolution of a Jenkins pipeline to not only build and push Docker images but also to deploy to Kubernetes with health checks for Deployments and StatefulSets, incorporating rolling update strategies and comprehensive status verification to ensure reliable CI/CD workflows.

CI/CD · DevOps · Jenkins
0 likes · 16 min read
DataFunTalk
Dec 22, 2023 · Big Data

Practical Implementation of Flink on Kubernetes for Data Integration at Li Auto

This article details Li Auto's end‑to‑end data integration practice using Flink on Kubernetes, covering the evolution of their integration platform, architectural design, cloud‑native deployment, operational challenges, and future roadmap, while highlighting unified batch‑stream processing and resource elasticity.

Batch processing · Big Data · Data Integration
0 likes · 12 min read
360 Quality & Efficiency
Dec 22, 2023 · Cloud Native

Refactoring Test Environment Deployment with Kubernetes: Practices and Pipeline Integration

This article explores the challenges of test environment deployment in modern DevOps, explains why Kubernetes offers a natural solution, details its design principles and core objects, and presents practical patterns for integrating Kubernetes‑based environments into CI/CD pipelines to achieve high cohesion, low coupling, and scalable testing workflows.

DevOps · ci/cd · deployment
0 likes · 16 min read
MaGe Linux Operations
Dec 21, 2023 · Cloud Native

Why Does a Kubernetes Pod Stay in Terminating State for 30 Seconds?

This article explains why a Kubernetes pod may remain in the Terminating state for up to 30 seconds, detailing the graceful shutdown process, the role of terminationGracePeriodSeconds and preStop hooks, and how to modify startup scripts to ensure prompt pod deletion.

Init Container · Pod Termination · SIGTERM
0 likes · 6 min read
Open Source Linux
Dec 21, 2023 · Operations

Master Kubernetes Troubleshooting: 100 Essential kubectl Commands

This comprehensive guide presents 100 practical kubectl commands for diagnosing Kubernetes clusters, covering everything from cluster information and pod health checks to networking, storage, security, scaling, and advanced debugging tools, helping operators quickly identify and resolve issues.

Cluster Troubleshooting · commands · diagnostics
0 likes · 20 min read
vivo Internet Technology
Dec 20, 2023 · Cloud Native

Resource Overcommit Strategies in Vivo Container Platform: Static and Dynamic Approaches

Vivo’s container platform combats oversized resource requests in two stages. It first applies static, coefficient‑based overcommit at deployment, then uses a dynamic recommender that continuously gathers usage metrics, builds exponential histograms with a half‑life sliding‑window model, and adjusts CPU (and optionally memory) requests. The result is better packing efficiency, lower billing, and CPU utilization up to eight percent higher, with HPA accuracy maintained.

Resource Overcommit · dynamic overcommit · hpa
0 likes · 15 min read
AntTech
Dec 18, 2023 · Cloud Native

AlterShield Open‑Source Change Risk Control Platform: Architecture, Features, and Future Roadmap

AlterShield is an open‑source change‑risk prevention solution originally built by Ant Group that provides lifecycle‑aware change defense, cloud‑native operator integration, KDE‑based anomaly detection, and extensible plug‑in frameworks, with detailed module descriptions, recent v1.0 releases, and a roadmap for advanced monitoring and noise‑reduction capabilities.

SRE · change management · cloud-native
0 likes · 13 min read
dbaplus Community
Dec 17, 2023 · Operations

Why Kubernetes Needs an LTS Release: Balancing Stability and Speed

The article examines the rapid Kubernetes upgrade cycle, the operational challenges it creates for teams, argues for a long‑term support (LTS) version, weighs pros and cons, and proposes compromise solutions to improve cluster stability without sacrificing innovation.

Cluster Upgrade · LTS · Operations
0 likes · 10 min read
Architecture Digest
Dec 15, 2023 · Operations

Diagnosing High CPU and Frequent GC in a Java Container: A Step‑by‑Step Analysis

When a production container suddenly hit over 90% CPU with excessive JVM garbage collection, the author walks through entering the pod, using top and top -H to locate the offending thread, extracting its stack with jstack, and downloading the data via a simple HTTP server. The culprit turned out to be an Excel export routine that caused massive object allocation; fixing that code restored stability.

CPU · JVM · Java
0 likes · 6 min read
Efficient Ops
Dec 13, 2023 · Cloud Native

How to Build Your Own Kubernetes‑Style Container Orchestration System

This article walks through the evolution from a single‑machine Java monolith to a distributed, container‑based platform, detailing master‑worker roles, core Kubernetes‑like components, networking, scheduling, and plug‑ins for a complete cloud‑native orchestration solution.

cloud-native · container orchestration · etcd
0 likes · 8 min read
vivo Internet Technology
Dec 13, 2023 · Artificial Intelligence

Practice of Multi-NIC Container Network Acceleration for Offline Training

The talk explains how Vivo leverages a Kubernetes‑based solution that combines Calico and RoCEv2 to migrate offline training workloads from single‑NIC to multi‑NIC, integrating lossless RDMA, planning topology and IP allocation, and employing Volcano, SpiderPool, Macvlan, and Multus CNI for efficient container networking.

Multi-NIC · RDMA · cloud-native
0 likes · 4 min read
MaGe Linux Operations
Dec 13, 2023 · Cloud Native

Demystifying Kubernetes CRDs: Extending the Platform with Custom Resources

This article clarifies common misconceptions about Kubernetes CustomResourceDefinitions, explains the controller pattern, and demonstrates how CRDs enable custom controllers, versioned micro‑services, blue‑green deployments, and standardized management of application concepts within a Kubernetes cluster.

CRD · Controllers · CustomResourceDefinition
0 likes · 7 min read
DevOps Cloud Academy
Dec 11, 2023 · Operations

Managing Java Process Memory in Kubernetes Pods to Prevent OOMKilled

This article explains why Java processes in Kubernetes pods often encounter OOMKilled despite correct JVM heap settings, analyzes the discrepancy between JVM‑reported memory and container metrics, and provides practical steps such as adjusting MaxRAMPercentage and pod memory limits to stabilize memory usage.

JVM · Java · OOMKilled
0 likes · 9 min read
Alibaba Cloud Native
Dec 11, 2023 · Cloud Native

Boosting Cluster Resource Utilization with Alibaba Cloud Native Elastic Solutions

This article explains how Alibaba Cloud's native elastic solutions—covering application‑level scaling, resource‑level scaling, and the new instant elastic controller—help enterprises improve Kubernetes cluster resource utilization, reduce costs, and simplify operations through advanced metrics, custom scaling policies, and event‑driven node management.

ACK · Cluster Autoscaler · cloud-native
0 likes · 18 min read
IT Services Circle
Dec 11, 2023 · Fundamentals

JetBrains IntelliJ IDEA 2023.3 Release Highlights

The IntelliJ IDEA 2023.3 update introduces the fully launched AI Assistant, complete Java 21 support, enhanced debugging tools, a floating toolbar, out‑of‑the‑box Kubernetes integration, numerous UI improvements, faster Gradle and Maven imports, and a host of new framework and technology features.

AI Assistant · IDE · IntelliJ IDEA
0 likes · 7 min read
Cloud Native Technology Community
Dec 11, 2023 · Cloud Native

Key New Features in Kubernetes v1.29: CEL‑based CRD Validation, NodePort Allocation, Sidecar Containers, PreStop Hook, Service Account Token Binding, and More

Kubernetes v1.29 introduces 49 major updates including GA of CEL‑based CRD validation, a stable static‑dynamic NodePort range, default‑enabled SidecarContainers, an Alpha PreStop sleep hook, tighter ServiceAccount token binding, GA resource metrics, component health SLIs, and several other GA features, all of which simplify cluster operation and improve security.

CEL · CRD · NodePort
0 likes · 17 min read
Efficient Ops
Dec 10, 2023 · Cloud Native

How to Build a Complete Kubernetes Monitoring Stack with Prometheus & Grafana

This guide walks through a full Kubernetes monitoring solution using cAdvisor, node_exporter, Prometheus, and Grafana, covering architecture, data collection, service discovery, deployment steps with DaemonSets, and detailed YAML configurations for a production‑ready observability stack.

Grafana · Prometheus · cAdvisor
0 likes · 6 min read
IT Services Circle
Dec 10, 2023 · Databases

Should Production Databases Be Deployed in Docker/Kubernetes? A Critical Analysis

The article critically examines the drawbacks of running production databases inside Docker containers or Kubernetes, arguing that while containers excel for stateless services, they introduce reliability, performance, maintenance, and complexity challenges that make them unsuitable for critical stateful database workloads.

Containers · Database · Docker
0 likes · 20 min read
Alibaba Cloud Native
Dec 8, 2023 · Cloud Native

How Fluid’s Cloud‑Native Caching Supercharges AIGC Model Inference

The article examines the cost, performance, and efficiency challenges of large‑model inference, explains why Kubernetes is becoming the standard platform for AI workloads, and details how the Fluid project provides cloud‑native caching, elastic scaling, and automation to dramatically reduce startup latency and operating expenses.

AI · AIGC · Caching
0 likes · 17 min read
Volcano Engine Developer Services
Dec 6, 2023 · Cloud Native

Mastering Kubernetes Cluster Autoscaler: Real‑World Challenges & Solutions

This article explores how Volcano Engine's VKE leverages Kubernetes Cluster Autoscaler to achieve elastic scaling, detailing the component's core functions, a customer’s high‑throughput workload, four major scaling problems encountered, and practical recommendations to improve performance, reliability, and cost efficiency.

Cluster Autoscaler · Performance Optimization · kubernetes
0 likes · 18 min read
Su San Talks Tech
Dec 6, 2023 · Operations

What Went Wrong in Didi’s 12‑Hour Outage? Lessons on Kubernetes Upgrades and Cost‑Cutting

An in‑depth review of Didi’s 12‑hour P0 outage reveals how a mistaken Kubernetes version downgrade during an in‑place upgrade caused master node failure, discusses cluster isolation, upgrade strategies, and the role of cost‑cutting pressures, offering practical lessons for large‑scale operations.

Cluster Upgrade · Cost Management · Operations
0 likes · 7 min read
ITPUB
Dec 5, 2023 · Cloud Native

Prevent Massive K8s Outages: Scale, Redundancy, and Embrace Restarts

The article analyzes the November 27 Didi outage caused by an aggressive Kubernetes upgrade, then presents four engineering principles—controlling cluster size, eliminating single points of failure, treating restarts as normal, and decoupling data and control planes—to build more resilient cloud‑native systems.

Cluster Upgrade · Scalability · cloud-native
0 likes · 13 min read
Liangxu Linux
Dec 4, 2023 · Cloud Native

Running Business Containers as Non-Root: Practical Guide and Real-World Scripts

This article explains why running business containers without root privileges is essential for security, outlines the necessary background and risks, and provides detailed step‑by‑step methods, Dockerfile snippets, entrypoint scripts, and real‑world examples for MySQL, Redis, CoreDNS, Consul, and cAdvisor to achieve safe non‑root container deployments.

CoreDNS · ENTRYPOINT · Non-root
0 likes · 16 min read
Efficient Ops
Dec 4, 2023 · Cloud Native

How Does a Kubernetes Pod Get Created? Step‑by‑Step Walkthrough

This article walks through the complete Kubernetes pod creation workflow, from submitting the YAML with kubectl to the API server, storing the definition in etcd, scheduling, kubelet orchestration, container runtime delegation, CNI networking, health probing, and endpoint setup for services.

CNI · Pod Lifecycle · Service endpoint
0 likes · 3 min read

Exploring Container Federation: Multi‑Cluster Management with FOOT V3.5

This article examines the challenges of managing multiple Kubernetes clusters, outlines key business pain points, reviews open‑source federation solutions, and details the FOOT V3.5 platform’s architecture—including hub‑cluster design, push/pull registration, application policies, APISIX gateway integration, and Ceph‑based distributed storage—while also looking ahead to AI, edge, and security trends.

APISIX · FOOT platform · container federation
0 likes · 18 min read
DataFunTalk
Nov 30, 2023 · Big Data

Big Data Cloud‑Native Trends and Challenges Highlighted at the 2023 Yunqi Conference

The 2023 Yunqi Conference in Hangzhou showcased the latest advances in cloud computing and big‑data technologies, examined the evolution from big‑data 1.0 to 3.0, discussed the key difficulties of making big data cloud‑native, and presented a practical case study of MiHoYo’s cloud‑native transformation.

Alibaba CloudBig DataData Lake
0 likes · 12 min read
DevOps Cloud Academy
Nov 27, 2023 · Operations

Implementing a DevSecOps CI/CD Pipeline for Multi‑Language Applications with Jenkins

This article walks through building a comprehensive DevSecOps CI/CD pipeline in Jenkins that integrates source control, static analysis, vulnerability scanning, multi‑language builds, Docker image creation, Trivy security checks, Kubernetes deployment, and ZAP DAST testing to securely deliver applications across various runtimes.

DevSecOps · Docker · Jenkins
0 likes · 18 min read
Xiaohongshu Tech REDtech
Nov 27, 2023 · Cloud Native

Mixed-Workload Scheduling and Resource Utilization Optimization in Xiaohongshu's Cloud-Native Platform

Xiaohongshu's cloud‑native platform adopted a four‑stage mixed‑workload scheduling strategy: reusing idle nodes, whole‑machine time‑sharing, normal mixed pools, and a unified scheduler (Tusker) that coordinates CPU, GPU, and memory across Kubernetes and YARN. The approach raised average cluster CPU utilization from under 20% to over 45% and delivered millions of low‑cost core‑hours while preserving QoS for latency‑sensitive, mid, and batch jobs.

Big Data · QoS · cloud-native
0 likes · 19 min read
Alibaba Cloud Native
Nov 22, 2023 · Cloud Native

Build a Sidecarless AI Application with Alibaba Cloud Service Mesh ASM – Step‑by‑Step Guide

This guide walks you through creating a sidecarless AI demo on Alibaba Cloud Service Mesh ASM, covering environment setup, multi‑model serving with KServe, PVC storage, InferenceService configuration, business service deployment, gateway and waypoint creation, traffic routing rules, and OIDC single sign‑on integration.

AI · ASM · KServe
0 likes · 28 min read
StarRocks
Nov 22, 2023 · Big Data

How StarRocks’ Compute‑Storage Separation Cut Costs 46% and Boosted Performance

This article details a Chinese tech company's migration of its internal big‑data analytics platform to StarRocks’ compute‑storage separation architecture, describing the original multi‑component setup, the pain points encountered, the evaluation methodology, performance and cost benchmarks, operational optimizations, migration steps, and future roadmap.

Big Data · Compute-Storage Separation · Cost Reduction
0 likes · 17 min read
Advanced AI Application Practice
Nov 22, 2023 · Cloud Native

Beyond Implementation Details: When Is Docker Really Slower Than Native Services?

The article examines whether Docker incurs performance penalties compared to native services, arguing that the decision to adopt containers and micro‑services must weigh business complexity, company stage, team size, infrastructure readiness, and operational costs rather than relying on generic hype.

Containerization · Docker · Technology Selection
0 likes · 7 min read
Efficient Ops
Nov 21, 2023 · Cloud Native

How to Diagnose and Fix Common Kubernetes Pod Startup Failures

This guide explains why Kubernetes pods may fail to start—covering resource overcommit, memory/CPU limits, network, storage, code, and configuration issues—and provides a step‑by‑step troubleshooting workflow including cluster health checks, event logs, pod status, network connectivity, storage verification, container logs, DNS resolution, and best‑practice tips.

DNS · Pod troubleshooting · cluster debugging
0 likes · 9 min read
Alibaba Cloud Native
Nov 21, 2023 · Cloud Native

Mastering FinOps in Cloud‑Native Container Environments: Real‑World Practices and Cost‑Saving Strategies

This article presents a detailed case study of a leading AI‑driven quant investment firm that leveraged Alibaba Cloud ACK's FinOps suite to tackle planning, allocation, management, and optimization challenges in Kubernetes‑based container workloads, achieving up to 25% cost reduction and significantly higher resource utilization.

Container · FinOps · cost optimization
0 likes · 20 min read
Ops Development Stories
Nov 20, 2023 · Operations

How eBPF Powers Next‑Gen Observability and Fault Diagnosis in Kubernetes

At KubeCon China 2023, experts Liu Kai and Dong Shandong presented a three‑part deep dive into Kubernetes observability challenges, demonstrating how eBPF enables comprehensive data collection across all stack layers, seamless integration, and intelligent root‑cause analysis through dimension attribution, anomaly bounding, and fault‑tree methods.

Fault Diagnosis · Observability · cloud-native
0 likes · 20 min read
Alibaba Cloud Native
Nov 20, 2023 · Cloud Native

How Alibaba Cloud ACK Guarantees Kubernetes Cluster Stability at Massive Scale

This article explains the stability challenges of large‑scale Kubernetes clusters, outlines ACK's high‑availability architecture and component optimizations, and details product features such as Prometheus, AIOps and managed node pools that together ensure reliable, performant cloud‑native workloads.

ACK · Cluster stability · high availability
0 likes · 16 min read