Tagged articles
3128 articles
Architecture Digest
Apr 29, 2025 · Cloud Native

Key Changes and New Features in Nacos 3.0 Release

Version 3.0 of Nacos introduces major updates, including JDK 17 and Spring Boot 3.4.1 support, enhanced Admin and Console APIs, default authentication, the AI‑focused Model Context Protocol (MCP), unified namespaces, beta distributed lock and fuzzy listening features, and native xDS protocol support, aimed at improving cloud‑native service discovery and configuration management.

Cloud Native · Nacos · service discovery
0 likes · 6 min read
BirdNest Tech Talk
Apr 29, 2025 · Cloud Native

How Docker Simplifies MCP Server Deployment for AI Agents

The article analyzes the challenges of manually deploying Model Context Protocol (MCP) servers for AI agents, compares them with Docker‑based deployment, and demonstrates step‑by‑step configurations, code snippets, and concrete benefits such as environment consistency, resource efficiency, and security.

AI agents · Cloud Native · Deployment
0 likes · 7 min read
macrozheng
Apr 28, 2025 · Cloud Native

Discover Nacos 3.0: AI‑Driven MCP, Distributed Locks, and Native xDS Support

Version 3.0 of Nacos upgrades to JDK 17 and Spring Boot 3.4.1, introduces AI‑focused MCP, enhanced Admin API with default authentication, unified namespaces, beta distributed lock and fuzzy listening features, and native xDS protocol support, while highlighting related open‑source SpringBoot + Vue e‑commerce projects.

AI · Cloud Native · Microservices
0 likes · 7 min read
Linux Ops Smart Journey
Apr 27, 2025 · Cloud Native

Deploy Jenkins on Kubernetes with Helm: A Step‑by‑Step Guide

This tutorial walks you through using Helm to download the Jenkins chart, pull and retag required Docker images, configure a custom values file, install Jenkins on a Kubernetes cluster, verify the deployment, and understand the benefits of Helm for streamlined CI/CD automation.

Cloud Native · Deployment
0 likes · 8 min read
Java Tech Enthusiast
Apr 27, 2025 · Cloud Native

Microsoft Forked an Open‑Source OCI Registry Project: Ethics, Licensing, and Community Impact

Microsoft’s unexpected fork of the open‑source OCI registry Spegel, originally created by developer Philip Laine, sparked debate over open‑source ethics and the limits of the MIT license, highlighting the challenges small maintainers face when corporations reuse code with minimal attribution and prompting calls for stronger licensing and recognition practices.

Cloud Native · Community · Ethics
0 likes · 11 min read
Java Architecture Diary
Apr 27, 2025 · Cloud Native

What’s New in Nacos 3.0? Key Features, AI Integration, and Cloud‑Native Enhancements

Nacos 3.0 introduces major upgrades—including JDK 17 and Spring Boot 3.4.1 support, a new Admin API, default authentication, AI‑focused MCP, unified namespaces, beta distributed lock and fuzzy listening features, plus native xDS protocol support—positioning it as a powerful cloud‑native service discovery and configuration platform.

AI integration · Cloud Native · Nacos
0 likes · 6 min read
Su San Talks Tech
Apr 27, 2025 · Backend Development

Mastering Microservices: Advantages, Challenges, and Essential Design Patterns

This article explains what microservices are, outlines their key advantages such as scalability and resilience, details the inherent challenges like complexity and security, and introduces essential design patterns—including Database‑Per‑Service, API Gateway, BFF, CQRS, Event Sourcing, Saga, Sidecar, Circuit Breaker, ACL, and Aggregator—to help architects build robust, maintainable systems.

Backend Architecture · Cloud Native · Microservices
0 likes · 23 min read
MaGe Linux Operations
Apr 25, 2025 · Cloud Native

Essential Docker Commands Cheat Sheet: Quick Reference for Developers

This comprehensive guide presents over twenty essential Docker CLI commands, covering image management, container lifecycle, registry operations, and system cleanup, with clear syntax examples and practical use‑case snippets to help developers and DevOps engineers work efficiently with containers.

CLI · Cloud Native · Containers
0 likes · 11 min read
Baidu Geek Talk
Apr 23, 2025 · Operations

Baidu SRE Digital Immunity System: Construction, Evolution, and Practice

Baidu’s SRE digital‑immune system, which has evolved into an AI‑powered intelligent immunity platform, quantifies and mitigates risk across thousands of services by integrating data‑driven monitoring, rule‑based detection, and large‑model GraphRAG knowledge mining, cutting degradation cases by ~40% and shifting operations from reactive troubleshooting to proactive, data‑centric quality assurance.

AI · Cloud Native · Digital Immunity
0 likes · 14 min read
Go Programming World
Apr 22, 2025 · Artificial Intelligence

Design and Implementation of an Enterprise‑Grade LLMOPS Platform (EasyAI)

This article presents a comprehensive overview of building an enterprise‑level LLMOPS platform—including concept definitions, the relationship between LLMOPS, MLOps and intelligent agent platforms, four development tiers, architecture layers, core technical concerns, deployment options, and the benefits of cloud‑native AI development.

AI Platform · Cloud Native · DevOps
0 likes · 15 min read
IT Xianyu
Apr 21, 2025 · Cloud Native

Step-by-Step Guide to Setting Up a Kubernetes 1.19 Cluster on CentOS 7.9

This guide walks through preparing two CentOS 7.9 servers, installing Docker and Kubernetes 1.19 components, initializing a master node, joining a worker node, and validating the cluster with a sample Nginx deployment, including common troubleshooting tips.

Calico · CentOS · Cloud Native
0 likes · 10 min read
Pan Zhi's Tech Notes
Apr 21, 2025 · Cloud Native

Build a Clean Microservice Config Center with Nacos in One Step

This article walks through using Nacos as a centralized configuration center for Spring Cloud microservices, showing how to create configuration data, set up a Maven client, enable dynamic refresh with @RefreshScope, and manage multi‑environment and multi‑file configurations.

@RefreshScope · Cloud Native · Configuration Center
0 likes · 16 min read
Baidu Intelligent Cloud Tech Hub
Apr 18, 2025 · Operations

How Baidu’s AI‑Powered Digital Immune System Reinvents SRE Risk Management

This article explains why modern SRE teams need a digital immune system, describes Baidu’s data‑driven approach to improve system resilience, outlines the three‑phase evolution from digital transformation to AI‑enhanced risk mining, and shares concrete results and future directions for sustainable operations.

AI · Cloud Native · Digital Immune System
0 likes · 15 min read
Mike Chen's Internet Architecture
Apr 17, 2025 · Cloud Native

Kubernetes Architecture and Core Principles Explained

This article provides a comprehensive overview of Kubernetes, covering its cloud‑native architecture, core components such as API Server, Scheduler, Controller Manager, etcd, kubelet and kube‑proxy, and explains the workflow that enables automated deployment, scaling and management of containerized applications.

Cloud Native · DevOps · Kubernetes
0 likes · 6 min read
Alibaba Cloud Infrastructure
Apr 17, 2025 · Cloud Native

OpenKruise 1.8 Release Highlights: In‑Place VPA, StatefulSet Volume Expansion, AI WorkloadSpread, Serverless Probe, SidecarSet Gray‑Release, and Helm Pre‑Delete Hook

OpenKruise 1.8, the latest CNCF‑incubated cloud‑native automation suite, introduces in‑place vertical pod autoscaling, native StatefulSet volume expansion, AI‑aware WorkloadSpread, serverless probe support, sidecar gray‑release capabilities, and a Helm pre‑delete safety hook, all backed by detailed YAML examples and a future roadmap.

Cloud Native · InPlaceVPA · Kubernetes
0 likes · 13 min read
dbaplus Community
Apr 16, 2025 · Backend Development

How Ctrip’s Kafka Gatekeeper Boosts FinOps Data Quality and Automates Cost Governance

This article explains how Ctrip’s hybrid‑cloud FinOps billing system uses a custom Kafka Gatekeeper to detect, locate, and automatically remediate data‑quality issues across dozens of self‑built PaaS services, improving coverage, timeliness, and responsibility attribution while supporting high‑availability deployments.

Cloud Native · FinOps · Gatekeeper
0 likes · 19 min read
Ops Development Stories
Apr 15, 2025 · Cloud Native

Boost Kubernetes Management with AI: Introducing the Lightweight k8m Console

This article introduces k8m, a lightweight AI‑enhanced console for Kubernetes that simplifies cluster management, installation, configuration, and daily operations, while offering features such as YAML auto‑translation, AI‑driven event and log diagnostics, command generation, multi‑cluster support, and role‑based access control.

AI · Cloud Native · DevOps
0 likes · 13 min read
Ops Development Stories
Apr 15, 2025 · Artificial Intelligence

Unlocking the AI USB‑C: Deep Dive into the Model Context Protocol (MCP)

This article explores the Model Context Protocol (MCP), the emerging “USB‑C” for AI, detailing its core advantages, implementation with Kubernetes, a six‑layer cloud‑native architecture, practical code examples, and developer guidelines for building AI‑powered, secure, and scalable services.

AI · Cloud Native · DevOps
0 likes · 8 min read
Linux Kernel Journey
Apr 15, 2025 · Operations

Efficiently Resolving Performance Bottlenecks and Jitter with Process Hotspot Tracing in Alibaba Cloud OS Console

The article explains how Alibaba Cloud's SysOM console uses low‑overhead process hotspot tracing, stack unwinding, symbol resolution, eBPF and AI diagnostics to pinpoint CPU, memory, lock and network issues, offering visual flame‑graph analysis and real‑world case studies for faster root‑cause identification.

AI diagnostics · Cloud Native · SysOM
0 likes · 15 min read
Ops Development & AI Practice
Apr 14, 2025 · Industry Insights

When a “Perfect” EKS Terraform Module Becomes a Debugging Nightmare

The author recounts the high hopes and subsequent frustrations of adopting the community‑maintained terraform‑aws‑eks module for AWS EKS, detailing hidden complexities, limited AI assistance, and practical lessons on embracing complexity, critical use of open‑source modules, and the importance of rest during tough debugging sessions.

AI Copilot · Cloud Native · DevOps
0 likes · 9 min read
Alibaba Cloud Observability
Apr 14, 2025 · Cloud Native

How to Connect Grafana to Large Language Models with MCP (Model Context Protocol)

This guide shows how to use the Model Context Protocol (MCP) to build a lightweight server that links Grafana dashboards to large language models, covering MCP concepts, FastMCP setup, Python client implementation, environment preparation, and integration with Cherry Studio for seamless AI-driven data access.

AI integration · Cloud Native · Grafana
0 likes · 12 min read
Cloud Native Technology Community
Apr 11, 2025 · Cloud Native

How Kube-OVN Enables Seamless Live Migration for KubeVirt VMs

This article explains the challenges of live‑migrating KubeVirt virtual machines, how Kube‑OVN addresses network‑bridge limitations and IP changes, provides the required VM annotation, step‑by‑step migration commands, and details the multi‑stage migration process that keeps network interruption under 0.5 seconds with no TCP break.

Cloud Native · Kube-OVN · KubeVirt
0 likes · 7 min read
21CTO
Apr 9, 2025 · Operations

9 Must‑Have Container Monitoring Tools and Best Practices for Modern Cloud‑Native Environments

This article reviews nine practical container‑monitoring solutions—from Last9 and Prometheus to Dynatrace and Elastic Observability—detailing their key features, pricing, and why developers prefer them, and then offers comprehensive best‑practice guidance for metrics, tagging, alerts, and advanced observability strategies in Kubernetes‑driven cloud‑native deployments.

Cloud Native · DevOps · Kubernetes
0 likes · 25 min read
Alibaba Cloud Native
Apr 6, 2025 · Cloud Native

How ZEEK’s Cloud‑Native Architecture Boosted App Stability and Agility

This article details ZEEK's cloud‑native transformation, covering the strategic shift to open‑source standards, unified microservice architecture, high‑availability practices, upgraded traffic gateways, visual data analysis, car‑network data collection, and AI‑assisted development, illustrating how these steps enhanced system stability, scalability, and development efficiency.

AI · Automotive · Cloud Native
0 likes · 22 min read
php Courses
Mar 31, 2025 · Backend Development

PHP Ecosystem in 2025: New Language Features, Framework Trends, Design Patterns, and Emerging Applications

The 2025 PHP ecosystem overview details the language’s new features such as enhanced generics and fibers, performance improvements via JIT and OPcache, evolving best practices, the latest trends in major and micro frameworks, modern design pattern implementations, cloud‑native deployment, AI integration, and future directions.

Cloud Native · Design Patterns · PHP
0 likes · 17 min read
FunTester
Mar 30, 2025 · Cloud Native

Mastering Kubernetes Resources with Java: EndpointSlice, PVC, PV, NetworkPolicy & More

This guide shows how to use the Fabric8 Kubernetes Java client to load, create, apply, list, watch, and delete core Kubernetes objects such as EndpointSlice, PersistentVolumeClaim, PersistentVolume, NetworkPolicy, PodDisruptionBudget, and various RBAC resources, with complete code examples for each operation.

API · Cloud Native · DevOps
0 likes · 12 min read
Ops Development & AI Practice
Mar 27, 2025 · Cloud Native

Master Kustomize: Simplify Kubernetes Configs with Generators and Transformers

Kustomize, built into kubectl, lets you declaratively manage Kubernetes YAML by organizing base resources, dynamically generating ConfigMaps and Secrets, applying transformers for environment‑specific tweaks, and optionally validating output, enabling a clean Base + Overlay workflow that reduces duplication and simplifies multi‑environment configuration.

Cloud Native · Configuration Management · DevOps
0 likes · 8 min read
ITPUB
Mar 26, 2025 · Cloud Native

How KubeBlocks Enables Scalable, Automated Redis on Kubernetes at Kuaishou

This article details Kuaishou's migration of massive Redis clusters to Kubernetes using the KubeBlocks Operator, covering architecture, multi‑layer management requirements, federated cluster deployment, custom controllers, performance and stability considerations, and the resulting operational benefits.

Cloud Native · KubeBlocks · Kubernetes
0 likes · 15 min read
Huolala Tech
Mar 25, 2025 · Backend Development

How Huolala Built a Scalable Distributed Load‑Testing Platform with JMeter

This article details Huolala's performance testing platform architecture, covering background challenges, a JMeter‑based solution, distributed agent design, unified logging, plugin management, data collection via Kafka, and future enhancements such as AI integration and improved file distribution, illustrating a comprehensive backend development effort.

Cloud Native · JMeter · distributed load testing
0 likes · 25 min read
FunTester
Mar 25, 2025 · Operations

Integrating Chaos Engineering into Service Dependency Governance for Resilient Cloud‑Native Systems

This article explores how to embed chaos engineering practices into service dependency governance, detailing dynamic validation versus static analysis, fault injection techniques, multi‑point failure simulations, and data‑driven optimizations to build robust, self‑healing microservice architectures in cloud‑native environments.

Cloud Native · Microservices · Operations
0 likes · 18 min read
Ops Development Stories
Mar 19, 2025 · Cloud Native

Unified Multi‑Cluster Monitoring with KubeDoor 1.0: Alerts, Metrics & Best Practices

KubeDoor 1.0 introduces a new architecture for unified multi‑Kubernetes monitoring, offering components for master and agent, flexible deployment options, Helm‑based installation, configurable storage and alerting settings, and detailed guidance on integrating with existing Prometheus/VictoriaMetrics setups while providing automatic peak‑usage data collection.

ClickHouse · Cloud Native · Kubernetes
0 likes · 14 min read
Tencent Cloud Developer
Mar 19, 2025 · Cloud Native

Kubernetes Monitoring: Why It’s Needed, Core Components, and Metric Exposure

Monitoring Kubernetes is essential to detect resource contention, component failures, and network issues; it involves tracking core component metrics such as API server latency, etcd write times, scheduler delays, as well as node‑level CPU, memory, disk, and network statistics, pod health, and custom application metrics exposed via Prometheus exporters for comprehensive observability.

Cloud Native · Exporters · Kubernetes
0 likes · 23 min read
Su San Talks Tech
Mar 19, 2025 · Operations

10 Proven Strategies to Achieve 99.99% System Availability

This article presents ten practical techniques (redundant deployment, circuit breaking, traffic shaping, auto‑scaling, gray releases, downgrade switches, full‑link stress testing, data sharding, chaos engineering, and three‑layer monitoring) for raising system availability from 99% to 99.99% in production environments.
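
For a sense of scale, the gap between 99% and 99.99% can be expressed as a yearly downtime budget. The sketch below is illustrative arithmetic only, not taken from the article; the helper name `downtime_minutes_per_year` is hypothetical and a 365-day year is assumed.

```python
# Hypothetical helper: yearly downtime budget implied by an availability target.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the maximum minutes of downtime per year allowed by the target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        budget = downtime_minutes_per_year(target)
        print(f"{target}% -> {budget:.1f} min/year")
```

Under these assumptions, 99% allows about 5,256 minutes (roughly 3.65 days) of downtime per year, while 99.99% allows about 52.6 minutes.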

Cloud Native · Microservices · System Design
0 likes · 12 min read
Python Programming Learning Circle
Mar 18, 2025 · Cloud Native

Automating Kubernetes Operations with the Python Client

This article demonstrates how to use the Python Kubernetes client to programmatically restart deployments, scale them, execute commands inside pods, apply node taints, retrieve cluster metrics, and convert between YAML/JSON and client objects, providing practical code examples for cloud‑native automation.

API · Cloud Native · DevOps
0 likes · 8 min read
Alibaba Cloud Infrastructure
Mar 18, 2025 · Cloud Native

Gray Release of LoRA and Base Models Using ACK Gateway with AI Extension on Kubernetes

This guide explains how to deploy large language model inference services on a GPU-enabled Kubernetes cluster, configure ACK Gateway with AI Extension for intelligent routing and load balancing, and perform gray releases for both LoRA fine‑tuned models and base models such as QwQ‑32B and DeepSeek‑R1, including step‑by‑step commands and validation procedures.

ACK Gateway · AI inference · Cloud Native
0 likes · 25 min read
MaGe Linux Operations
Mar 18, 2025 · Cloud Native

How to Deploy a Kubernetes v1.28.8 Cluster with KubeKey on Ubuntu

This guide walks through configuring three Ubuntu servers, installing KubeKey, creating a Kubernetes v1.28.8 cluster with HAProxy load balancing, deploying a sample nginx workload, and verifying the installation using kubectl and curl, providing all necessary commands and configuration details for a successful deployment.

Cloud Native · Kubekey · Kubernetes
0 likes · 13 min read
Alibaba Cloud Observability
Mar 17, 2025 · Cloud Native

How to Master LLM Observability in Cloud‑Native Environments

This article explains the unique observability challenges of large language model (LLM) applications, outlines essential performance, cost, and safety metrics, and presents a comprehensive cloud‑native solution—including trace, metric, and log collection, domain‑specific dashboards, and step‑by‑step integration with Alibaba Cloud's Python Agent—to ensure reliable, efficient LLM deployments.

AI gateway · Cloud Native · LLM Observability
0 likes · 18 min read
Python Programming Learning Circle
Mar 17, 2025 · Cloud Native

Automating Kubernetes Tasks with the Python Client Library

This tutorial demonstrates how to set up a local KinD cluster, configure authentication, use raw curl commands, and employ the official Kubernetes Python client to list pods, create deployments, watch events, and manage RBAC, providing a complete guide for automating Kubernetes operations with Python.

API · Cloud Native · DevOps
0 likes · 11 min read
Ops Development & AI Practice
Mar 16, 2025 · Cloud Native

Why Quarkus Is Revolutionizing Cloud‑Native Java Development

Quarkus, a Kubernetes‑native Java framework built for GraalVM and HotSpot, delivers millisecond startup, low memory usage, developer‑friendly features, and seamless integration with cloud‑native platforms, making it ideal for microservices, serverless, and modern cloud applications.

Cloud Native · Fast Startup · Java
0 likes · 7 min read
MaGe Linux Operations
Mar 15, 2025 · Cloud Native

How MetalLB Transforms Load Balancing for Bare‑Metal Kubernetes Clusters

This guide explains Kubernetes Service types, the role of MetalLB in providing LoadBalancer functionality for bare‑metal clusters, step‑by‑step installation, configuration of address pools, testing with a sample service, integration with Ingress, and an overview of the Calico network plugin for pod isolation.

Calico · Cloud Native · Kubernetes
0 likes · 14 min read
JakartaEE China Community
Mar 15, 2025 · Backend Development

Key Jakarta EE Q&A: Naming, Governance, Roadmap, and How to Contribute

This article provides a comprehensive Q&A covering Jakarta EE’s definition, naming origin, platform scope, namespace shift, governance model, specification process, release cadence, future roadmap, relationship with EE4J, microservice and cloud‑native support, trademark usage, and step‑by‑step guidance on becoming a contributor or member.

Cloud Native · Eclipse Foundation · Enterprise Java
0 likes · 12 min read
Alibaba Cloud Developer
Mar 13, 2025 · Artificial Intelligence

How to Master LLM Observability: End-to-End Monitoring with Alibaba Cloud

This article outlines Alibaba Cloud’s comprehensive LLM observability solution, covering challenges, key metrics, component architecture, data collection, tracing, performance analysis, and practical integration steps—including Python agent setup and Dify demo—to help developers monitor and optimize large language model applications.

AI Monitoring · Cloud Native · LLM Observability
0 likes · 19 min read
Sohu Tech Products
Mar 12, 2025 · Cloud Native

Argo Workflows: Container-Native Workflow Engine for Kubernetes

Argo Workflows is an open‑source, container‑native engine that runs on Kubernetes via Custom Resource Definitions, letting users declaratively define complex step‑based or DAG‑based pipelines (including CI/CD, data processing, and machine‑learning jobs) through reusable templates, with a server UI, controller, and pod architecture monitored by Prometheus.

Argo Workflows · CNCF · Cloud Native
0 likes · 16 min read
Ops Development Stories
Mar 10, 2025 · Cloud Native

What Are Kubernetes Core Components and How Do They Work?

This article provides a comprehensive overview of Kubernetes fundamentals, covering core control‑plane and node components, key object differences such as Pod vs Deployment, Service types, ConfigMap vs Secret, scheduling, health checks, scaling, security, storage, and troubleshooting techniques.

Cloud Native · Containers · Deployment
0 likes · 19 min read
Ops Development & AI Practice
Mar 7, 2025 · Cloud Native

Mastering Kubernetes StatefulSets: How to Run Stateful Apps Reliably

This article explains Kubernetes StatefulSets, covering their core concepts, guarantees such as stable network IDs and persistent storage, the controller’s components, deployment workflow, typical use cases, best‑practice recommendations, and a detailed comparison with Deployments to help you manage stateful workloads effectively.
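
As a rough illustration of the stable network IDs mentioned above: Kubernetes names StatefulSet pods `<name>-<ordinal>` and, with a headless governing Service, gives each pod a DNS name of the form `<pod>.<service>.<namespace>.svc.<cluster-domain>`. The sketch below only builds those names; the function name and the example values (`web`, `nginx`, `default`) are hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
from typing import List


def statefulset_pod_dns(
    name: str,
    service: str,
    namespace: str,
    replicas: int,
    cluster_domain: str = "cluster.local",  # assumed default; clusters may override it
) -> List[str]:
    """Build the per-pod DNS names for a StatefulSet behind a headless Service."""
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


if __name__ == "__main__":
    # Hypothetical StatefulSet "web" governed by headless Service "nginx".
    for dns_name in statefulset_pod_dns("web", "nginx", "default", replicas=3):
        print(dns_name)
```

Because the ordinal is part of the name, a rescheduled pod keeps the same DNS identity, which is what lets clustered stateful applications address specific peers.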

Cloud Native · Deployment · Kubernetes
0 likes · 8 min read
Alibaba Cloud Native
Mar 7, 2025 · Artificial Intelligence

8 Real-World AI Gateway Use Cases Every Enterprise Should Know

This article outlines eight practical AI gateway scenarios—from multi‑model services and consumer authentication to token rate limiting, content safety, semantic caching, and observability—explaining the business needs behind each and how Alibaba Cloud's cloud‑native API gateway provides concrete technical solutions.

AI gateway · Cloud Native · Content Safety
0 likes · 15 min read
ITPUB
Mar 6, 2025 · Cloud Native

Mastering Portainer: Simplify Docker and Kubernetes Management with Easy Deployment

This guide explains what Portainer is, compares its Community and Business editions, details its core architecture, provides step‑by‑step installation using Docker, Docker‑Compose, and Docker‑Stack, and demonstrates key features such as dashboards, container, image, service, volume, and user management for Docker and Kubernetes environments.

Cloud Native · Container Management · Docker
0 likes · 43 min read
Alibaba Cloud Infrastructure
Mar 6, 2025 · Big Data

Leveraging Apache Iceberg and AutoMQ for Real-Time Data Lake Ingestion: Architecture, Best Practices, and Cost Optimization

This article examines how Apache Iceberg’s snapshot‑based ACID transactions, logical‑physical partition evolution, and COW/MOR update modes enable efficient real‑time data lake ingestion, and demonstrates AutoMQ’s Kafka‑to‑Iceberg Table Topic solution that simplifies schema management, reduces latency, and cuts operational costs.

Apache Iceberg · AutoMQ · Big Data
0 likes · 14 min read
Alibaba Cloud Infrastructure
Mar 5, 2025 · Cloud Native

Using Fluid Cloud‑Native Data Caching to Boost Performance and Elasticity of a Quantitative Research Platform on Alibaba Cloud

This article describes how JoinQuant built a cloud‑native quantitative research platform on Alibaba Cloud, identified performance, cost, data‑management, and security challenges, and solved them with Fluid’s JindoRuntime data‑caching, elastic scaling, and Python‑driven workflows, achieving dramatic speed and cost improvements.

Cloud Native · Data Caching · Fluid
0 likes · 18 min read
Practical DevOps Architecture
Mar 5, 2025 · Cloud Native

Kubernetes DNS Resolution Issues and Troubleshooting Guide

This guide explains common Kubernetes DNS problems—including failure to resolve external domains, inter‑pod service discovery addresses, and related impacts on applications like Nginx reverse proxies—and provides step‑by‑step troubleshooting procedures such as checking CoreDNS, inspecting resolv.conf, and customizing dnsPolicy and dnsConfig in pod specifications.

Cloud Native · CoreDNS · DNS
0 likes · 6 min read
Alibaba Cloud Infrastructure
Mar 4, 2025 · Cloud Native

Koordinator v1.6 Release: Advanced Heterogeneous Device Scheduling and GPU Management Features

The Koordinator v1.6 release introduces a suite of innovations—including GPU topology‑aware scheduling, end‑to‑end GPU & RDMA joint allocation, strong GPU isolation, differentiated GPU scoring, fine‑grained resource reservation, mixed‑workload QoS, and extensive scheduler and rescheduler optimizations—to efficiently manage heterogeneous resources in Kubernetes clusters for AI and high‑performance computing workloads.

Cloud Native · GPU scheduling · Heterogeneous Resources
0 likes · 24 min read
DataFunSummit
Mar 1, 2025 · Databases

Innovations and Breakthroughs of ClickHouse in Real‑Time OLAP

This article introduces ClickHouse as an open‑source column‑store OLAP database, outlines its core features, explains its distributed and cloud‑native architectures—including SharedMergeTree for serverless operation—presents benchmark results, compares community and enterprise editions, and answers common questions about its future direction.

ClickHouse · Cloud Native · Performance
0 likes · 15 min read
Pan Zhi's Tech Notes
Feb 28, 2025 · Cloud Native

Spring Cloud Quick‑Start Guide: Getting Started with Microservices

This article introduces Spring Cloud's background, core components (including first‑ and second‑generation modules derived from Netflix OSS), versioning scheme, compatibility with Spring Boot, and practical advice for selecting matching releases to avoid runtime issues in microservice projects.

Cloud Native · Microservices · Netflix OSS
0 likes · 12 min read
Ops Development & AI Practice
Feb 27, 2025 · Cloud Native

Boost Kubernetes Efficiency with Offline‑Online Hybrid Deployment

This article explains how to combine online services and offline tasks within a single Kubernetes cluster using offline‑online hybrid deployment, detailing its benefits such as cost savings and higher resource utilization, and walks through practical implementation methods like CronJobs, HPA, priority classes, node affinity, custom schedulers, and the open‑source Koordinator project, while also addressing associated challenges.

Cloud Native · Kubernetes · Offline Tasks
0 likes · 6 min read
dbaplus Community
Feb 25, 2025 · Cloud Native

Why We Dropped Kubernetes and Boosted DevOps Happiness by 89%

A DevOps team managing 47 Kubernetes clusters across three clouds faced burnout, high costs, and operational chaos, so they gradually replaced Kubernetes with simpler AWS services, cutting infrastructure spend by 58%, speeding deployments by 89%, and dramatically improving team morale and reliability.

Cloud Native · DevOps · Infrastructure Management
0 likes · 9 min read
Cloud Native Technology Community
Feb 25, 2025 · Cloud Native

Understanding k8gb: A Kubernetes Global Load Balancer for Multi‑Cluster Deployments

This article explains the theory and practical usage of k8gb, a Kubernetes Global Balancer that provides DNS‑based load balancing, fault‑tolerant traffic routing, and seamless failover across multiple clusters to improve resilience, latency, and compliance in cloud‑native environments.

Cloud Native · Global Load Balancing · Kubernetes
0 likes · 8 min read
Tencent Cloud Developer
Feb 25, 2025 · Artificial Intelligence

Deploy DeepSeek AI: Cloud, Local, API – Full Step‑by‑Step Guide

This guide walks developers through the full lifecycle of using DeepSeek—choosing the right deployment method (API, local machine, or private cloud), selecting model sizes based on hardware, configuring Tencent Cloud services, building AI applications, and integrating the model into development tools and mini‑programs.

AI application development · AI model deployment · Cloud Native
0 likes · 12 min read
FunTester
Feb 24, 2025 · Cloud Native

Master Kubernetes with Fabric8 Java Client: Quick Guide & Advanced Tips

This article introduces the Fabric8 KubernetesClient for Java, explains why it outperforms the official client, shows how to add the Maven dependency, and provides step‑by‑step code examples for listing, creating, deleting, and watching Pods, as well as advanced operations on ConfigMaps, Deployments, and custom resources, illustrating real‑world use cases such as log collection, self‑healing, and dynamic scaling.

Cloud Native · DevOps · Fabric8
0 likes · 8 min read
Alibaba Cloud Native
Feb 22, 2025 · Artificial Intelligence

Boost Your Development with Alibaba Cloud’s Tongyi Lingma AI Coding Assistant – A Hands‑On Guide

This guide walks developers through installing the Tongyi Lingma AI coding assistant plugin, switching between large language models, using smart Q&A, terminal integration, code completion, bug‑fix suggestions, and multi‑file refactoring, showcasing how the tool streamlines everyday development tasks.

AI coding assistant · Cloud Native · IDE plugin
0 likes · 8 min read
Rare Earth Juejin Tech Community
Feb 20, 2025 · Cloud Native

TrafficRoute GTM: GEO‑Based Routing and Traffic Orchestration at ByteDance

This article explains how ByteDance’s TrafficRoute GTM, a DNS‑based global traffic routing service, uses GEO‑based routing, health‑check orchestration, and intelligent load‑balancing to achieve high stability, performance, and cost efficiency for ultra‑large‑scale traffic across multiple regions and CDN providers.

ByteDance · Cloud Native · DNS Load Balancing
0 likes · 11 min read
Architecture Development Notes
Feb 19, 2025 · Operations

Avoid Prometheus Label Pitfalls: Best Practices for Scalable Monitoring

This article examines common label misuse in Prometheus, explains why adding global labels to every metric can cause data bloat, configuration rigidity, and dimensional pollution, and provides concrete best‑practice patterns, dynamic injection techniques, and governance rules to keep monitoring systems efficient and maintainable.

Best Practices · Cloud Native · Labels
0 likes · 7 min read
Alibaba Cloud Infrastructure
Feb 17, 2025 · Cloud Native

Multi‑Cluster Delivery with ACK One GitOps: A Case Study at Wondershare Technology

Wondershare Technology adopted Alibaba Cloud's ACK One GitOps platform to automate and unify the deployment of dozens of Kubernetes clusters across multiple regions, addressing manual deployment inefficiencies, traceability, rollback challenges, and multi‑tenant permission management while achieving a 50% increase in release efficiency.

Argo CD · Cloud Native · GitOps
0 likes · 7 min read
360 Zhihui Cloud Developer
Feb 17, 2025 · Cloud Native

Optimizing Offline Pod Scheduling with Koordinator and Yarn-Operator

To reduce resource contention and improve offline task reliability, this article examines the challenges of using Koordinator with Hadoop Yarn pods on Kubernetes, proposes real‑time resource reporting and task‑level eviction strategies, details community and custom solutions, and outlines future enhancements with Volcano.

Big Data · Cloud Native · Koordinator
0 likes · 9 min read
DataFunSummit
Feb 16, 2025 · Big Data

Bilibili Big Data Task Migration to Cloud‑Native Kubernetes Using Volcano Scheduler

This article shares Bilibili’s experience migrating its offline big‑data workloads to a cloud‑native Kubernetes environment using the Volcano scheduler, covering migration background, scheduler adaptation, hierarchical queue implementation, over‑commit framework (Amiyad), and future work to improve performance and resource utilization.

Cloud Native · Kubernetes · Resource Overcommit
0 likes · 15 min read
Alibaba Cloud Infrastructure
Feb 14, 2025 · Cloud Native

Blue‑Green Deployment with Kruise Rollouts: Concepts, Implementation, and Comparison

This article explains the blue‑green deployment strategy, introduces Kruise Rollouts’ blue‑green capabilities, provides a step‑by‑step Kubernetes example with YAML manifests and kubectl commands, compares it to Argo Rollouts and Flux Flagger, discusses resource considerations and serverless advantages, and concludes with best‑practice recommendations.

Blue‑Green deployment · Cloud Native · DevOps
0 likes · 16 min read
Ops Development Stories
Feb 13, 2025 · Cloud Native

KubeDoor: AI‑Driven Kubernetes Load‑Aware Scheduling & Capacity Management

KubeDoor is an open‑source platform built with Python and Vue that leverages Kubernetes admission control, AI recommendations, and expert experience to provide load‑aware scheduling, capacity governance, real‑time resource analytics, and automated scaling for microservices, featuring a web UI, Grafana dashboards, and extensible control mechanisms.

AI scheduling · Admission Controller · Cloud Native
0 likes · 11 min read
FunTester
Feb 13, 2025 · Operations

Why Fault Testing Is Critical for Modern Online Systems

In today's digital era, online services face increasing fault risks, and systematic fault testing—through chaos engineering, fault injection, stress testing, and disaster recovery drills—helps teams anticipate, evaluate, and improve system resilience, ultimately reducing downtime and protecting business continuity.

Cloud Native · Operations · automation
0 likes · 9 min read
Alibaba Cloud Observability
Feb 11, 2025 · Operations

Alibaba Cloud’s Compile‑Time Go Instrumentation: A New Era for Cloud‑Native Observability

Amid the surge of cloud‑native architectures, Alibaba Cloud showcases its open‑source, compile‑time Go instrumentation that delivers non‑intrusive monitoring, richer data, and cross‑vendor standards via OpenTelemetry, while highlighting extensive community contributions and collaborations that position it as a leading force in modern observability.

Alibaba Cloud · Cloud Native · Go
0 likes · 6 min read
Practical DevOps Architecture
Feb 11, 2025 · Operations

Kubernetes Operations and Cloud Native Architecture Training Course

This comprehensive training program for intermediate to advanced users covers Kubernetes high‑availability deployment, elastic scaling, Helm package management, Ceph distributed storage integration, microservice container migration, Jenkins‑based CI/CD pipelines, and Istio service‑mesh governance, providing hands‑on labs, detailed chapters, and practical resources for mastering modern cloud‑native operations.

Ceph · Cloud Native · DevOps
0 likes · 7 min read
Java Tech Enthusiast
Feb 8, 2025 · Cloud Native

Bun 1.2 Release: Enhanced Node.js Compatibility, Built-in Database & Cloud-Native Features

Bun 1.2 delivers its biggest upgrade yet, boosting Node.js compatibility above 90% for core modules, adding built‑in PostgreSQL and native S3 support that outperforms the AWS SDK, switching to a readable lock file for faster installs, enhancing testing tools, and improving HTTP/2, filesystem, JSON and Windows performance while targeting remaining compatibility gaps.

Bun · Cloud Native · JavaScript runtime
0 likes · 5 min read
Tencent Cloud Developer
Feb 7, 2025 · Artificial Intelligence

Launch DeepSeek Models in Seconds with One‑Click Cloud Development

This guide shows how to start DeepSeek large‑language models on cnb.cool in just 5‑10 seconds without downloading, using a simple three‑step process that includes forking the repository, selecting a model branch, and running Ollama or Docker commands, plus options for long‑term cloud deployment.

AI · Cloud Native · DeepSeek
0 likes · 3 min read
Alibaba Cloud Native
Feb 7, 2025 · Information Security

How DeepSeek’s Attack Highlights the Need for Robust Cloud‑Native Security Observability

The article examines DeepSeek’s rapid rise, the large‑scale malicious attacks it suffered, and then provides a detailed, cloud‑native security observability guide using Alibaba Cloud services such as DDoS protection, WAF, CLB, SAS, and SLS for logging, monitoring, anomaly detection, and alert response.

AI security · Alibaba Cloud · Cloud Native
0 likes · 15 min read