Tag

Federation


Kuaishou Tech
Oct 31, 2024 · Cloud Native

Stateful Service Cloud‑Native Practices: Kuaishou’s Redis on Kubernetes

This article examines the challenges and benefits of running stateful services such as Redis on Kubernetes, presents Kuaishou’s practical experience with cloud‑native migration, evaluates risks and performance impacts, and details the custom workloads, operators, federation and KubeBlocks solutions that enable large‑scale, reliable stateful service orchestration.

Cloud Native · Federation · KubeBlocks
0 likes · 12 min read
DataFunSummit
Oct 17, 2024 · Big Data

Waggle Dance Based Metadata Solution at Tongcheng Travel: Architecture, Migration Strategies, and Future Outlook

This article presents Tongcheng Travel's metadata solution built on the open‑source Waggle Dance project, detailing the three‑layer architecture, challenges of a monolithic Hive Metastore, evaluated migration plans, federation implementation, migration workflow, and future directions for unified metadata governance.

Big Data · Federation · Hive Metastore
0 likes · 11 min read
Ops Development Stories
Jun 28, 2024 · Cloud Native

Multi-Cluster Kubernetes: Benefits, Federation, Karmada, and Practical Tips

This article explains why organizations adopt multi‑cluster Kubernetes for high availability, hybrid‑cloud scaling, and fault isolation, outlines the preparatory steps, compares Federation v1 and v2, introduces Karmada as a CNCF project, and shares practical non‑federated deployment, monitoring, traffic management, and migration techniques with code examples.

Cloud Native · DevOps · Federation
0 likes · 18 min read
DevOps Operations Practice
Mar 14, 2024 · Operations

Resolving Frequent Crashes of a Single-Node Prometheus Deployment: Analysis and Solutions

This article analyzes why a single Prometheus instance repeatedly runs out of memory and crashes, explains the underlying storage mechanisms, and presents practical solutions such as metric reduction, retention tuning, federation architecture, and remote storage integration to improve stability and scalability.

Federation · Monitoring · Performance
0 likes · 6 min read
Efficient Ops
Aug 28, 2022 · Cloud Native

Mastering Kubernetes Federation: Install, Join Clusters, and Sync Resources

This guide explains the purpose of Kubernetes Federation, its benefits for multi‑cluster management, step‑by‑step installation using Helm and kubefedctl, how to join and unjoin clusters, enable resource federation, and provides a cheat sheet of common commands for reliable cross‑cluster deployments.

Federation · Helm · kubectl
0 likes · 8 min read
Architect's Guide
Jun 26, 2022 · Backend Development

Building a Million‑Message‑Per‑Second RabbitMQ Service: Architecture, Scaling, and High Availability

This article explains how to design and operate a RabbitMQ cluster capable of handling millions of messages per second by describing RabbitMQ fundamentals, Google‑scale deployment, sharding and consistent‑hash plugins, high‑availability mirroring, federation, and integration with Spring AMQP, while also covering practical deployment scenarios and performance trade‑offs.

Federation · High Availability · Message Queue
0 likes · 23 min read
IT Services Circle
Apr 3, 2022 · Cloud Native

Understanding Kubernetes Federation: kubefed and Karmada Multi‑Cluster Management

This article explains why Kubernetes single‑cluster scalability is limited to about 5,000 nodes, introduces the concept of multi‑cluster federation, compares the legacy kubefed project with the actively maintained Karmada solution, and shows how policies and replica‑scheduling enable flexible cross‑AZ deployments and failover.

Cloud Native · Cluster Management · Federation
0 likes · 13 min read
IT Architects Alliance
Jan 14, 2022 · Operations

Scaling RabbitMQ to Million‑Message Throughput: Architecture, Plugins, and High‑Availability Practices

This article explains how to horizontally scale RabbitMQ clusters, use sharding and federation plugins, configure mirror queues and other high‑availability features, and apply practical patterns such as confirms, retries, and delayed delivery to achieve million‑level message throughput in production environments.

Clustering · Federation · High Availability
0 likes · 23 min read
Architecture Digest
Jan 13, 2022 · Backend Development

Scaling RabbitMQ to Million‑Message Throughput: Architecture, Sharding, Federation, and High‑Availability Practices

This article explains how to horizontally scale RabbitMQ clusters to handle millions of messages per second by leveraging cluster modes, mirror queues, sharding plugins, consistent‑hash exchanges, federation, and high‑availability configurations, while also covering practical scenarios such as retries, delayed tasks, and Spring AMQP integration.

Clustering · Federation · High Availability
0 likes · 22 min read
Ops Development Stories
Nov 8, 2021 · Cloud Native

How to Manually Deploy Prometheus Federation on Kubernetes – Step‑by‑Step Guide

This guide walks through manually deploying a Prometheus federation on Kubernetes, covering environment setup with sealos, creating storage classes, persistent volumes, ConfigMaps, StatefulSets, services, applying manifests, and verifying the federation to aggregate metrics across multiple clusters.

Federation · Monitoring · Prometheus
0 likes · 10 min read
Efficient Ops
Jul 25, 2021 · Cloud Native

Why Enterprises Need Multi‑Cluster Kubernetes and How to Implement It

This article explains why modern enterprises adopt multiple Kubernetes clusters, covering single‑cluster capacity limits, hybrid‑cloud requirements, fault‑tolerance concerns, the benefits of multi‑cluster setups, architectural models, and community‑driven implementation patterns.

Cloud Native · Federation · kubernetes
0 likes · 9 min read
Big Data Technology Architecture
Mar 11, 2021 · Big Data

Challenges and Optimizations of Hive MetaStore at Kuaishou

This article details how Kuaishou tackled performance, scalability, and stability challenges of Hive MetaStore by introducing a BeaconServer hook architecture, read‑write separation, API refinements, traffic control, and federation designs, resulting in significant query efficiency and service reliability improvements.

Big Data · Federation · Hive
0 likes · 14 min read
Cloud Native Technology Community
Mar 30, 2020 · Cloud Native

Building a Cloud‑Native Large‑Scale Distributed Monitoring System with Prometheus

This article explains how to design and implement a cloud‑native, large‑scale distributed monitoring system using Prometheus, covering its limitations, service‑level sharding, centralized storage, federation, and high‑availability strategies to overcome scaling challenges in Kubernetes environments.

Cloud Native · Federation · High Availability
0 likes · 12 min read
DataFunTalk
Jan 2, 2020 · Big Data

ByteDance’s HDFS Architecture and Evolution: Design, Challenges, and Optimizations

This article presents an in‑depth overview of ByteDance’s large‑scale HDFS deployment, describing its unique access layer, metadata and data layers, the evolution through multiple growth stages, and the key architectural improvements such as NNProxy, DanceNN, lock redesign, startup acceleration, and slow‑node mitigation techniques.

Big Data · ByteDance · Distributed Storage
0 likes · 18 min read
Beike Product & Technology
Jun 28, 2019 · Big Data

Hadoop NameNode Performance Bottlenecks and Solutions: Federation, ViewFS, FastCopy, Balance & Mover

This article analyzes the performance and stability bottlenecks of a Hadoop 2.7.3 NameNode caused by memory limits, RPC QPS, and long restart times, and presents a comprehensive solution stack—including HDFS federation, ViewFS, FastCopy, and tuned Balance/Mover tools—to improve scalability and reduce downtime.

BigData · FastCopy · Federation
0 likes · 11 min read
Qunar Tech Salon
May 16, 2019 · Big Data

Optimizing HDFS Federation Data Migration with FastCopy and qFastCopy at Qunar

This article describes the challenges of scaling Qunar's Hadoop NameNode, introduces HDFS Federation and the FastCopy tool, presents performance tests comparing FastCopy with DistCp, and details the development and evaluation of an optimized qFastCopy solution that shortens multi‑petabyte migrations that previously took hours.

Big Data · FastCopy · Federation
0 likes · 8 min read
Art of Distributed System Architecture Design
Nov 20, 2015 · Big Data

Design and Implementation of Alibaba Cloud's Cross‑Data‑Center Hadoop Cluster

In 2013, Alibaba Cloud ran out of rack capacity in a single IDC, which prompted the development of a multi‑NameNode, cross‑data‑center Hadoop solution. The design addresses NameNode scalability, inter‑site bandwidth limits, data placement, job scheduling, massive data migration, and user transparency.

Big Data · Cross-Data-Center · Distributed Storage
0 likes · 14 min read