Tag

Failover

2 views collected around this technical thread.

Mike Chen's Internet Architecture
Jun 9, 2025 · Operations

How Nginx Master‑Slave Architecture Ensures High Availability

This article explains how Nginx's master‑slave (primary‑backup) setup, combined with Keepalived and a virtual IP (VIP), provides high availability for web and API services: Keepalived detects failures automatically and moves the VIP to the backup server, which takes over with minimal service interruption.

Failover · High Availability · Keepalived
0 likes · 4 min read
php中文网 Courses
May 26, 2025 · Backend Development

Implementing Load‑Balancer‑Like Auto‑Decision Logic in PHP Applications

This article explores how to embed load‑balancer concepts such as intelligent request distribution, health checks, automatic failover, and dynamic strategy adjustment directly into PHP applications using algorithms like weighted round‑robin, response‑time balancing, and circuit‑breaker patterns, providing code examples and practical deployment scenarios.

Backend · Failover · PHP
0 likes · 11 min read
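The weighted round‑robin and automatic‑failover ideas named in this summary can be sketched briefly. The following is a minimal illustration in Python rather than the article's PHP code; the `Backend` class, the `pick` function, and the backend names are hypothetical, and the selection rule is the common "smooth" weighted round‑robin variant, which may differ from the one the article uses.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical backend record; not taken from the article."""
    name: str
    weight: int
    current: int = 0      # running score used by smooth weighted round-robin
    healthy: bool = True  # would be set by a health check in a real system


def pick(backends):
    """Return the next backend, skipping any that are marked unhealthy."""
    live = [b for b in backends if b.healthy]
    if not live:
        raise RuntimeError("no healthy backend available")
    total = sum(b.weight for b in live)
    # Every live backend gains its weight; the highest score wins the request.
    for b in live:
        b.current += b.weight
    chosen = max(live, key=lambda b: b.current)
    # The winner pays back the total so the others catch up on later picks.
    chosen.current -= total
    return chosen


pool = [Backend("a", 3), Backend("b", 1)]
print([pick(pool).name for _ in range(8)])
```

Marking a backend as unhealthy (`pool[0].healthy = False`) removes it from rotation on the next call, which is the simplest form of failover at the application level.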
Sanyou's Java Diary
Feb 20, 2025 · Databases

How Redis Sentinel Ensures Automatic Failover and High Availability

Redis Sentinel provides a robust high‑availability solution by monitoring master‑slave clusters, automatically detecting failures, electing leaders, and performing failover, while using quorum voting, Pub/Sub communication, and configuration provisioning to ensure seamless master promotion and client redirection without manual intervention.

Database · Failover · High Availability
0 likes · 16 min read
IT Architects Alliance
Jan 7, 2025 · Cloud Computing

Elastic Architecture: Auto Scaling and Failover for Resilient Systems

The article explains how elastic architecture, through auto‑scaling and failover mechanisms, dynamically adjusts resources and ensures continuous service during traffic spikes and component failures, improving cost efficiency, reliability, and operational stability for modern cloud‑based applications.

Elastic Architecture · Failover · Operations
0 likes · 16 min read
Aikesheng Open Source Community
Jan 7, 2025 · Databases

Analysis of Redis Sentinel Failover Issue in Redis 7.4.0 and Resolution via Pub/Sub ACL Adjustment

This article investigates a Redis Sentinel failover anomaly in version 7.4.0 where the sentinel repeatedly elects a failed master, explains the underlying s_down/o_down states, examines network, configuration, and ACL settings, and resolves the issue by adjusting Pub/Sub permissions to allow proper failover.

ACL · Database · Failover
0 likes · 11 min read
Beijing SF i-TECH City Technology Team
Jun 18, 2024 · Databases

Design and Implementation of MySQL High Availability Using Orchestrator and DBProxy

This article presents a comprehensive design and implementation for achieving MySQL high availability by replacing the single‑master architecture with Orchestrator‑driven automatic failover, integrating DBProxy for transparent routing, and addressing topology changes and data compensation to ensure continuous, reliable service.

DBProxy · Data Compensation · Failover
0 likes · 16 min read
Architecture & Thinking
Apr 10, 2024 · Operations

How Redis Sentinel Ensures Automatic Failover and High Availability

Redis Sentinel provides automatic monitoring, fault detection, and failover for Redis master‑slave clusters, enabling high availability by electing a new master when the original fails, using sdown/odown states, quorum voting, and pub/sub communication to keep services running with minimal downtime.

Failover · High Availability · Monitoring
0 likes · 11 min read
Bilibili Tech
Feb 20, 2024 · Backend Development

Investigation and Optimization of Unexpected AAAA DNS Requests in Go Applications

The article investigates why Go applications unexpectedly send AAAA DNS queries to a secondary nameserver, tracing the issue to the built‑in resolver’s handling of non‑recursive responses from a NetScaler proxy, and recommends using the cgo resolver, enabling recursion, or forcing IPv4 to eliminate the added latency.

DNS · Failover · Go
0 likes · 14 min read
Top Architect
May 5, 2023 · Backend Development

Using Redis Sentinel for High Availability: Design and Implementation

This article introduces Redis Sentinel as the official high‑availability solution for Redis, explains its core functions, provides configuration examples, compares three ways to receive failover notifications (script, client subscription, and indirect service), and offers design recommendations for robust production deployments.

Backend · DevOps · Failover
0 likes · 10 min read
IT Architects Alliance
May 4, 2023 · Backend Development

Designing Redis High Availability with Sentinel

This article explains how to use Redis Sentinel for high‑availability deployments, covering its core features, configuration files, startup commands, monitoring behavior, three methods of receiving failover notifications, and recommended architectural patterns for robust backend systems.

Backend · Configuration · Failover
0 likes · 9 min read
Sohu Tech Products
Jan 18, 2023 · Big Data

Root Cause Analysis of Flink TaskManager Failover Causing Data Reprocessing and Business Impact

An incident report details how a scheduled machine reboot on Alibaba Cloud triggered a Flink TaskManager failover, leading to excessive data replay, increased pressure on Elasticsearch (ES), and significant business latency, and explains the root cause involving disabled checkpoints and timestamp‑based offset consumption.

BigData · Checkpoint · Failover
0 likes · 10 min read
Inke Technology
Dec 19, 2022 · Backend Development

How to Build a Highly Available, Stable, and Observable SMS Service

This article explains how to design a high‑availability SMS system by identifying stability bottlenecks, defining reliability goals, implementing failover strategies for Redis, MySQL and external services, establishing a comprehensive observability framework, and measuring key quality metrics to ensure 99.99% uptime.

Backend · Failover · High Availability
0 likes · 11 min read
Aikesheng Open Source Community
Nov 24, 2022 · Databases

Understanding Orchestrator's RegroupReplicasGTID and Candidate Replica Selection in MySQL Failover

This article explains how Orchestrator selects a candidate replica during MySQL master failover, detailing the GetCandidateReplica and RegroupReplicasGTID functions, their sorting logic, promotion rules, GTID-based regrouping, and differences from MHA, while highlighting potential data loss issues and related bugs.

Failover · GTID · MySQL
0 likes · 22 min read
Aikesheng Open Source Community
Nov 17, 2022 · Databases

DeadMaster Recovery Process in Orchestrator

This article explains the complete DeadMaster recovery workflow of Orchestrator, detailing how the system selects the appropriate check‑and‑recover function, handles emergency grace periods, reads topology information, registers recovery attempts, validates promotion constraints, executes the actual failover, and runs post‑recovery hooks, with extensive Go code examples.

Failover · Go · MySQL
0 likes · 18 min read
Aikesheng Open Source Community
Nov 7, 2022 · Databases

Orchestrator Failover Process Source Code Analysis – Simulating Faults and Understanding ContinuousDiscovery

This article walks through a simulated MySQL 3307 cluster failure, examines Orchestrator's source code to explain the ContinuousDiscovery loop, discovery queues, health ticks, caretaking tasks, Raft coordination, topology snapshots, and the logic distinguishing UnreachableMaster from DeadMaster states.

ContinuousDiscovery · Database HA · Failover
0 likes · 20 min read
Aikesheng Open Source Community
Sep 22, 2022 · Databases

Using Orchestrator for Automatic MySQL Cluster Failover: Configuration and Test Cases

This article demonstrates how to configure the open-source Orchestrator tool for automatic MySQL cluster failover, explains key parameters, and presents three test cases covering normal failover, lag‑induced prevention, and the effect of disabling global recoveries.

Cluster Management · Database Operations · Failover
0 likes · 6 min read
Sohu Tech Products
Sep 21, 2022 · Backend Development

Understanding Kafka Partition Failover When a Broker Goes Offline

This article analyzes a real‑world Kafka outage caused by killing a broker process, explains why partitions with a replication factor of one lose their leader, and walks through the internal ZooKeeper‑based failover mechanism and leader‑election logic that Kafka uses to recover from such failures.

Backend · Failover · Kafka
0 likes · 10 min read
Practical DevOps Architecture
Jun 28, 2022 · Operations

Understanding Redis Sentinel: High‑Availability Mechanism and Failover Process

The article explains how Redis Sentinel provides high availability by monitoring master‑slave instances, detecting failures through periodic pings, distinguishing subjective and objective down states, performing quorum arbitration, and automatically promoting a slave to master to ensure continuous service.

Failover · High Availability · Master‑Slave
0 likes · 4 min read
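The subjective/objective down distinction and quorum arbitration mentioned in this summary can be illustrated with a small model. This is a sketch only, not Redis Sentinel's implementation: the `SentinelView` class, the `odown` function, and the timing values are hypothetical, and real Sentinel additionally requires a leader sentinel to be authorized by a majority before a failover starts.

```python
from dataclasses import dataclass


@dataclass
class SentinelView:
    """One sentinel's view of the master (hypothetical model)."""
    name: str
    last_pong: float  # time of the last valid ping reply from the master

    def sdown(self, now, down_after):
        # Subjectively down: this sentinel alone has not heard a valid
        # reply for longer than the configured down-after period.
        return now - self.last_pong > down_after


def odown(views, now, down_after, quorum):
    # Objectively down: at least `quorum` sentinels agree on sdown.
    votes = sum(1 for v in views if v.sdown(now, down_after))
    return votes >= quorum


views = [
    SentinelView("s1", last_pong=0.0),
    SentinelView("s2", last_pong=0.0),
    SentinelView("s3", last_pong=29.0),  # still hears the master
]
print(odown(views, now=31.0, down_after=30.0, quorum=2))
```

With a quorum of 2, two of the three sentinels are enough to declare the master objectively down; raising the quorum to 3 would leave it only subjectively down for `s1` and `s2`.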
Wukong Talks Architecture
Jun 8, 2022 · Databases

Deploying Redis Master‑Slave Architecture and Sentinel Cluster for High Availability

This guide walks through upgrading a single‑node Redis deployment to a high‑availability setup by building a master‑slave cluster, configuring Sentinel services, testing replication and failover, and enabling client auto‑detection of master changes, all using Docker containers and configuration files.

Docker · Failover · High Availability
0 likes · 15 min read