Tag

High Availability

vivo Internet Technology
Apr 5, 2023 · Databases

Understanding MySQL Replication: Principles, Mechanisms, and Practical Applications

MySQL replication copies data changes from a primary server to one or more replicas by shipping binlog events, which can be recorded in statement, row, or mixed format and located by GTID. The article covers how this mechanism supports real‑time backup, read‑write separation, high‑availability failover, and data integration pipelines, using asynchronous replication, semi‑synchronous replication, or a centralized binlog service.

Data Reliability · GTID · High Availability
0 likes · 30 min read
Bilibili Tech
Feb 7, 2023 · Cloud Native

Bilibili Configuration Center (Config & Paladin): Architecture, Features, and Performance

Bilibili’s Config Center evolved from Config v1, a 2017 monolith offering a unified UI, MySQL storage, and long‑polling, to the Raft‑based Paladin v2. Paladin adds lifecycle management, tenant isolation, incremental publishing, high‑throughput caching, multi‑active deployment, validation, and rich tooling. It handles hundreds of thousands of configs and tens of thousands of concurrent clients with sub‑50 ms push latency, and deeper K8s integration is planned.

Cloud Native · Configuration Management · Distributed Systems
0 likes · 15 min read
Bilibili Tech
Sep 30, 2022 · Databases

Database Failure Management: Types, Mitigation Strategies, and Bilibili’s Practices

The article outlines common database and cache failures, such as instance outages, replication lag, data corruption, and cache avalanches, and details Bilibili’s mitigation strategies: high‑availability architectures, scaling, multi‑active designs, proxy controls, slow‑query alerts, fault‑injection drills, and ongoing resilience improvements.

Bilibili · Database · High Availability
0 likes · 17 min read
vivo Internet Technology
Jun 22, 2022 · Information Security

Jump Server Architecture and Implementation Using Linux PAM for Secure Access

The article describes a PAM‑based jump‑server architecture that securely proxies SSH, RDP, and other terminal access without storing credentials, using stateless micro‑services and a custom jmp.so module on each host to intercept authentication, enforce permission rules, and block dangerous commands.

Access Control · High Availability · Linux PAM
0 likes · 18 min read
HelloTech
May 13, 2022 · Backend Development

Redis Dual-Active Architecture: Hot-Standby, Dual-Write, and Bidirectional Synchronization Comparison

This article compares three Redis dual‑active designs: hot‑standby, several dual‑write models, and bidirectional synchronization. It finds hot‑standby costly and dual‑write prone to either added latency or consistency problems, and argues that middleware‑driven bidirectional sync, which uses the replication protocol and fixed‑prefix keys to avoid synchronization loops, is the most practical option.

Backend Development · Cluster · Dual-Active Architecture
0 likes · 14 min read
HelloTech
Jul 12, 2021 · Operations

Introduction to System Stability: Concepts, Metrics, and Practices

The article explains Haro’s approach to system stability. It defines high availability and key metrics such as SLA, RPO/RTO, MTTR/MTBF, and the 5‑5‑10 rule, then outlines cultural and technical safeguards, full‑team participation, process integration, and incremental tooling to prevent faults and ensure rapid recovery.

High Availability · MTTR · Operations
0 likes · 11 min read
vivo Internet Technology
May 12, 2021 · Big Data

Kafka at Trillion-Scale: Ensuring High Availability, Performance, and Operational Best Practices

The article presents a comprehensive guide for running Kafka at trillion‑record daily traffic, detailing version upgrades, data migration, traffic throttling, monitoring, load balancing, resource isolation, security, disaster recovery, Linux tuning, platform automation, performance evaluation, future roadmap, and community contribution practices.

High Availability · Kafka · Load Balancing
0 likes · 34 min read
vivo Internet Technology
Oct 14, 2020 · Backend Development

Design and Implementation of a High‑Availability RabbitMQ Middleware Platform at vivo

vivo built a high‑availability RabbitMQ middleware platform that combines an MQ‑Portal for request‑driven provisioning, an SDK that adds application‑level authentication, automatic cluster discovery, rate‑limiting, reset and blockage‑transfer capabilities, and a stateless MQ‑NameServer for name resolution and health‑based failover, enabling ten‑fold traffic growth without incidents.

High Availability · Message Queue · RabbitMQ
0 likes · 14 min read
Tencent Cloud Developer
Aug 18, 2020 · Cloud Native

Kubernetes High Availability: Architecture, Network, Storage, and Application Strategies

The article explains how to achieve Kubernetes high availability by designing a three‑node control‑plane with stacked etcd, using pod anti‑affinity, tuning node‑monitor timers, handling stale endpoints, configuring TCP keep‑alive, managing node taints and eviction, and choosing RWX storage or appropriate StatefulSet strategies to minimize service disruption after node failures.

Cluster · High Availability · Kubernetes
0 likes · 21 min read
Tencent Cloud Developer
Aug 4, 2020 · Cloud Computing

Tencent Cloud Elasticsearch Optimization Practices in Tencent Meeting: High Availability, Performance, and Cost-Effective Solutions

Tencent Meeting migrated its quality‑analysis system to Tencent Cloud Elasticsearch to address OOM failures, write spikes of 3 M/s, and scaling limits. The optimizations include multi‑AZ deployment, leaky‑bucket rate limiting, streaming aggregation checks, and tuned merge and translog handling, along with hot‑warm storage, ILM, multi‑disk support, and off‑heap caching. Together they cut the cluster from 15 000 nodes to under 300 while maintaining high availability and performance.

Cost Optimization · Data Engineering · Distributed Systems
0 likes · 23 min read
Tencent Cloud Developer
Apr 5, 2020 · Databases

Database Operations, Optimization, High Availability and Self‑Service – Insights from DBA Yang Jianrong

Senior DBA Yang Jianrong shares how standardized processes, robust security, and modern optimization—such as partitioning, middleware, and NoSQL—combined with high‑availability designs and self‑service tools like automated slow‑log analysis can streamline large‑scale MySQL operations, migrations, and continuous DBA learning.

Backup · DBA · Database
0 likes · 24 min read