
Design and Challenges of Multi‑Active Architecture in Hybrid Cloud Environments

This article examines the design principles, challenges, and implementation details of a multi‑active architecture for hybrid cloud environments, covering stability, cost, efficiency, network topology, container orchestration, service discovery, traffic scheduling, and data storage, and outlines practical solutions used by the Zuoyebang platform.

IT Architects Alliance

Enterprises adopt hybrid‑cloud solutions primarily for stability and cost efficiency. Pushed to their limit, these two goals lead to a multi‑cloud, multi‑active architecture.

In the early stages of business exploration, a single‑cloud single‑active design is common, but as workloads grow, a single‑active setup cannot meet stability requirements, prompting deployment across multiple availability zones and eventually multiple clouds.

Cost and service considerations drive organizations to engage multiple cloud providers for disaster‑recovery, peak‑load elasticity, and workload segmentation, aiming for true multi‑active deployments that allow traffic and capacity scheduling across clouds.

The multi‑active approach brings significant challenges. Stability can degrade if inter‑cloud dependencies are not fully closed within each cloud, cost can rise because of redundant capacity, and operational efficiency can suffer because managing thousands of services across clouds is more than manual double‑checking can cover.

The design goals are high stability and cost‑effectiveness through a multi‑cloud, multi‑active strategy, using Kubernetes's north‑bound APIs to unify deployment and operations across clouds.

The overall architecture is divided into a resource layer (IaaS), a PaaS layer (databases, messaging, security, big‑data services), a business middle‑platform layer, and the business lines on top, with containers and orchestration providing resource abstraction.

Network design adopts a “multi‑cloud networking + CPE control” solution, offering elastic bandwidth, cross‑cloud traffic observability, automatic fault‑tolerant switching, and rapid onboarding of new cloud providers.

Compute management standardizes instance specifications to avoid an explosion of machine types, enabling controlled lifecycle management through scenario‑based packages.
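The idea of scenario‑based packages can be illustrated with a small lookup. This is a minimal sketch, not the platform's actual catalog: the package names, the CPU and memory figures, and the function `resolve_spec` are all hypothetical.

```python
# Hypothetical catalog: each scenario package maps to one approved instance spec.
# The names and sizes below are illustrative assumptions, not values from the article.
PACKAGES = {
    "web": {"cpu": 8, "memory_gib": 16},
    "compute": {"cpu": 16, "memory_gib": 32},
    "memory": {"cpu": 8, "memory_gib": 64},
}


def resolve_spec(package: str) -> dict:
    """Return the approved spec for a package; reject anything outside the catalog."""
    try:
        return dict(PACKAGES[package])  # copy so callers cannot mutate the catalog
    except KeyError:
        raise ValueError(
            f"unknown package {package!r}; choose one of {sorted(PACKAGES)}"
        ) from None
```

Rejecting requests outside the catalog is what keeps the number of machine types bounded: a new type has to be added to the catalog deliberately rather than appearing ad hoc.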

Container technology serves as the core for masking IaaS differences, while a standardized container middleware layer ensures consistent capabilities across all major cloud providers.

Service registration and discovery mechanisms are redesigned to enable seamless migration and stable operation in a hybrid‑cloud environment, with both synchronous RPC and asynchronous calls considered.
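One common way to make discovery cloud‑aware is to label each instance with its cloud and prefer same‑cloud instances at lookup time. The sketch below assumes that approach; the `Registry` class, its methods, and the cloud labels are hypothetical and do not describe the platform's actual registry.

```python
from collections import defaultdict


class Registry:
    """Toy in-memory registry; every instance carries a cloud label (an assumption)."""

    def __init__(self):
        self._instances = defaultdict(list)  # service name -> [(cloud, address)]

    def register(self, service, cloud, address):
        self._instances[service].append((cloud, address))

    def discover(self, service, caller_cloud):
        """Prefer same-cloud instances; fall back to every instance if none are local."""
        all_instances = self._instances.get(service, [])
        local = [addr for cloud, addr in all_instances if cloud == caller_cloud]
        return local or [addr for _, addr in all_instances]
```

With this policy a caller stays inside its own cloud whenever a local instance exists, and only crosses clouds when the local copy is missing, for example partway through a migration.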

Unified service observability (logging, monitoring, and tracing) provides a single view that mitigates the impact of hybrid‑cloud complexity.

Traffic scheduling focuses on north‑south traffic and routes at the domain level. Conventional DNS alone is insufficient for this, so a custom DoH (DNS over HTTPS) solution built on CoreDNS is used to keep traffic deviation under 1% and recovery after a cloud outage under 5 minutes.
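To make the weighted split concrete, here is a minimal sketch of how a resolver could decide which cloud's entry point to return for each query. It uses smooth weighted round‑robin, chosen here only for illustration; the article does not say which algorithm the CoreDNS‑based solution uses, and the function names and cloud labels are hypothetical.

```python
def effective_weights(weights, down):
    """Drop clouds currently marked as failed so they stop receiving answers."""
    return {cloud: w for cloud, w in weights.items() if cloud not in down}


def smooth_weighted_rr(weights, n):
    """Return n picks; over each cycle of sum(weights) picks the split is exact."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one cloud needs a positive weight")
    current = {cloud: 0 for cloud in weights}
    picks = []
    for _ in range(n):
        for cloud, w in weights.items():
            current[cloud] += w              # every candidate gains its weight
        chosen = max(current, key=current.get)
        current[chosen] -= total             # the winner pays back the total
        picks.append(chosen)
    return picks
```

For example, with weights `{"cloud-a": 7, "cloud-b": 3}` ten consecutive picks contain seven answers for `cloud-a` and three for `cloud-b`, interleaved rather than in bursts. Passing the same weights through `effective_weights` with `cloud-a` marked as down sends every subsequent answer to `cloud-b`.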

Data storage in multi‑cloud scenarios faces the classic CAP trade‑off. Depending on business needs, the team chooses an AP or a CP configuration and applies master‑slave, unit‑based, or MGR (MySQL Group Replication) patterns as appropriate.
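The difference between the two choices shows up in when a write may be acknowledged. The following is a deliberately simplified sketch: it assumes a CP setup requires a strict majority of replicas and an AP setup accepts the local primary's acknowledgement alone. The function `write_accepted` and its parameters are hypothetical.

```python
def write_accepted(mode, acks, replicas):
    """Decide whether a write may be acknowledged to the client.

    "CP": require a strict majority of replicas, so a minority partition
          rejects writes instead of diverging.
    "AP": one acknowledgement (the local primary) is enough; the other
          replicas catch up asynchronously and may briefly serve stale data.
    """
    if not 0 <= acks <= replicas:
        raise ValueError("acks must be between 0 and replicas")
    if mode == "CP":
        return acks > replicas // 2
    if mode == "AP":
        return acks >= 1
    raise ValueError(f"unknown mode {mode!r}")
```

With three replicas and a partition that leaves one reachable, the CP rule refuses the write while the AP rule accepts it, which is the trade‑off each business line has to pick.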

At the application layer, an “isolation zone + inter‑connect zone” design prevents unnecessary cross‑cloud calls while allowing flexible traffic routing during migrations or cloud failures.
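One possible reading of this design is a simple call policy: calls within a cloud are always allowed, and a cross‑cloud call is allowed only when the target sits in the inter‑connect zone. The sketch below encodes that assumed interpretation; the `Service` type, the zone labels, and `call_allowed` are hypothetical and not taken from the platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    cloud: str
    zone: str  # "isolation" or "interconnect" (labels assumed for this sketch)


def call_allowed(caller: Service, callee: Service) -> bool:
    """Same-cloud calls always pass; cross-cloud calls may only reach the interconnect zone."""
    if caller.cloud == callee.cloud:
        return True
    return callee.zone == "interconnect"
```

Under this rule a service placed in an isolation zone never receives cross‑cloud traffic, so moving it to the inter‑connect zone becomes an explicit decision, for instance during a migration.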

In summary, building a hybrid‑cloud multi‑active architecture requires close collaboration among SYS, container, middleware, SRE, DBA, DevOps, FinOps, and security teams, forming a robust enterprise‑wide solution that spans resources, platforms, and applications.
