Comprehensive Overview of Data Center Active‑Active (Dual‑Active) Solutions

This article provides an in‑depth technical overview of data‑center active‑active architectures, covering network interconnects, storage SAN/Fibre Channel links, application clustering, arbitration mechanisms, gateway‑based designs, technical requirements, and practical limitations for achieving end‑to‑end high availability.

Architects' Tech Alliance

Apr 24, 2016

First, the author thanks readers for supporting the paid article on data deduplication and for participating in the recent poll.

Because many readers have been discussing data‑center dual‑active solutions, this article aggregates common questions and the author’s insights into a comprehensive guide, noting that future posting frequency may be reduced due to work commitments.

In a data‑center context, dual‑active (active‑active) means that applications, networks, storage, and data are all active‑active end to end. Individual components may be deployed in HA mode or even as single points, but the overall solution aims for full dual‑active capability. A typical network diagram for an array‑based dual‑active solution is shown below.

Data Center Interconnect Network

Data centers A and B are linked by an inter‑data‑center network, while each site internally uses a traditional two‑ or three‑tier architecture. The access layer connects to the business servers, and the core/aggregation layer connects to the remote data center using large Layer 2 technologies (CSS+iStack). This interconnect lets virtual machines migrate live between sites with their MAC addresses unchanged and works around the VLAN count limit; most vendors support TRILL for this purpose.

Data Center Internal Interconnect

Within each data center, storage arrays, switches, and servers are connected through a dedicated SAN with redundant paths. The SAN switches of the two data centers are linked over Fibre Channel (FC), which does not necessarily have to run over optical fiber, to carry real‑time data synchronization and heartbeat traffic.

Active‑Active Application Deployment

Application clusters synchronize data over the large Layer 2 network. Common enterprise clusters include VMware, Hyper‑V, Oracle RAC, SQL Server on MSCS/MSFC, and IBM DB2 pureScale; of these, Oracle RAC and pureScale are true active‑active clusters.

Active‑Active Arbitration Deployment

Third‑party arbitration is often provided by a storage‑cluster arbitration server, offering a low‑cost solution that can host virtual machines for HA. Examples include EMC VMAX3 arbitration; the author notes that after EMC’s acquisition by Dell, this strategy may evolve.

Although third‑party arbitration does not always require a separate third site, many customers prefer it for reliability. Priority‑site strategies can be used when the arbitration node fails, but they carry risk if the priority site also fails, making third‑party arbitration the safer choice.
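The trade‑off between third‑party arbitration and a priority‑site policy can be sketched as a small decision function. This is a minimal illustration, not any vendor's algorithm: the `Site` enum, the function name `surviving_sites`, and its parameters are hypothetical names introduced here, and real arrays add timeouts, arbiter locking, and fencing that are omitted.

```python
from enum import Enum


class Site(Enum):
    A = "A"
    B = "B"


def surviving_sites(link_up, a_sees_arbiter, b_sees_arbiter, preferred=Site.A):
    """Return the set of sites allowed to keep serving I/O (hypothetical sketch).

    link_up         -- replication/heartbeat link between A and B is healthy
    a_sees_arbiter  -- site A can reach the third-party arbitration server
    b_sees_arbiter  -- site B can reach the third-party arbitration server
    preferred       -- site that wins when the arbiter cannot break the tie
    """
    if link_up:
        # No partition: both sites stay active.
        return {Site.A, Site.B}
    if a_sees_arbiter and b_sees_arbiter:
        # Both sites ask the arbiter; this sketch lets the preferred site win.
        return {preferred}
    if a_sees_arbiter:
        return {Site.A}
    if b_sees_arbiter:
        return {Site.B}
    # Arbiter unreachable from both sides: fall back to the static priority.
    # If the preferred site is the one that actually failed, service is lost.
    return {preferred}
```

For example, `surviving_sites(False, False, True)` keeps site B running when site A has gone dark, whereas with the arbiter also down the function can only return the preferred site, which is exactly the case where a priority‑site policy is risky.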

Server application clusters also need arbitration, which typically requires only IP‑level reachability rather than the large Layer 2 network.

External Access to Active‑Active Applications

End users reach the applications over the Internet; requests pass through local caches, global DNS, and DNS resolution. Load balancing combines global server load balancing (GSLB) and server load balancing (SLB): the load balancers synchronize IP resources between the data centers and can return the IP address with the lowest round‑trip time (RTT) to the client.
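The "return the lowest‑RTT IP" behaviour can be illustrated with a few lines of Python. This is only a sketch under stated assumptions: the function name `pick_site`, the dictionary keys, and the example addresses (taken from the documentation ranges) are all hypothetical, and a real GSLB also weighs load, persistence, and policy.

```python
def pick_site(candidates):
    """Return the virtual IP of the healthy site with the lowest measured RTT.

    candidates -- list of dicts with hypothetical keys:
                  'vip' (str), 'healthy' (bool), 'rtt_ms' (float)
    Returns None when no site is healthy.
    """
    healthy = [c for c in candidates if c["healthy"]]
    if not healthy:
        return None
    # Lowest round-trip time wins among the healthy sites.
    return min(healthy, key=lambda c: c["rtt_ms"])["vip"]


sites = [
    {"vip": "192.0.2.10", "healthy": True, "rtt_ms": 18.0},     # data center A
    {"vip": "198.51.100.10", "healthy": True, "rtt_ms": 7.5},   # data center B
]
```

With both sites healthy, `pick_site(sites)` returns the address of data center B; if B is marked unhealthy, the same call falls back to A, which is how the DNS layer hides a site failure from clients.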

Gateway‑Based Active‑Active

Due to space constraints, detailed gateway designs are omitted, but the author references Huawei’s VIS storage‑gateway dual‑active solution (diagram shown below).

Gateway dual‑active adds hardware, increasing cost and potential failure points, and may become a performance bottleneck. Adding more gateway nodes (supported by VPLEX, SVC, VIS) mitigates this issue. Gateways also handle storage failover and data synchronization, reducing storage performance pressure by using volume mirroring.

Basic Technical Conditions for Active‑Active

Two essential conditions are required: (1) real‑time data replicas so that a failure on one side can be served by an identical copy, and (2) automatic failover and recovery of server, storage, and network clusters.
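Both conditions can be seen together in a toy mirrored volume: a write is acknowledged only after every online replica holds the data, reads fail over to the surviving copy, and a returning replica is resynchronized. The classes `Replica` and `MirroredVolume` are hypothetical names for this sketch; it ignores concurrency, ordering, and arbitration, which real products must handle.

```python
class Replica:
    """One site's copy of the volume (in-memory stand-in for an array)."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.blocks = {}


class MirroredVolume:
    def __init__(self, site_a, site_b):
        self.replicas = [site_a, site_b]
        # Blocks each replica missed while it was offline.
        self.missed = {site_a.name: set(), site_b.name: set()}

    def write(self, lba, data):
        """Write to every online replica; return how many copies were written."""
        online = [r for r in self.replicas if r.online]
        if not online:
            raise IOError("no replica available")
        for r in self.replicas:
            if r.online:
                r.blocks[lba] = data
            else:
                self.missed[r.name].add(lba)  # remember for later resync
        return len(online)

    def read(self, lba):
        """Read from the first online replica that holds a current copy."""
        for r in self.replicas:
            if r.online and lba not in self.missed[r.name]:
                return r.blocks[lba]
        raise IOError("no replica available")

    def recover(self, replica):
        """Bring a replica back and copy over the blocks it missed."""
        source = next(r for r in self.replicas if r is not replica and r.online)
        replica.online = True
        for lba in self.missed[replica.name]:
            replica.blocks[lba] = source.blocks[lba]
        self.missed[replica.name].clear()
```

The point of the sketch is the ordering: the acknowledgment (the return from `write`) comes after the remote copy is updated, which is why inter‑site latency feeds directly into write latency.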

Application‑Layer Active‑Active

Examples include Oracle RAC, IBM GPFS, Symantec SVC, PowerHA HyperSwap, and Huawei VIS. The author links to a separate article on PowerHA HyperSwap. Below is an illustration of the IBM GPFS dual‑active solution.

IBM GPFS uses IO Failure Group technology for data replica protection and an active‑active cluster for failover. Application‑layer dual‑active is less common because creating volume mirrors and synchronizing data at the server level can heavily impact applications.

NAS‑Based Active‑Active

Most dual‑active solutions are SAN‑based because workloads such as databases, ERP, and SAP demand high performance and reliability. However, NAS dual‑active is also possible (e.g., NetApp FAS, IBM GPFS) because some databases (Oracle RAC, IBM DB2 pureScale) can run directly on NAS.

Limitations and Requirements of Active‑Active

Dual‑active is the highest‑level disaster‑recovery solution and imposes strict requirements:

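• Distance: Sites are typically kept within about 100‑300 km of each other so that strong consistency can be maintained, because every synchronous write has to wait for the remote site. Links longer than 30 km require DWDM equipment with optical repeaters, which supports a maximum of about 3000 km.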

• Network: Low latency, sufficient bandwidth, and low bit‑error rate are essential to support real‑time replication.

• Performance: Both data centers must have comparable hardware capabilities; gateways must not become bottlenecks.
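The distance limit follows from simple arithmetic on propagation delay. The sketch below assumes light travels through fiber at roughly 200,000 km/s, that is, about 5 µs per kilometre one way; that constant, the function name `replication_rtt_ms`, and the `round_trips` parameter are assumptions for illustration, and the result excludes switch, DWDM, and array processing time.

```python
FIBER_US_PER_KM = 5.0  # assumed one-way propagation delay in fiber, microseconds per km


def replication_rtt_ms(distance_km, round_trips=1):
    """Propagation delay added to one synchronous write, in milliseconds.

    distance_km  -- fiber distance between the two sites
    round_trips  -- inter-site round trips the write protocol needs
                    (hypothetical parameter; depends on the protocol used)
    """
    return 2 * distance_km * FIBER_US_PER_KM * round_trips / 1000.0


for km in (30, 100, 300, 3000):
    print(f"{km:>5} km -> {replication_rtt_ms(km):.1f} ms per round trip")
```

Under these assumptions 100 km adds about 1 ms to every write and 300 km about 3 ms, while 3000 km would add about 30 ms, which is why the optical reach of the link is far larger than the distance at which synchronous dual‑active remains practical.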

Whether a solution is truly active‑active (both sites read and write simultaneously) or effectively active‑passive depends on both storage and application support. If the storage is active‑active but the application is not (e.g., VMware), the overall solution behaves as active‑passive.

Multi‑path: Storage‑based dual‑active often requires multipath I/O software on the hosts. VMware provides the Pluggable Storage Architecture (PSA) interface for vendor‑specific multipath modules; other platforms such as XenServer lack such an interface and rely on native multipath support (e.g., ALUA).
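How an ALUA‑aware host might choose a path can be sketched as follows. The state names match the ALUA access states; the function `choose_path`, the path names, and the idea that local‑array paths are reported as optimized and remote‑array paths as non‑optimized are assumptions for this example, not a description of any specific multipath driver.

```python
# Lower number = more preferred. Standby and unavailable paths are not usable for I/O.
ALUA_PRIORITY = {
    "active/optimized": 0,
    "active/non-optimized": 1,
}


def choose_path(paths):
    """Pick the most preferred usable path.

    paths -- list of (path_name, alua_state) tuples
    Returns the chosen path name, or None if no path can carry I/O.
    """
    usable = [(ALUA_PRIORITY[state], name)
              for name, state in paths if state in ALUA_PRIORITY]
    if not usable:
        return None
    return min(usable)[1]


paths = [
    ("remote-array-1", "active/non-optimized"),
    ("local-array-1", "active/optimized"),
    ("local-array-2", "standby"),
]
```

Here `choose_path(paths)` selects `local-array-1`; if the local array fails and only the non‑optimized remote path remains, the same function returns `remote-array-1`, so I/O continues across the inter‑site link at higher latency.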

In conclusion, dual‑active encompasses end‑to‑end active‑active for applications, networks, storage, and data. Application clusters can be built on top of virtualized environments (e.g., Oracle RAC on VMware VMs). The author invites further discussion and support for deeper technical exploration.
