Running Kubernetes Across Multiple Failure Zones
This article explains how Kubernetes clusters can be deployed across multiple failure zones and regions, detailing control plane replication, node labeling, pod topology constraints, storage zone awareness, network considerations, and disaster recovery strategies to achieve high availability in cloud‑native environments.
Background
Kubernetes is designed so that a single cluster can operate across multiple failure zones, which are logical groupings within a region. Major cloud providers define a region as a set of failure zones (also called availability zones) that provide consistent APIs and services.
Typical cloud architectures aim to minimize the chance that a failure in one zone will affect services in another zone.
Control Plane Behavior
All control‑plane components support running as a pool of interchangeable, replicated resources. When deploying the cluster control plane, place replicas of each component (API server, scheduler, etcd, controller‑manager) across multiple failure zones, ideally at least three. If you run a cloud‑controller‑manager, replicate it across the chosen zones as well.
Note: Kubernetes does not provide built‑in cross‑zone resilience for the API server endpoints. You can improve API server availability with techniques such as DNS round‑robin, SRV records, or a third‑party load balancer with health checking.
Node Behavior
Kubernetes automatically spreads the Pods of workload resources (such as Deployments and StatefulSets) across different nodes to reduce the impact of failures.
When a node starts, its kubelet adds labels to the node object, which can include zone information.
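On cloud clusters these include the well‑known topology labels. As an illustration (the node name, region, and zone values below are made up; your provider sets the real ones), a registered node might carry labels like:

```yaml
# Illustrative labels on a Node object in a zoned cloud cluster.
apiVersion: v1
kind: Node
metadata:
  name: worker-a-1                          # example node name
  labels:
    kubernetes.io/hostname: worker-a-1
    topology.kubernetes.io/region: us-east-1   # region label set by the cloud provider
    topology.kubernetes.io/zone: us-east-1a    # zone label set by the cloud provider
```

Workload placement rules later in this article key off these `topology.kubernetes.io/*` labels.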
If the cluster spans multiple zones, you can combine node labels with pod topology spread constraints to control how pods are distributed across fault domains (zones, regions, or specific nodes). This helps the scheduler place pods for better expected availability.
For example, you can declare a constraint ensuring that the three replicas of a StatefulSet run in three distinct zones without explicitly specifying each zone.
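A sketch of that constraint, with illustrative names (`db`, the pause image) standing in for a real workload; `topology.kubernetes.io/zone` is the standard zone label:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db                  # illustrative name
spec:
  serviceName: db
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      topologySpreadConstraints:
      - maxSkew: 1                            # per-zone replica counts may differ by at most 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule      # hard rule: leave pods Pending rather than skew
        labelSelector:
          matchLabels:
            app: db
      containers:
      - name: db
        image: registry.k8s.io/pause:3.9      # placeholder image for the sketch
```

With three replicas, three zones, and `maxSkew: 1`, the scheduler ends up placing one replica per zone without any zone being named explicitly.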
Cross‑Zone Node Distribution
Kubernetes does not create nodes for you; you must provision them yourself or use tools like Cluster API to manage node creation and automatic repair across failure domains.
Manual Zone Assignment for Pods
You can apply node‑selector constraints to Pods or to the pod templates of workload resources (Deployments, StatefulSets, Jobs).
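For example, a Pod can be pinned to one zone via a `nodeSelector` on the standard zone label (the pod name and zone value below are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: zone-pinned-pod                         # illustrative name
spec:
  nodeSelector:
    topology.kubernetes.io/zone: europe-west1-b   # example zone; use one from your cluster
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9              # placeholder image
```

The same `nodeSelector` block can be placed in the pod template of a Deployment, StatefulSet, or Job to pin all of its pods to that zone.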
Zone‑Aware Storage Access
When a PersistentVolume is created, the PersistentVolumeLabel admission controller automatically adds a zone label to the volume. The scheduler then uses the NoVolumeZoneConflict predicate to ensure that a pod claiming that volume is scheduled in the same zone.
You can specify a StorageClass for PersistentVolumeClaims that defines which failure zones the storage may reside in. Refer to allowed topology documentation for configuring zone‑aware StorageClasses.
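A sketch of such a StorageClass, assuming a GCE‑style provisioner and example zone names (substitute your CSI driver and real zones):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zoned-ssd                       # illustrative name
provisioner: kubernetes.io/gce-pd       # example provisioner; use your CSI driver's name
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer # delay binding until a pod schedules, so the
                                        # volume is provisioned in that pod's zone
allowedTopologies:                      # restrict provisioning to these failure zones
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-central1-a
    - us-central1-b
```

`WaitForFirstConsumer` is the usual companion to `allowedTopologies`: without it, a volume can be provisioned in a zone before the scheduler knows where the consuming pod will land.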
Network
By itself, Kubernetes is not zone‑aware for networking; networking is configured through a network plugin, and some network implementations have zone‑specific elements. For example, if your cloud provider supports Services of type=LoadBalancer, the load balancer may route traffic only to pods in the same zone as the load‑balancer element handling a given connection.
Custom or on‑prem deployments also need to consider similar issues; service and ingress behavior across zones depends on how the cluster is set up.
Failure Recovery
When setting up the cluster, also consider how to recover if all failure zones within a region become unavailable at the same time. Ensure that critical recovery work does not depend on having at least one healthy node in the cluster; design recovery jobs with special tolerations so they can still be scheduled when no node is initially healthy.
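One sketch of such a special‑tolerance job (the job name and image are illustrative; the taint keys are the standard ones Kubernetes applies to unhealthy nodes):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: cluster-repair                  # illustrative name
spec:
  template:
    spec:
      restartPolicy: OnFailure
      tolerations:                      # allow scheduling onto nodes tainted as unhealthy
      - key: node.kubernetes.io/not-ready
        operator: Exists
        effect: NoExecute
      - key: node.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
      containers:
      - name: repair
        image: registry.k8s.io/pause:3.9   # placeholder; substitute your recovery tooling
```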
Kubernetes does not provide a built‑in solution for this scenario, but it is an important consideration for high‑availability designs.
Architects Research Society