
How to Build High‑Availability Kubernetes Clusters with Volcengine VKE & VCI

This guide explains how Volcengine's VKE (Kubernetes Engine) and VCI (Elastic Container Instance) enable high‑availability, multi‑AZ deployments, covering cluster creation, control‑plane distribution, virtual node configuration, inventory‑aware scheduling, and practical YAML examples for resilient cloud‑native workloads.

ByteDance Cloud Native

Over the past decade, digital transformation has driven industries such as finance, retail, manufacturing, telecom, healthcare, and automotive to rely on digital services and infrastructure, making continuous availability critical for business continuity and societal stability.

Volcengine's cloud‑native products, built on ByteDance's extensive experience, provide high elasticity and high availability and keep services running smoothly through large‑scale traffic events. The Volcengine Kubernetes Engine (VKE) is a container‑centric, high‑performance managed Kubernetes service, while the Elastic Container Instance (VCI) adds Serverless capabilities for on‑demand, pay‑as‑you‑go resource consumption.

Common concerns include improving cluster availability, leveraging multi‑AZ deployments, and ensuring rapid provisioning of Serverless elastic resources.

Building a High‑Availability VKE Cluster

Distribute the control plane across multiple Availability Zones (AZs) by selecting a high‑availability cluster and creating subnets in at least three AZs.

Deploy business Pods across these AZs using corresponding Pod subnets.

This architecture prevents single‑AZ failures from causing service interruptions.
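The cross‑AZ Pod placement described above can be expressed with a standard Kubernetes topology spread constraint. The fragment below is a minimal sketch rather than a VKE‑specific recipe: the Deployment name and app label (web-ha, web) are hypothetical, and it assumes the nodes carry the well‑known topology.kubernetes.io/zone label.

<code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-ha              # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: web
        maxSkew: 1
        # Count replicas per zone rather than per node.
        topologyKey: topology.kubernetes.io/zone
        # Leave a replica Pending instead of exceeding the allowed skew.
        whenUnsatisfiable: DoNotSchedule
</code>

DoNotSchedule makes the spread a hard requirement; ScheduleAnyway would relax it to a preference, trading strict balance for a better chance of scheduling when one zone is short on capacity.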

Step‑by‑Step Configuration

Log in to the Container Service console.

Navigate to the Cluster section.

Click “Create Cluster” and configure parameters.

Select the latest Kubernetes version.

Choose control‑plane subnets in three different AZs.

Choose Pod subnets in three or more AZs (ensure sufficient IP capacity).

Optionally create node pools; VKE supports node, Serverless, and hybrid pool types.

For node pools, select subnets across three AZs and enable a balanced strategy so nodes are spread evenly.

VCI Virtual Node High‑Availability Configuration

Install the vci-virtual-kubelet component to enable virtual nodes.

Optionally add CSI, Ingress Nginx, logging, and monitoring components.

Virtual nodes carry a specific taint (vci.vke.volcengine.com/node-type=vci:NoSchedule) and a label (node.kubernetes.io/instance-type=virtual-node). To schedule Pods on virtual nodes, add the annotation vke.volcengine.com/burst-to-vci: enforce, which the webhook maps to node selectors and tolerations.
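To make the taint, label, and annotation concrete, the sketch below shows a Pod that carries the annotation together with the node selector and toleration it is described as mapping to. The Pod and container names are hypothetical, and the exact fields the webhook injects are an assumption based on the taint and label quoted above; in practice the annotation alone should be sufficient.

<code>apiVersion: v1
kind: Pod
metadata:
  name: vci-demo            # hypothetical name
  annotations:
    vke.volcengine.com/burst-to-vci: enforce
spec:
  # Assumed equivalent of what the webhook adds for the annotation above.
  nodeSelector:
    node.kubernetes.io/instance-type: virtual-node
  tolerations:
  - key: vci.vke.volcengine.com/node-type
    operator: Equal
    value: vci
    effect: NoSchedule
  containers:
  - name: app
    image: nginx
</code>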

Existing Cluster HA Refactoring

If a cluster was created with subnets in fewer than three AZs, add control‑plane subnets for the missing AZs; the API server then performs a rolling restart, which distributes control‑plane components across AZs.

For VCI virtual nodes, configure subnets in each AZ; missing subnets appear as “Pending Pod Subnet” and can be added via the console.

General‑Purpose Compute Spec (u1)

VCI offers a “u1” spec that abstracts CPU generation differences and provides inventory‑aware scheduling based on actual resource levels, improving performance for workloads insensitive to CPU generations.

Specify the spec via the Pod annotation vci.vke.volcengine.com/preferred-instance-family: vci.u1.

YAML Example for VCI Deployment

<code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-vci
  namespace: default
spec:
  replicas: 3
  selector:
    matchLabels:
      app: test-vci
  template:
    metadata:
      annotations:
        # Request the general-purpose u1 spec.
        vci.vke.volcengine.com/preferred-instance-family: vci.u1
        vci.volcengine.com/tls-enable: "true"
        # Schedule these Pods onto VCI virtual nodes.
        vke.volcengine.com/burst-to-vci: enforce
      labels:
        app: test-vci
    spec:
      containers:
      - name: test
        image: nginx
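      topologySpreadConstraints:
      # Spread replicas evenly across nodes (keyed by hostname). To spread
      # across AZs instead, use topologyKey: topology.kubernetes.io/zone.
      - labelSelector:
          matchLabels:
            app: test-vci
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        # Soft constraint: Pods are still scheduled if the skew cannot be met.
        whenUnsatisfiable: ScheduleAnyway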
</code>

To distribute Pods across subnets, add the annotation vke.volcengine.com/preferred-subnet-ids with a comma‑separated list of subnet IDs.
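As a rough illustration, the annotation would sit alongside the others in the Pod template. The subnet IDs below are placeholders, not real values, and the annotation key is copied as written in the text above; check it against the product documentation before use.

<code>  template:
    metadata:
      annotations:
        vke.volcengine.com/burst-to-vci: enforce
        # Placeholder IDs; replace with Pod subnets located in different AZs.
        vke.volcengine.com/preferred-subnet-ids: subnet-aaaa,subnet-bbbb,subnet-cccc
</code>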

Inventory‑Aware Scheduling

Enable the vci-virtual-kubelet and scheduler-plugin components and turn on the inventory‑aware scheduling switch. The scheduler then queries VCI resource stock in each AZ and places Pods in zones with sufficient capacity; if all zones lack stock, Pods remain pending until resources become available.

For more on using virtual nodes, refer to the product documentation.

In conclusion, as cloud adoption continues globally, ensuring business continuity through robust high‑availability designs remains a long‑term challenge. Volcengine’s cloud‑native team aims to provide reliable solutions that help enterprises fully leverage cloud resources.
