How Vivo Scales Kubernetes: Automated Multi‑Cluster Management with a Custom Operator
Vivo’s rapid migration to Kubernetes across multiple data centers required a secure, efficient, and reliable way to manage thousands of nodes. In response, the company developed a custom k8s‑operator that streamlines cluster deployment, CI testing, declarative APIs, and automated repair for large‑scale cloud‑native environments.
Vivo Large‑Scale Kubernetes Cluster Automation Practices
As Vivo’s services migrate to Kubernetes, the need to deploy Kubernetes across multiple data centers has grown, and managing thousands of nodes raises challenges of security, efficiency, and reliability.
At the 2022 GOPS Global Operations Conference in Shenzhen, a senior R&D engineer from Vivo presented the company’s self‑developed k8s‑operator, which includes cluster deployment optimization, CI matrix testing, declarative APIs, and an operator architecture designed for massive scale.
The solution enables a single cluster administrator to create and manage thousands of Kubernetes nodes, offering a highly scalable, declarative, and self‑healing cloud‑native system ideal for unified management of Vivo’s large‑scale clusters.
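The declarative, self‑healing model described above can be illustrated with a minimal reconcile loop. This is only a sketch of the general operator pattern, not Vivo’s k8s‑operator: the names ClusterSpec, ClusterStatus, and reconcile, the fields they carry, and the node naming scheme are all hypothetical, and the "cluster" is an in‑memory dictionary rather than a real Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterSpec:
    """Desired state declared by the administrator (hypothetical fields)."""
    name: str
    node_count: int


@dataclass
class ClusterStatus:
    """Observed state: node name -> healthy flag (hypothetical, in-memory)."""
    nodes: dict = field(default_factory=dict)


def reconcile(spec: ClusterSpec, status: ClusterStatus) -> list:
    """Drive observed state toward desired state and return the actions taken.

    The function is idempotent: on an already converged cluster it returns [].
    """
    actions = []

    # Self-healing: repair every node currently reported as unhealthy.
    for node, healthy in sorted(status.nodes.items()):
        if not healthy:
            status.nodes[node] = True
            actions.append(f"repair {node}")

    # Scale up: add nodes under the first unused names until the count matches.
    index = 0
    while len(status.nodes) < spec.node_count:
        node = f"{spec.name}-node-{index}"
        if node not in status.nodes:
            status.nodes[node] = True
            actions.append(f"add {node}")
        index += 1

    # Scale down: remove the lexicographically last node until the count matches.
    while len(status.nodes) > spec.node_count:
        node = max(status.nodes)
        del status.nodes[node]
        actions.append(f"remove {node}")

    return actions


if __name__ == "__main__":
    spec = ClusterSpec(name="demo", node_count=3)
    status = ClusterStatus(nodes={"demo-node-0": False})
    print(reconcile(spec, status))  # repairs one node, then adds two
    print(reconcile(spec, status))  # already converged, so no actions
```

The administrator only edits the spec (for example, changing node_count); the loop works out which repairs, additions, or removals are needed. A real operator would run this comparison continuously against the Kubernetes API instead of a dictionary.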
Speaker: Zhang Rong, Senior R&D Engineer, Vivo Internet, focusing on Kubernetes scheduling, cluster management, and GPU technologies.