Design and Implementation of a Cloud‑Native MySQL Container Platform for High Availability and Resource Efficiency
This article describes how a bank built CDD, a Kubernetes-based containerized MySQL service platform, to improve database high availability, resource utilization, operational automation, and delivery speed. The team addressed network, storage, scheduling, and management challenges with custom networking, hybrid storage, scheduler extensions, and multi-AZ deployment.
Background
After the ACS private cloud launched, rapid growth in MySQL RDS services and stricter business-continuity requirements created a need for a robust cloud MySQL support system. The existing VM-based single-instance MySQL deployments suffered from unstable environments, low resource utilization, and high operational costs.
Goal
In line with the data-center cloud strategy, the team set out to build a self-controlled, highly available, automated MySQL cloud service that improves delivery speed, stability, and resource efficiency.
Challenges
Moving to containers raised difficulties in five areas: network performance, storage I/O, scheduling of stateful workloads, resource-utilization metrics, and integration with management tooling.
Solution Overview
1. Network: The team evaluated common container networking options such as Flannel and Open vSwitch and found tunnel-based solutions unsuitable for high-throughput database traffic. They adopted SR-IOV hardware virtualization for high-performance networking, added MacVLAN for less demanding workloads, and defined three network modes (Overlay, MacVLAN, SR-IOV) so that each workload gets the mode that matches its requirements.
2. Storage: Traditional distributed filesystems (GlusterFS, CephFS, HDFS) could not sustain I/O-heavy MySQL workloads. The platform instead uses local storage, optionally combined with SAN or S2D hybrid storage, to provide both high I/O throughput and high availability.
3. Scheduling: The team developed a custom scheduler extender (CDD-Scheduler) that spreads MySQL primary nodes, network types, and storage classes evenly across physical hosts and uses monitoring data (Prometheus, Zabbix) for workload-aware placement.
4. Resource Utilization: The platform enforces a moderate over-commit policy (about 150% for containers) and protects CPU, memory, network, and storage with circuit-breaker mechanisms to avoid performance degradation.
5. Management: Integration with a DBaaS layer provides unified APIs, automatic rule-based resource allocation, and a seamless connection to the existing monitoring and alerting systems.
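The scheduling and over-commit ideas in items 3 and 4 can be illustrated with a minimal Python sketch. It is not the CDD-Scheduler implementation, which the article does not show. The `Node` fields, the function names, and the 0.6/0.4 score weights are hypothetical; only the roughly 150% over-commit ceiling and the goals of spreading primaries and using monitoring data come from the text.

```python
from dataclasses import dataclass

OVERCOMMIT_RATIO = 1.5  # the ~150% container over-commit figure from the text


@dataclass
class Node:
    name: str
    cpu_cores: float      # physical CPU capacity of the host
    cpu_requested: float  # CPU already requested by pods placed on the host
    primaries: int        # MySQL primary instances currently on the host
    load: float           # recent utilization in [0, 1], e.g. from monitoring


def filter_nodes(nodes, cpu_request):
    """Filter step: keep hosts where the request fits under the over-commit ceiling."""
    return [
        n for n in nodes
        if n.cpu_requested + cpu_request <= n.cpu_cores * OVERCOMMIT_RATIO
    ]


def score(node, max_primaries):
    """Prioritize step: higher is better (fewer primaries, lower observed load).

    The 0.6 / 0.4 weights are illustrative assumptions, not values from the article.
    """
    spread = 1.0 - (node.primaries / max_primaries if max_primaries else 0.0)
    return 0.6 * spread + 0.4 * (1.0 - node.load)


def pick_node(nodes, cpu_request):
    """Return the best host for a new primary, or None if nothing fits."""
    candidates = filter_nodes(nodes, cpu_request)
    if not candidates:
        return None
    max_primaries = max(n.primaries for n in candidates)
    # Sort by descending score, then by name for a deterministic tie-break.
    return min(candidates, key=lambda n: (-score(n, max_primaries), n.name))


if __name__ == "__main__":
    hosts = [
        Node("a", 32, 46, 1, 0.2),  # 46 + 4 exceeds 32 * 1.5 = 48, so filtered out
        Node("b", 32, 20, 3, 0.3),
        Node("c", 32, 30, 1, 0.5),
    ]
    print(pick_node(hosts, cpu_request=4).name)
```

The two-phase shape (filter, then score) mirrors how Kubernetes scheduler extenders are typically structured. A real extender would expose these steps as HTTP endpoints and would also weigh network mode and storage class, which this sketch leaves out.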
Architecture Highlights
The CDD platform includes a custom K8s operator for rapid DB deployment, a flat‑network design, a proprietary network plugin supporting cross‑segment IP allocation, a container‑level HA daemon, SR‑IOV‑based fixed IPs, hybrid storage plugins, and automated backup/monitoring integration.
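The article does not describe the internals of the network plugin, so the following is only a sketch of what "fixed IPs with cross-segment allocation" could look like. The `FixedIPPool` class, its method names, and the example CIDR ranges are all hypothetical. The idea shown is that an instance keeps the same address when it asks again (for example after a container restart) and that allocation moves on to the next segment when one is exhausted.

```python
import ipaddress


class FixedIPPool:
    """Hands out one stable IP per instance name across several subnets (hypothetical)."""

    def __init__(self, cidrs):
        self._networks = [ipaddress.ip_network(c) for c in cidrs]
        self._by_owner = {}   # instance name -> assigned address
        self._in_use = set()  # addresses currently assigned

    def allocate(self, owner):
        # A repeated request from the same owner returns the same address,
        # which is what keeps the IP fixed across restarts.
        if owner in self._by_owner:
            return self._by_owner[owner]
        # Walk the segments in order and take the first free host address.
        for net in self._networks:
            for addr in net.hosts():
                if addr not in self._in_use:
                    self._in_use.add(addr)
                    self._by_owner[owner] = addr
                    return addr
        raise RuntimeError("no free addresses in any segment")

    def release(self, owner):
        # Return the owner's address to the pool; unknown owners are ignored.
        addr = self._by_owner.pop(owner, None)
        if addr is not None:
            self._in_use.discard(addr)


if __name__ == "__main__":
    pool = FixedIPPool(["10.0.0.0/30", "10.0.1.0/30"])  # example ranges only
    for name in ("mysql-0", "mysql-1", "mysql-2", "mysql-0"):
        print(name, pool.allocate(name))
```

A production plugin would persist the owner-to-address mapping outside the process so that it survives plugin restarts. The in-memory dictionary here is used only to keep the example short.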
Application Promotion
Since Q4 2019, the CDD platform has delivered over 3,000 MySQL Group Replication (MGR) clusters to the bank’s headquarters and 44 regional branches, providing high availability, elastic scaling, and automated operations.
Value and Benefits
MySQL MGR combined with K8s/Docker offers high read/write performance, low replication lag, and robust disaster recovery. The cloud-native approach improves environment stability, resource utilization, and cost efficiency and shortens delivery cycles, while maintaining the strict security and compliance that financial services require.
Conclusion
By rebuilding the database stack on a cloud-native, containerized foundation, the bank achieved higher availability, better resource efficiency, and lower operational costs, and gained a platform that can adapt to future technological change.
Aikesheng Open Source Community
The Aikesheng Open Source Community provides stable, enterprise‑grade MySQL open‑source tools and services, releases a premium open‑source component each year (1024), and continuously operates and maintains them.