How Hengfeng Bank Built a High‑Availability OpenStack Cloud for Financial Services
This article details Hengfeng Bank's practical experience with OpenStack, covering why the bank chose the open‑source cloud platform, its multi‑site deployment architecture, high‑availability design, management practices, and lessons learned from operating a large‑scale financial cloud environment.
1. OpenStack Status at Hengfeng Bank
Hengfeng Bank, one of China's 13 joint‑stock commercial banks, operates multiple OpenStack clusters across two regions and three data centers, supporting both production and testing environments, multi‑tenant isolation, and more than 200 applications running on over 6,000 virtual machines.
Key features include independent OpenStack instances per network zone, a hyper‑converged architecture with pure‑SSD Ceph storage, and integration with Cisco SDN for dynamic VXLAN binding and port migration.
2. Why Choose OpenStack
The bank prefers OpenStack because it is open source, vendor‑agnostic, and avoids lock‑in; the community offers a large ecosystem, mature codebase, and a robust governance model with thousands of developers worldwide.
OpenStack's licensing model reduces service‑fee costs, and its modular architecture allows the bank to customize and extend components as needed.
3. Deploying OpenStack
3.1 How Hengfeng Deploys OpenStack
The deployment uses separate control and compute nodes, with additional DHCP agents and Ceph monitors placed on compute nodes to avoid Layer 2 network issues. The architecture spans two data centers with symmetric hardware for high availability, employing three control nodes for quorum and a dedicated arbitration node.
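The quorum rationale above reduces to a simple majority rule. The sketch below is illustrative only, not the bank's tooling, and the node counts are assumptions drawn from the description: a strict majority of voters must survive, which is why an even split across two symmetric data centers needs a tie‑breaking arbitration node.

```python
def has_quorum(total_voters: int, alive_voters: int) -> bool:
    """A cluster keeps quorum while a strict majority of voters is reachable."""
    return alive_voters > total_voters // 2

# Three control nodes tolerate the loss of one:
print(has_quorum(3, 2))  # -> True
# Four voters split evenly across two data centers cannot decide on
# their own, which is what a dedicated arbitration node prevents:
print(has_quorum(4, 2))  # -> False
```

With the arbiter counted as a fifth voter, losing an entire data center (two voters) still leaves a three‑of‑five majority.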
3.2 High‑Availability Applications
Control nodes run HAProxy with a primary‑plus‑two‑standby configuration and a virtual IP for API traffic. Database services use Galera clustering across three nodes, providing automatic failover without manual intervention. Memcached also operates in a primary‑standby mode.
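A minimal sketch of how a Galera node's fitness to receive traffic might be judged from its `wsrep_*` status variables. The variable names are genuine Galera status variables; the function and the minimum‑size threshold are illustrative assumptions, not the bank's actual health checks.

```python
def galera_node_healthy(status: dict) -> bool:
    """Decide whether a Galera node should receive traffic.

    `status` maps Galera status variable names to values, as returned by
    `SHOW GLOBAL STATUS LIKE 'wsrep_%'`.
    """
    return (
        status.get("wsrep_cluster_status") == "Primary"    # in the primary component
        and status.get("wsrep_ready") == "ON"              # node accepts queries
        and int(status.get("wsrep_cluster_size", 0)) >= 2  # assumed minimum size
    )

healthy = {"wsrep_cluster_status": "Primary",
           "wsrep_ready": "ON",
           "wsrep_cluster_size": "3"}
partitioned = {"wsrep_cluster_status": "non-Primary",
               "wsrep_ready": "OFF",
               "wsrep_cluster_size": "1"}
print(galera_node_healthy(healthy))      # -> True
print(galera_node_healthy(partitioned))  # -> False
```

In practice HAProxy typically drives a check like this through an HTTP health‑check script (the common `clustercheck` pattern) so that unhealthy backends are removed from the pool automatically.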
4. Managing OpenStack
4.1 Management Approach
The bank isolates fault domains to prevent failures in one network zone from affecting others, and limits cluster size to around 1,000 physical machines for performance reasons. A single Keystone service manages multiple OpenStack clusters, and two identical Ceph clusters provide storage redundancy across data centers.
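The "one Keystone, many clusters" pattern usually works by registering each cluster's API endpoints under its own region in the shared service catalog. The sketch below is a pure‑Python illustration of that lookup; the region names and endpoint URLs are invented for the example.

```python
# Illustrative service catalog: one shared identity service, with each
# OpenStack cluster's endpoints registered under its own region.
# All region names and URLs here are hypothetical.
CATALOG = {
    ("identity", "shared"):   "https://keystone.example.internal:5000/v3",
    ("compute",  "dc1-prod"): "https://nova.dc1.example.internal:8774/v2.1",
    ("compute",  "dc2-prod"): "https://nova.dc2.example.internal:8774/v2.1",
    ("volume",   "dc1-prod"): "https://cinder.dc1.example.internal:8776/v3",
}

def endpoint_for(service: str, region: str) -> str:
    """Resolve a service endpoint the way a Keystone catalog lookup would."""
    try:
        return CATALOG[(service, region)]
    except KeyError:
        raise LookupError(f"no {service} endpoint registered for region {region}")

print(endpoint_for("compute", "dc2-prod"))
```

A real client such as openstacksdk performs this resolution automatically once given a `region_name`, so one set of Keystone credentials can drive any of the clusters.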
4.2 Operational Practices
Comprehensive monitoring covers network, compute, and storage layers; smokeping is used to detect latency issues. The team runs simulated banking workloads across the clouds to validate end‑to‑end functionality. Configuration management is handled with Puppet, and all OpenStack code is sourced from the upstream community via GitLab/GitHub, ensuring a consistent baseline.
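Latency monitoring of the kind smokeping performs boils down to collecting round‑trip‑time samples and flagging threshold breaches. The sketch below is illustrative, not the bank's tooling: the regex targets common `ping` reply output, and the 5 ms threshold is an assumption.

```python
import re

def parse_rtt_ms(ping_line: str) -> float:
    """Extract the round-trip time in ms from a ping reply line,
    e.g. '... time=0.482 ms'."""
    match = re.search(r"time[=<]([\d.]+)\s*ms", ping_line)
    if match is None:
        raise ValueError(f"no RTT found in: {ping_line!r}")
    return float(match.group(1))

def latency_breaches(samples_ms, threshold_ms=5.0):
    """Return the RTT samples that exceed the alerting threshold."""
    return [s for s in samples_ms if s > threshold_ms]

line = "64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.482 ms"
print(parse_rtt_ms(line))                      # -> 0.482
print(latency_breaches([0.4, 6.2, 1.1, 9.8]))  # -> [6.2, 9.8]
```

Feeding breaches like these into an alerting pipeline is one simple way to catch the cross‑data‑center latency regressions that tools such as smokeping visualize.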
5. Conclusion
The bank does not rely on any vendor‑specific OpenStack distribution; instead, it customizes the upstream community version, applying patches as needed while contributing back to the ecosystem. This approach avoids lock‑in and maintains control over the cloud stack.
Q & A
Q: How many people are involved in the OpenStack team? A: Only three to five engineers, because the bank maintains a stable, well‑understood feature set with targeted patches rather than tracking the full upstream codebase.
Q: What typical problems have you encountered? A: Frequent live‑migration bugs, patch‑management challenges, and occasional host‑level failures that require rapid remediation.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.