How Ping An Built a Scalable Mesos + Marathon PaaS for Financial Services

This article details how Ping An Technology designed and operates Padis, an internal Docker-based PaaS that evolved onto Mesos + Marathon. The platform provides an independent IP per container, automated networking, load balancing, CMDB integration, monitoring, and log aggregation, and runs thousands of containers for production financial applications.

Efficient Ops

1. The Growth Story of the Padis Platform

Padis (Ping An Distribution) is an internally developed, Docker-based PaaS platform built on Mesos + Marathon. Its key features are container creation and scaling, a dynamically assigned independent IP for each container, software load balancing, and automated operations tasks such as domain resolution, CMDB registration, and monitoring configuration.

2. From Docker to Mesos + Marathon

Early on, provisioning an environment meant installing the OS and middleware by hand, which was slow and carried high operational risk. Docker introduced portable images, but single-host Docker lacked clustering, health checks, and independent container IPs, and that caused compatibility problems with traditional three-tier financial applications.

To address these gaps, Padis adopted Mesos + Marathon, enabling cluster‑wide resource management, rapid scaling, health monitoring, placement constraints, event subscriptions, and extensions like Mesos‑DNS and Mesos‑LB.
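As a rough illustration of what scaling, health monitoring, and placement constraints look like in Marathon, the sketch below builds an app definition of the kind that is posted to Marathon's `/v2/apps` endpoint. The app ID, image name, port, and health path are hypothetical, and the stock bridge network appears only to keep the example self-contained; it is not Padis's actual configuration or its independent-IP networking.

```python
import json


def build_marathon_app(app_id, image, instances, cpus, mem_mb,
                       container_port=8080, health_path="/health"):
    """Return a Marathon app definition as a plain dict (illustrative values)."""
    if instances < 0:
        raise ValueError("instances must be non-negative")
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem_mb,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                "network": "BRIDGE",
                # hostPort 0 asks Marathon to pick a free host port.
                "portMappings": [
                    {"containerPort": container_port, "hostPort": 0, "protocol": "tcp"}
                ],
            },
        },
        # Placement constraint: at most one task of this app per host.
        "constraints": [["hostname", "UNIQUE"]],
        # Marathon replaces a task after maxConsecutiveFailures failed checks.
        "healthChecks": [
            {
                "protocol": "HTTP",
                "path": health_path,
                "portIndex": 0,
                "gracePeriodSeconds": 60,
                "intervalSeconds": 10,
                "timeoutSeconds": 5,
                "maxConsecutiveFailures": 3,
            }
        ],
    }


# Hypothetical app; the registry host and app ID are placeholders.
app = build_marathon_app("/demo/web", "registry.example.internal/demo-web:1.0",
                         instances=4, cpus=0.5, mem_mb=512)
payload = json.dumps(app, indent=2)
print(payload)
```

Scaling an existing app then amounts to resubmitting the definition with a different `instances` value.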

3. Adapting the Mesos + Marathon (MM) Framework for Traditional Applications

Financial applications depend heavily on fixed IP addresses, which Docker's default bridge network could not provide. Padis implemented an independent-IP solution by extending the network module: it uses a Linux bridge or Open vSwitch to give each container an IP on a specific VLAN, and automates IP allocation through an Ansible-driven message center.

The platform manages an IP pool in which every address is in one of three states (allocated, free, or reserved) and automatically reassigns IPs when a container restarts, so that firewall rules remain stable.
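The three-state pool can be sketched as a small in-memory allocator. This is a minimal sketch, not Padis's implementation: the class and method names are hypothetical, and it assumes that "reassigning" on restart means a restarted container gets its previous address back, which is what keeps firewall rules valid.

```python
import ipaddress
from enum import Enum


class IPState(Enum):
    FREE = "free"
    ALLOCATED = "allocated"
    RESERVED = "reserved"


class IPPool:
    """Hypothetical in-memory IP pool with the three states described above."""

    def __init__(self, cidr, reserved=()):
        network = ipaddress.ip_network(cidr)
        # hosts() skips the network and broadcast addresses.
        self._state = {str(ip): IPState.FREE for ip in network.hosts()}
        for ip in reserved:
            if ip not in self._state:
                raise ValueError(f"{ip} is not a usable address in {cidr}")
            self._state[ip] = IPState.RESERVED
        self._owner = {}  # container name -> IP

    def allocate(self, container):
        # Assumption: a restarted container keeps its earlier address.
        if container in self._owner:
            return self._owner[container]
        for ip, state in self._state.items():
            if state is IPState.FREE:
                self._state[ip] = IPState.ALLOCATED
                self._owner[container] = ip
                return ip
        raise RuntimeError("IP pool exhausted")

    def release(self, container):
        # Called when a container is removed for good (scale-in).
        ip = self._owner.pop(container, None)
        if ip is not None:
            self._state[ip] = IPState.FREE

    def state_of(self, ip):
        return self._state[ip]


pool = IPPool("10.0.0.0/29", reserved=["10.0.0.1"])  # example range only
first = pool.allocate("web-1")
after_restart = pool.allocate("web-1")  # same container name, e.g. after a restart
second = pool.allocate("web-2")
print(first, after_restart, second)
```

A production version would persist the state and guard allocation against concurrent requests; both are left out here.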

Load balancing combines hardware (F5) and software (LVS, HAProxy) solutions. The software load balancers are themselves deployed as containers and automatically update their backend configuration during scaling events.
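To make the backend update concrete, the sketch below regenerates an HAProxy `backend` section from the current list of container IPs. The function name, backend name, and port are hypothetical, and how Padis actually renders and reloads its load-balancer configuration is not described in the source.

```python
import ipaddress


def render_haproxy_backend(name, servers, port=8080):
    """Render an HAProxy backend section for the given container IPs."""
    lines = [f"backend {name}", "    balance roundrobin"]
    # Sort numerically so the output is stable regardless of input order.
    ordered = sorted(servers, key=ipaddress.ip_address)
    for index, ip in enumerate(ordered, start=1):
        # "check" enables HAProxy's own health checking for the server.
        lines.append(f"    server {name}-{index} {ip}:{port} check")
    return "\n".join(lines) + "\n"


before = render_haproxy_backend("web", ["10.0.0.2"])
after = render_haproxy_backend("web", ["10.0.0.10", "10.0.0.2"])
# Only reload the load balancer when the rendered section actually changed.
needs_reload = before != after
print(after)
```

Rendering the whole section from the current instance list, rather than patching it, keeps scale-out and scale-in on the same code path.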

4. Expanding Platform Automation

Padis adds CMDB integration, automatically syncing IP information to Ping An’s central CMDB on every scale‑out or scale‑in.
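One plausible way to drive this sync is Marathon's event stream, which emits a `status_update_event` whenever a task changes state. The sketch below turns such an event into a register or deregister action. It is an assumption, not the documented Padis mechanism: the action format and the `ip_by_task` lookup are hypothetical, and the CMDB call itself is left out because its API is not described in the source.

```python
# Mesos task states that mean the container is gone.
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


def cmdb_action(event, ip_by_task):
    """Map a Marathon status_update_event to a hypothetical CMDB action.

    ip_by_task is an assumed lookup from task ID to the IP the platform
    assigned; returns None when the event needs no CMDB change.
    """
    if event.get("eventType") != "status_update_event":
        return None
    ip = ip_by_task.get(event.get("taskId"))
    if ip is None:
        return None
    status = event.get("taskStatus")
    if status == "TASK_RUNNING":
        return {"action": "register", "app": event["appId"], "ip": ip}
    if status in TERMINAL_STATES:
        return {"action": "deregister", "app": event["appId"], "ip": ip}
    return None


# Example events with made-up IDs.
ips = {"demo_web.1": "10.0.0.2"}
started = cmdb_action(
    {"eventType": "status_update_event", "taskStatus": "TASK_RUNNING",
     "appId": "/demo/web", "taskId": "demo_web.1"}, ips)
stopped = cmdb_action(
    {"eventType": "status_update_event", "taskStatus": "TASK_KILLED",
     "appId": "/demo/web", "taskId": "demo_web.1"}, ips)
print(started, stopped)
```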

Monitoring covers application performance via Zabbix, Mesos/Marathon component metrics, and host CPU/memory usage.

Log aggregation is achieved by mounting shared storage into containers; logs are then forwarded to the internal ELK platform for analysis and alerting.

Additional features include WebSSH, one‑click environment cloning, and NAS storage mounting.

5. Outcomes and Production Impact

Today the platform runs two production clusters (an internal ServerFarm cluster and an external DMZ cluster), each with roughly 1,000 CPU cores, more than 6 TB of memory, and capacity for 3,000 to 4,000 containers. Critical financial services, including Ping An WiFi and Gold Manager, run on this infrastructure and have handled high-traffic events such as a fund-sale promotion that reached 1 billion yuan in sales within a minute.
