
How a Nighttime MySQL Migration to RDS Triggered a CPU Crisis—and How We Fixed It

This article recounts a blockchain finance company's rapid user growth, the decision to move MySQL from Docker Swarm to Alibaba Cloud RDS, the ensuing CPU overload caused by missing indexes, and the step‑by‑step troubleshooting and optimization that restored system stability.


Event Origin

The author joined a fast‑growing blockchain finance company whose services run on Alibaba Cloud. All applications, including MySQL, were deployed in Docker Swarm on an ECS cluster. Anticipating user growth, the author realized the MySQL container would soon hit capacity limits, prompting a migration plan.

Migration Plan

Original architecture diagram:

The stack consisted of CDN → WAF → ECS cluster. Docker Swarm connected all ECS hosts, running services such as Nginx, PHP, Redis, and a single MySQL container whose data persisted via a mounted volume. PHP applications accessed MySQL inside the container.

Sample docker-compose YAML (sensitive values redacted):

<code>version: "3"
services:
  ussbao:
    # replace username/repo:tag with your image details
    image: (image name redacted)
    deploy:
      replicas: 1
      restart_policy:
        condition: on-failure
    environment:
      MYSQL_ROOT_PASSWORD: (redacted)
    volumes:
      - "/data/mysql/db1/:/var/lib/mysql/"
      - "/etc/localtime:/etc/localtime"
      - "/etc/timezone:/etc/timezone"
networks:
  default:
    external:
      name: (redacted)</code>

Each database ran as a single MySQL container, without master-slave replication or read-write splitting, and without any performance tuning.

Adjusted Architecture

After migration, MySQL was moved to an RDS instance; other services such as OSS and cloud Redis were also provisioned (not shown).

Migration Process

Steps: provision RDS → back up SQL → import into RDS → update DB configuration → test.

Plan the RDS instance size and create the databases and users.

Add the application servers' IPs to the RDS whitelist.

Use mysqldump to back up each MySQL container.

Import the .sql files into RDS.

Update the PHP projects' DB configuration.

Clear the PHP cache directories.

Perform validation tests.

Enable RDS automated backups.

The migration was executed late at night to avoid user traffic.
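The provisioning step above might look like the following on the RDS instance. This is only a sketch; the database name, user name, and grants are hypothetical placeholders, not the author's actual setup:

<code>-- Create the application database and a dedicated user (names are placeholders)
CREATE DATABASE db1 DEFAULT CHARACTER SET utf8mb4;
CREATE USER 'app_user'@'%' IDENTIFIED BY '********';
GRANT SELECT, INSERT, UPDATE, DELETE ON db1.* TO 'app_user'@'%';
FLUSH PRIVILEGES;</code>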

Snowball Effect

The morning after the migration, at the 9 a.m. traffic peak, the application began returning request timeouts.

Emergency Handling

Investigation revealed RDS CPU at 100% with a flood of connections. A large table, ub_user_calculate (over 2 million rows), lacked indexes, causing full-table scans and connection exhaustion.
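A full-table scan like this is easy to confirm with EXPLAIN. The sketch below assumes a hypothetical query and column (user_id) on the ub_user_calculate table; the actual workload was not published:

<code>-- Without an index on the filtered column, EXPLAIN shows type = ALL
-- and rows close to the full table size (~2 million).
EXPLAIN SELECT * FROM ub_user_calculate WHERE user_id = 12345;

-- List the sessions piling up behind the scans:
SHOW FULL PROCESSLIST;</code>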

Actions taken:

Added missing indexes to the problematic table.

Killed active sessions (some could not be terminated).

Temporarily disabled the domain to stop traffic.

Restarted the RDS instance to lower CPU usage.

Re‑enabled the domain and monitored that sessions and CPU returned to normal.
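The first two actions correspond roughly to statements like these; the column name and session id are hypothetical:

<code>-- Add the missing index (online DDL keeps the table readable on MySQL 5.6+)
ALTER TABLE ub_user_calculate ADD INDEX idx_user_id (user_id);

-- Terminate a runaway session by its PROCESSLIST id
KILL 12345;</code>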

Backup script used for nightly backups:

<code>#!/bin/bash
# Database host
dbserver='*******'
# Database user
dbuser='ganbing'
# Database password
dbpasswd='************'
# Databases to back up, separated by spaces
dbname='db1 db2 db3'
# Backup timestamps
backtime=`date +%Y%m%d%H%M`
out_time=`date +%Y%m%d%H%M%S`
# Backup output and log paths
backpath='/data/backup/mysql'
logpath='/data/backup/logs'

echo "################## ${backtime} #############################"
echo "Starting backup"
# Log header
echo "" >> ${logpath}/mysql_back.log
echo "-------------------------------------------------" >> ${logpath}/mysql_back.log
echo "Backup at ${backtime}, backing up databases: ${dbname}" >> ${logpath}/mysql_back.log
# Dump each database
for DB in $dbname; do
  /usr/bin/mysqldump -h ${dbserver} -u ${dbuser} -p${dbpasswd} ${DB} > ${backpath}/${DB}-${out_time}.sql 2>> ${backpath}/mysqlback.log
  if [ $? -eq 0 ]; then
    cd $backpath
    tar zcf ${DB}-${out_time}.tar.gz ${DB}-${out_time}.sql > /dev/null
    rm -f ${DB}-${out_time}.sql
    # Remove archives older than 15 days
    find $backpath -name "*.tar.gz" -type f -mtime +15 -exec rm -f {} \; > /dev/null 2>&1
    echo "Database ${DB} backed up successfully!" >> ${logpath}/mysql_back.log
  else
    echo "Database ${DB} backup failed!" >> ${logpath}/mysql_back.log
  fi
done

echo "Backup complete"
echo "################## ${backtime} #############################"</code>

Index Usage Strategy and Optimization

Creating Indexes – Best Practices

Add indexes on fields that are frequently queried but rarely updated.

Fields used in ORDER BY or GROUP BY should be indexed.

Limit the number of indexes per table to about six.

Keep indexed column length short.

Primary keys do not need a separate index.

Index columns with high selectivity.
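Putting the last few guidelines together, a short composite index and a quick selectivity check might look like this. The column names on ub_user_calculate are hypothetical:

<code>-- A short, selective composite index; queries must use the leftmost
-- prefix (user_id alone, or user_id + created_at) to benefit from it.
ALTER TABLE ub_user_calculate
  ADD INDEX idx_user_created (user_id, created_at);

-- Check selectivity before indexing: values close to 1.0 are good candidates.
SELECT COUNT(DISTINCT user_id) / COUNT(*) AS selectivity
FROM ub_user_calculate;</code>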

Using Indexes – Pitfalls

A leading % in a LIKE pattern disables the index.

Columns dominated by NULL values make poor index candidates; prefer NOT NULL columns with defaults.

If an OR condition includes an unindexed column, the index on the other column is ignored.

The != operator typically forces a full table scan.

Avoid applying functions or calculations on indexed columns.

Composite indexes require the leftmost prefix to be used in queries.

Match data types exactly to prevent implicit conversion that can bypass indexes.
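These pitfalls are easiest to see side by side. The tables and columns below (users.name, orders.created_at, users.phone) are hypothetical examples, not from the author's schema:

<code>-- Leading wildcard defeats the index; a prefix match can use it.
SELECT * FROM users WHERE name LIKE '%bing';   -- full scan
SELECT * FROM users WHERE name LIKE 'gan%';    -- index range scan

-- A function on the indexed column defeats the index; rewrite as a range.
SELECT * FROM orders WHERE DATE(created_at) = '2018-01-01';   -- full scan
SELECT * FROM orders WHERE created_at >= '2018-01-01'
                       AND created_at <  '2018-01-02';        -- index range scan

-- Implicit type conversion: if phone is VARCHAR, compare against a string.
SELECT * FROM users WHERE phone = 13800000000;    -- conversion may skip the index
SELECT * FROM users WHERE phone = '13800000000';  -- uses the index</code>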

Conclusion

The outage was caused by a missing index on a high‑traffic table, highlighting the importance of thorough schema reviews and proactive performance monitoring before migration. Proper table design and early detection of bottlenecks can prevent similar crises.

Written by Efficient Ops

This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.
