Database Performance Optimization and Governance for a High‑Load Application
This technical report analyzes a high‑traffic MySQL deployment with three servers, detailing disk usage, table space, QPS and slow‑SQL issues, and presents a comprehensive governance plan that includes data migration, read‑traffic offloading to replicas, MyBatis interceptor implementation, JSF monitoring, and Python automation to reduce disk pressure and improve query performance.
Background
The core application relies on a MySQL cluster (one master and two slaves) that has been operating at high resource utilization for years, with CPU, disk, and QPS all near capacity, requiring urgent pre‑promotion optimization.
Database Overview
The cluster consists of three physical machines (master 1x.x.x.36, slaves 1x.x.x.73 and 1x.x.x.135) each with 64 CPU cores, 256 GB RAM, and 16 TB of RAID‑0 storage, running MySQL 5.5.14.
Disk Space
By the end of February, disk usage on all nodes exceeded 65 % with limited remaining space, prompting a need to free up storage through data cleanup and archiving.
Table Space
Several tables exceed 100 GB and contain billions of rows, notably the xxx_status table with 30 billion rows.
QPS Situation
The master node handles the majority of queries, reaching peak QPS of 30 k/s, while the slaves are under‑utilized.
Slow SQL
Occasional slow queries (>1 s) on both master and slaves cause disk I/O spikes and affect overall system stability.
Governance Goals
Archive or delete table space >100 GB to reduce disk usage by ~10 % (≈470 GB).
Shift high‑frequency read traffic from master to slaves to lower master QPS by ~30 %.
Eliminate slow SQL: fully remove >10 s queries, reduce >5 s queries by 80 %, and >1 s queries by 60 %.
Governance Plan
1. Large Table Data Migration
Different tables require different handling: historical tables are deleted, pure history tables are archived to a big‑data platform before deletion, and core business tables are migrated to a historical database after archiving.
import org.apache.ibatis.executor.Executor;
import org.apache.ibatis.mapping.BoundSql;
import org.apache.ibatis.mapping.MappedStatement;
import org.apache.ibatis.plugin.*;
@Intercepts({@Signature(type = Executor.class, method = "query", args = {MappedStatement.class, Object.class, RowBounds.class, ResultHandler.class})})
public class ParameterInterceptor implements Interceptor {
private final Map
mappedStatementIdMap = new HashMap<>();
@Override
public Object intercept(Invocation invocation) throws InvocationTargetException, IllegalAccessException {
Object[] queryArgs = invocation.getArgs();
MappedStatement mappedStatement = null;
if (queryArgs[0] instanceof MappedStatement) {
mappedStatement = (MappedStatement) queryArgs[0];
}
if (mappedStatement != null && mappedStatementIdMap.containsKey(mappedStatement.getId())) {
BoundSql boundSql = mappedStatement.getBoundSql(queryArgs[1]);
List
parameterMappingList = boundSql.getParameterMappings();
if (parameterMappingList.isEmpty()) {
Profiler.businessAlarm(mappedStatement.getId(), "查询参数为空");
throw new BadArgumentException("查询参数为空,请确认入参是否有值");
}
}
return invocation.proceed();
}
@Override
public Object plugin(Object target) { return Plugin.wrap(target, this); }
@Override
public void setProperties(Properties properties) {
for (Object key : properties.keySet()) {
mappedStatementIdMap.put(key, 1);
}
}
}2. Intercept Empty‑Parameter Queries
A MyBatis plugin rejects any query without parameters, preventing full‑table scans.
3. Read‑Traffic Offloading to Slaves
JSF filters record API call counts; a Python script retrieves the top‑10 most‑called methods and guides the addition of annotations to route those reads to slave replicas.
def get_jsf(start_time, end_time):
url = 'http://xxx.taishan.jd.com/api/xxx/xxx/xxx/'
params = {
'startTime': start_time,
'endTime': end_time,
'endPointKey': 'api.jsf.provider.method.count.key',
'quickTime': int((end_time - start_time) / 1000),
'markFlag': 'true',
'markLimit': 500
}
res = requests.post(url=url, data=json.dumps({}), params=params, headers=headers)
# Process response and generate Excel report
...4. Monitoring and Reporting
JSF filter implementation logs each API invocation to a monitoring system, enabling statistical analysis of call frequencies.
@Component
public class JsfInvokeFilter extends AbstractFilter {
private static final String API_PROVIDER_METHOD_COUNT_KEY = "api.jsf.provider.method.count.key";
private static final String API_CONSUMER_METHOD_COUNT_KEY = "api.jsf.consumer.method.count.key";
@Override
public ResponseMessage invoke(RequestMessage requestMessage) {
String key = RpcContext.getContext().isProviderSide() ? API_PROVIDER_METHOD_COUNT_KEY : API_CONSUMER_METHOD_COUNT_KEY;
String method = requestMessage.getClassName() + "." + requestMessage.getMethodName();
Map
tags = new HashMap<>(2);
tags.put("bMark", method);
tags.put("bCount", "1");
Profiler.sourceDataByStr(key, tags);
return getNext().invoke(requestMessage);
}
}Results
After implementing the plan, disk usage decreased by ~470 GB (10 %), master QPS peak dropped from 25 k to 17.5 k (≈30 % reduction), and slow‑SQL incidents were largely eliminated according to the defined thresholds.
Conclusion
The adopted governance strategy focused on safe, incremental improvements—data archiving, read‑traffic redistribution, and query validation—while avoiding high‑risk actions such as immediate sharding or cloud migration. Future scaling may involve cloud migration combined with sharding when traffic grows beyond current capacity.
JD Tech Talk
Official JD Tech public account delivering best practices and technology innovation.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.