Performance Optimization of Helios Scoring Service Using Arthas Tracing
This article documents how the Helios scoring service, which processes hundreds of thousands of data points per day, was progressively optimized from hundreds of milliseconds (occasionally several seconds) to tens of milliseconds. The work consisted of analyzing Arthas trace data, refactoring loops, reducing object creation, and improving date handling. In the end, database access is the remaining bottleneck.
Background
The Helios system processes a large volume of data; querying a full day's scores for all services returns 1440 minutes of scores for each application, resulting in hundreds of thousands of data points and occasional interface latency of several seconds.
This article records how to use Arthas to reduce the interface latency from hundreds of milliseconds to dozens of milliseconds.
From the trace, fetching a whole day's data takes about 300 ms on the network, while the database query itself is only 11 ms, indicating that most time is spent assembling the data in the application.
Optimization Process
The main focus is on tracing and refactoring the code rather than understanding the business logic.
Initial Unoptimized Version
Code
    // Excerpt from the request handler; request and response come from the enclosing method.
    List<HeliosScore> heliosScores = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScores)) {
        return response;
    }

    Set<String> dateSet = new HashSet<>();
    // Key type assumed: appId as String.
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

    for (List<HeliosScore> value : groupByAppIdHeliosScores.values()) {
        value.sort(Comparator.comparing(HeliosScore::getTimeFrom));
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(value.get(0).getNamespace());
        score.setAppId(value.get(0).getAppId());
        for (HeliosScore heliosScore : value) {
            // Each merged record is split back into one object per minute.
            List<HeliosScore> splitHeliosScores = heliosScore.split();
            for (HeliosScore splitHeliosScore : splitHeliosScores) {
                if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) {
                    continue;
                }
                if (splitHeliosScore.getTimeFrom().compareTo(request.getEndTime()) > 0) {
                    break;
                }
                // Formats a date on every inner iteration.
                dateSet.add(DateUtils.yyyyMMddHHmm.formatDate(splitHeliosScore.getTimeFrom()));
                if (splitHeliosScore.getScores() == null) {
                    splitHeliosScore.setScores("100");
                    log.error("Missing data: {}", heliosScore);
                }
                score.add(Math.max(0, Integer.parseInt(splitHeliosScore.getScores())), null);
            }
        }
        response.getValues().add(score);
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted().collect(Collectors.toList()));
    return response;

Arthas Trace
---ts=2021-08-17 16:28:00;thread_name=http-nio-8080-exec-10;id=81;... [trace output showing method timings] ...
Analysis
The trace shows a total of about 4 seconds, but the actual end‑to‑end latency is around 350‑450 ms; the extra time comes from Arthas itself because the traced method contains many loops, which heavily impacts performance.
The function contains three nested loops: the outer loop iterates over ~140 appIds, the middle loop over the already merged data (typically 1 entry per day), and the innermost loop over 1440 minutes.
The most expensive operation in the trace is the SimpleDateFormat-based formatting call, DateUtils.yyyyMMddHHmm.formatDate().
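To put the loop structure in numbers, here is a minimal sketch that multiplies out the approximate figures quoted in the analysis (about 140 appIds, one merged record per app per day, 1440 minutes). The class and method names are illustrative and are not part of Helios.

```java
/** Back-of-the-envelope count of inner-loop iterations for one full-day query. */
class LoopCostEstimate {

    /** Total number of times the innermost loop body runs. */
    static long innerIterations(int appIds, int mergedRecordsPerApp, int minutesPerRecord) {
        return (long) appIds * mergedRecordsPerApp * minutesPerRecord;
    }

    public static void main(String[] args) {
        // Approximate figures from the analysis; real counts vary per request.
        long iterations = innerIterations(140, 1, 1440);
        System.out.println(iterations); // 201600
    }
}
```

Every call placed in the innermost loop (date formatting, object splitting, integer parsing) therefore runs roughly 200,000 times per request, which is why individually cheap calls show up in the trace.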
First Optimization
Optimization Direction
Change the iteration strategy: instead of splitting each large merged object into many small per-minute objects and iterating over those, iterate over the time points logically. This avoids creating hundreds of thousands of objects.
Replace Set<String> dateSet with Set<Date> dateSet so that formatDate() is called once per distinct minute instead of once per data point.
Replace repeated Integer.parseInt calls with a pre-built Map dictionary (later tests showed Integer.parseInt was still fastest).
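The second point above can be sketched in isolation as follows. This is a simplified illustration, not the Helios code: the class name, the method name, and the "yyyyMMddHHmm" pattern are assumptions (the real DateUtils.yyyyMMddHHmm helper is not shown in the article).

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

class FormatOnce {

    /**
     * Collects minute timestamps for every app, de-duplicates them as Date
     * objects, and formats each distinct minute exactly once at the end.
     */
    static List<String> distinctMinuteLabels(long startMillis, int minutes, int apps) {
        Set<Date> dateSet = new HashSet<>();
        for (int app = 0; app < apps; app++) {
            for (int m = 0; m < minutes; m++) {
                dateSet.add(new Date(startMillis + m * 60_000L)); // no formatting here
            }
        }
        // Assumed pattern. SimpleDateFormat is not thread-safe, so it stays local.
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm");
        return dateSet.stream().sorted().map(fmt::format).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> labels = distinctMinuteLabels(0L, 1440, 140);
        System.out.println(labels.size()); // 1440
    }
}
```

With 140 apps and 1440 minutes, the inner loop still runs about 200,000 times, but the formatter is invoked only 1440 times.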
Code
    // Excerpt from the request handler; request and response come from the enclosing method.
    List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScoresRecord)) {
        return response;
    }

    Set<Date> dateSet = new HashSet<>();
    List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    // Key type assumed: appId as String.
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

    for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
        HeliosScore heliosScore = scores.get(0);
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(heliosScore.getNamespace());
        score.setAppId(heliosScore.getAppId());
        score.setScores(new ArrayList<>());
        response.getValues().add(score);

        List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
        Calendar indexDate = DateUtils.roundDownMinute(request.getStartTime().getTime());
        int index = 0;
        // Advance the record's start time until it reaches the requested start.
        while (indexDate.getTime().compareTo(heliosScore.getTimeFrom()) > 0) {
            heliosScore.getTimeFrom().setTime(heliosScore.getTimeFrom().getTime() + 60_000);
            index++;
        }
        // Walk forward one minute at a time without creating per-minute HeliosScore objects.
        while (indexDate.getTime().compareTo(request.getEndTime()) <= 0
                && indexDate.getTime().compareTo(heliosScore.getTimeTo()) <= 0
                && index < scoreIntList.size()) {
            Integer scoreInt = scoreIntList.get(index++);
            score.getScores().add(scoreInt);
            dateSet.add(indexDate.getTime());
            indexDate.add(Calendar.MINUTE, 1);
        }
    }
    // Dates are formatted once, after de-duplication.
    response.setDates(new ArrayList<>(dateSet).stream().sorted()
            .map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
    return response;

Arthas Trace
---ts=2021-08-17 14:44:11;thread_name=http-nio-8080-exec-10;id=ab;... [trace output showing ~50 ms improvement] ...
Analysis
The execution time improved by about 50 ms. The largest remaining cost is Date.compareTo in the loop conditions, for example indexDate.getTime().compareTo(request.getEndTime()) <= 0, and even simple getter calls add noticeable overhead.
Second Optimization
Optimization Direction
Replace Date objects with long timestamps for comparisons.
Replace repeated getTime()/setTime() with timestamp arithmetic, setting the Date only once.
Populate Set<Date> dateSet only while processing the first app, guarded by a flag, since every app covers the same minutes.
Pre-size the ArrayList that stores scores, using the size observed in the first iteration of the outer loop.
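The four points above can be sketched together on simplified inputs. This is an illustration under stated assumptions, not the Helios code: TimestampLoop, collect, and all parameter names are hypothetical, and the cursor here starts from the record's start time for clarity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TimestampLoop {

    static final long MINUTE_MILLIS = 60_000L;

    /**
     * Copies the scores that fall inside [startMillis, endMillis] into a
     * presized list. Dates are recorded only while fillDates is true.
     */
    static List<Integer> collect(List<Integer> perMinuteScores, long recordFromMillis,
                                 long startMillis, long endMillis,
                                 Set<Date> dateSet, boolean fillDates, int expectedSize) {
        List<Integer> out = new ArrayList<>(expectedSize);
        int index = 0;
        long cursor = recordFromMillis;
        // Skip minutes before the requested start using primitive comparisons.
        while (cursor < startMillis) {
            cursor += MINUTE_MILLIS;
            index++;
        }
        while (cursor <= endMillis && index < perMinuteScores.size()) {
            out.add(perMinuteScores.get(index++));
            if (fillDates) {
                dateSet.add(new Date(cursor)); // a Date is created only for the first app
            }
            cursor += MINUTE_MILLIS;
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(100, 90, 80, 70, 60);
        Set<Date> dates = new HashSet<>();
        // The record starts at minute 0; the request covers minutes 1 through 3.
        List<Integer> first = collect(scores, 0L, 60_000L, 180_000L, dates, true, 16);
        List<Integer> second = collect(scores, 0L, 60_000L, 180_000L, dates, false, first.size());
        System.out.println(first + " " + second + " " + dates.size());
        // prints "[90, 80, 70] [90, 80, 70] 3"
    }
}
```

The second call reuses the size of the first result as its initial capacity and skips date creation entirely, which mirrors the flag and pre-sizing ideas in the list above.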
Code
    // Excerpt from the request handler; request and response come from the enclosing method.
    List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScoresRecord)) {
        return response;
    }

    Set<Date> dateSet = new HashSet<>();
    boolean isDateSetInitial = false;
    int scoreSize = 16;
    List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    // Key type assumed: appId as String.
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

    for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
        HeliosScore heliosScore = scores.get(0);
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(heliosScore.getNamespace());
        score.setAppId(heliosScore.getAppId());
        score.setScores(new ArrayList<>(scoreSize));
        response.getValues().add(score);

        List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
        long indexDateMills = request.getStartTime().getTime();
        int index = 0;
        long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
        // Skip ahead with plain long arithmetic, then write the Date back once.
        while (indexDateMills > heliosScoreTimeFromMills) {
            heliosScoreTimeFromMills += 60_000;
            index++;
        }
        heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);

        long requestEndTimeMills = request.getEndTime().getTime();
        long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
        while (indexDateMills <= requestEndTimeMills
                && indexDateMills <= heliosScoreTimeToMills
                && index < scoreIntList.size()) {
            score.getScores().add(scoreIntList.get(index++));
            // Only the first app fills the date set.
            if (!isDateSetInitial) {
                dateSet.add(new Date(indexDateMills));
            }
            indexDateMills += 60_000;
        }
        isDateSetInitial = true;
        // Size hint for the next app's list, with 10% headroom.
        scoreSize = (int) (score.getScores().size() * 1.1);
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted()
            .map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
    return response;

Arthas Trace
---ts=2021-08-17 15:20:41;thread_name=http-nio-8080-exec-7;id=aa;... [trace output showing ~80 ms improvement] ...
Analysis
This step reduces execution time by about 80 ms, leaving roughly 160 ms. The remaining hot spots are getScores, list.size(), and list.get(index). Each does very little work, but each is still called once per minute per app, so the method-call overhead accumulates.
Third Optimization
Optimization Direction
Reduce list property accesses.
Replace repeated list.add calls with a single subList and addAll .
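As a minimal sketch of the bulk-copy idea, the helper below copies a contiguous range with one addAll over a subList view instead of calling add per element. The class and method names are hypothetical. The one library detail to keep in mind is that List.subList takes an exclusive end index and throws if the range is out of bounds, so the range is clamped first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BulkCopy {

    /** Copies scores[from, from + count) with one addAll, clamped to the list bounds. */
    static List<Integer> copyRange(List<Integer> scores, int from, int count) {
        int start = Math.min(Math.max(from, 0), scores.size());
        int end = Math.min(start + Math.max(count, 0), scores.size());
        List<Integer> out = new ArrayList<>(end - start);
        // subList is a view with an exclusive end index; addAll copies it in bulk.
        out.addAll(scores.subList(start, end));
        return out;
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(100, 90, 80, 70, 60);
        System.out.println(copyRange(scores, 1, 3));  // [90, 80, 70]
        System.out.println(copyRange(scores, 3, 10)); // [70, 60]
    }
}
```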
Code
    // Excerpt from the request handler; request and response come from the enclosing method.
    List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScoresRecord)) {
        return response;
    }

    Set<Date> dateSet = new HashSet<>();
    boolean isDateSetInitial = false;
    int scoreSize = 16;
    List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    // Key type assumed: appId as String.
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));

    for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
        HeliosScore heliosScore = scores.get(0);
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(heliosScore.getNamespace());
        score.setAppId(heliosScore.getAppId());
        score.setScores(new ArrayList<>(scoreSize));
        response.getValues().add(score);

        List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
        long indexDateMills = request.getStartTime().getTime();
        int index = 0;
        long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
        while (indexDateMills > heliosScoreTimeFromMills) {
            heliosScoreTimeFromMills += 60_000;
            index++;
        }
        heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);

        long requestEndTimeMills = request.getEndTime().getTime();
        long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
        int scoreIntListSize = scoreIntList.size();
        int indexStart = index;
        // index is advanced in the loop body, so it always equals the exclusive end of
        // the range to copy. The original condition used index++ < scoreIntListSize with
        // subList(indexStart, index - 1), which dropped the last score (or threw on an
        // empty range) whenever a time bound ended the loop first.
        while (indexDateMills <= requestEndTimeMills
                && indexDateMills <= heliosScoreTimeToMills
                && index < scoreIntListSize) {
            if (!isDateSetInitial) {
                dateSet.add(new Date(indexDateMills));
            }
            indexDateMills += 60_000;
            index++;
        }
        // One bulk copy instead of one add() per minute.
        if (index > indexStart) {
            score.getScores().addAll(scoreIntList.subList(indexStart, index));
        }
        isDateSetInitial = true;
        scoreSize = (int) (score.getScores().size() * 1.1);
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted()
            .map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
    return response;

Arthas Trace
---ts=2021-08-17 15:33:40;thread_name=http-nio-8080-exec-11;id=f1;... [trace output showing ~100 ms improvement] ...
Analysis
Execution time drops another ~100 ms, leaving about 60 ms. The remaining costly operations are database query, data merge, and splitting the score string into an int array.
Fourth Optimization
Optimization Direction
Fix SQL to avoid fetching an extra row, and skip merge logic when only a single record is returned.
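The merge shortcut can be sketched generically as below. Since the article does not show the changed code, everything here is an assumption: MergeShortcut and mergeIfNeeded are hypothetical names, and the merge step is passed in as a function standing in for HeliosDataMergeJob.mergeData.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

class MergeShortcut {

    /** Runs the (assumed expensive) merge only when there is something to merge. */
    static <T> List<T> mergeIfNeeded(List<T> records, UnaryOperator<List<T>> merge) {
        if (records.size() <= 1) {
            return records; // zero or one record: nothing to merge
        }
        return merge.apply(records);
    }

    public static void main(String[] args) {
        int[] mergeCalls = {0};
        UnaryOperator<List<String>> countingMerge = list -> {
            mergeCalls[0]++;
            return list;
        };
        mergeIfNeeded(Collections.singletonList("a"), countingMerge);
        mergeIfNeeded(Arrays.asList("a", "b"), countingMerge);
        System.out.println(mergeCalls[0]); // 1
    }
}
```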
Code
(SQL changes are not shown; the Java code remains the same but now processes fewer rows.)
Arthas Trace
---ts=2021-08-17 16:03:24;thread_name=http-nio-8080-exec-13;id=f1;... [trace output showing total latency ~25‑40 ms] ...
Analysis
The traced method now takes only 25‑40 ms for a full day's data, and the database query accounts for most of that time.
Result
The final end‑to‑end latency is around 60 ms. Apart from the database query, the remaining overhead comes mainly from converting score strings to integers and from minor date comparisons.
Conclusion
Minimize object creation.
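SimpleDateFormat is expensive; format each distinct value once rather than inside the inner loop.
Date.compareTo incurs noticeable cost in a hot loop; comparing long timestamps is cheaper.
Even trivial calls like list.size() and list.add() add up when executed hundreds of thousands of times.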
Effective profiling tools such as Arthas are essential to identify real bottlenecks; initial assumptions about object creation were misleading.