Performance Optimization of Helios Scoring Service Using Arthas Tracing

This article documents how the Helios scoring service, which processes hundreds of thousands of data points per day, was optimized step by step from occasional multi-second responses down to tens of milliseconds. Arthas trace data guided each change: refactoring loops, reducing object creation, and making date handling cheaper. In the end, database access is the remaining bottleneck.

Selected Java Interview Questions

Oct 24, 2023

Background

The Helios system processes a large volume of data; querying a full day's scores for all services returns 1440 minutes of scores for each application, resulting in hundreds of thousands of data points and occasional interface latency of several seconds.

This article records how to use Arthas to reduce the interface latency from hundreds of milliseconds to dozens of milliseconds.

From the trace, fetching a whole day's data takes about 300 ms on the network, while the database query itself is only 11 ms, indicating that most time is spent assembling the data in the application.

Optimization Process

The main focus is on tracing and refactoring the code rather than understanding the business logic.

Initial Unoptimized Version

Code

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
    HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    List<HeliosScore> heliosScores = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScores)) {
        return response;
    }
    Set<String> dateSet = new HashSet<>();
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    for (List<HeliosScore> value : groupByAppIdHeliosScores.values()) {
        value.sort(Comparator.comparing(HeliosScore::getTimeFrom));
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(value.get(0).getNamespace());
        score.setAppId(value.get(0).getAppId());
        for (HeliosScore heliosScore : value) {
            List<HeliosScore> splitHeliosScores = heliosScore.split();
            for (HeliosScore splitHeliosScore : splitHeliosScores) {
                if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) {
                    continue;
                }
                if (splitHeliosScore.getTimeFrom().compareTo(request.getEndTime()) > 0) {
                    break;
                }
                dateSet.add(DateUtils.yyyyMMddHHmm.formatDate(splitHeliosScore.getTimeFrom()));
                if (splitHeliosScore.getScores() == null) {
                    splitHeliosScore.setScores("100");
                    log.error("Missing data: {}", heliosScore);
                }
                score.add(Math.max(0, Integer.parseInt(splitHeliosScore.getScores())), null);
            }
        }
        response.getValues().add(score);
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted().collect(Collectors.toList()));
    return response;
}

Arthas Trace

---ts=2021-08-17 16:28:00;thread_name=http-nio-8080-exec-10;id=81;... [trace output showing method timings] ...

Analysis

The trace shows a total of about 4 seconds, but the actual end‑to‑end latency is around 350‑450 ms; the extra time comes from Arthas itself because the traced method contains many loops, which heavily impacts performance.
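Because instrumentation inflates the traced total, a plain wall-clock measurement taken without Arthas attached is a useful cross-check. The following is a minimal sketch of such a measurement; `WallClock` and `timeMillis` are hypothetical names for illustration, not part of Helios or Arthas.

```java
import java.util.function.Supplier;

final class WallClock {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static <T> long timeMillis(Supplier<T> task) {
        long start = System.nanoTime();
        task.get();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Stand-in workload; in practice this would wrap the call being measured.
        long elapsed = timeMillis(() -> {
            long sum = 0;
            for (int i = 0; i < 200_000; i++) {
                sum += i;
            }
            return sum;
        });
        System.out.println("elapsed ms: " + elapsed);
    }
}
```

A single run like this ignores JIT warm-up, so it is only a rough sanity check against the trace total, not a benchmark.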

The function contains three nested loops: the outer loop iterates over ~140 appIds, the middle loop over the already merged data (typically 1 entry per day), and the innermost loop over 1440 minutes.
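The size of the problem follows directly from those three loop bounds. As a quick illustration (the class and method names here are hypothetical), the innermost body runs roughly 200,000 times per request:

```java
final class LoopCount {
    /** Total executions of the innermost loop body for three nested loops. */
    static long iterations(int appIds, int mergedEntriesPerApp, int minutesPerEntry) {
        return (long) appIds * mergedEntriesPerApp * minutesPerEntry;
    }

    public static void main(String[] args) {
        // ~140 appIds, typically 1 merged entry per day, 1440 minutes per entry
        System.out.println(iterations(140, 1, 1440)); // prints 201600
    }
}
```

Any per-iteration cost, including the overhead Arthas adds to each traced call, is multiplied by that figure.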

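The most expensive operation in the trace is date formatting: DateUtils.yyyyMMddHHmm.formatDate(), which the article attributes to SimpleDateFormat, is called once per data point.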

First Optimization

Optimization Direction

Change the iteration strategy: instead of splitting the large merged object into many small objects and iterating over them minute by minute, iterate over the time points logically. This avoids creating hundreds of thousands of objects.

Replace Set<String> dateSet with Set<Date> dateSet to avoid repeated formatDate() calls.

Replace repeated Integer.parseInt calls with a pre‑built Map<String, Integer> dictionary (later tests showed Integer.parseInt was still fastest).
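The date-set change can be sketched in isolation. This is a simplified illustration and not the service code: it assumes timestamps arrive as epoch milliseconds, uses a TreeSet<Long> in place of the article's Set<Date> plus a later sort, and fixes the time zone to UTC only so the output is reproducible. The class name `DeferredFormat` is hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.TimeZone;
import java.util.TreeSet;

final class DeferredFormat {
    /** Collects raw minute timestamps, then formats each unique one exactly once. */
    static List<String> formatUnique(long[] minuteTimestamps) {
        TreeSet<Long> unique = new TreeSet<>();        // sorted and de-duplicated
        for (long ts : minuteTimestamps) {
            unique.add(ts);                            // no formatting inside the hot loop
        }
        SimpleDateFormat fmt = new SimpleDateFormat("yyyyMMddHHmm");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));  // assumption: UTC, for stable output
        List<String> labels = new ArrayList<>(unique.size());
        for (long ts : unique) {
            labels.add(fmt.format(new Date(ts)));
        }
        return labels;
    }

    public static void main(String[] args) {
        // Two apps reporting the same two minutes: four points, two unique labels.
        long[] points = {60_000L, 0L, 0L, 60_000L};
        System.out.println(formatUnique(points)); // [197001010000, 197001010001]
    }
}
```

With about 140 applications sharing the same 1440 minutes, formatting after de-duplication means on the order of 1440 format calls instead of one per data point.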

Code

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
    HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScoresRecord)) {
        return response;
    }
    Set<Date> dateSet = new HashSet<>();
    List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
        HeliosScore heliosScore = scores.get(0);
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(heliosScore.getNamespace());
        score.setAppId(heliosScore.getAppId());
        score.setScores(new ArrayList<>());
        response.getValues().add(score);
        List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
        Calendar indexDate = DateUtils.roundDownMinute(request.getStartTime().getTime());
        int index = 0;
        while (indexDate.getTime().compareTo(heliosScore.getTimeFrom()) > 0) {
            heliosScore.getTimeFrom().setTime(heliosScore.getTimeFrom().getTime() + 60_000);
            index++;
        }
        while (indexDate.getTime().compareTo(request.getEndTime()) <= 0
                && indexDate.getTime().compareTo(heliosScore.getTimeTo()) <= 0
                && index < scoreIntList.size()) {
            Integer scoreInt = scoreIntList.get(index++);
            score.getScores().add(scoreInt);
            dateSet.add(indexDate.getTime());
            indexDate.add(Calendar.MINUTE, 1);
        }
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted()
            .map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
    return response;
}

Arthas Trace

---ts=2021-08-17 14:44:11;thread_name=http-nio-8080-exec-10;id=ab;... [trace output showing ~50 ms improvement] ...

Analysis

The execution time improved by about 50 ms. The longest remaining cost is Date.compareTo, as in the conditional if (splitHeliosScore.getTimeFrom().compareTo(request.getStartTime()) < 0) from the initial version; in the refactored code the same kind of comparison sits in the while-loop conditions. Even simple getter calls add noticeable overhead.

Second Optimization

Optimization Direction

Replace Date objects with long timestamps for comparisons.

Replace repeated getTime()/setTime() with timestamp arithmetic, setting the Date only once.

Populate Set<Date> dateSet only once, during the first application's pass, by using a flag.

Pre‑allocate the size of the ArrayList that stores scores after the first loop.
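The timestamp and pre-allocation ideas can be shown on their own. The sketch below is a simplified illustration under stated assumptions, not the service code: `MinuteWindow` and `minutesBetween` are hypothetical names, and the window is assumed small enough that the minute count fits in an int.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

final class MinuteWindow {
    static final long MINUTE_MS = 60_000L;

    /** Lists the minute timestamps in [start, end], reading each Date only once. */
    static List<Long> minutesBetween(Date start, Date end) {
        long cursor = start.getTime();   // hoisted: no Date.compareTo inside the loop
        long endMs = end.getTime();
        if (cursor > endMs) {
            return new ArrayList<>();
        }
        int expected = (int) ((endMs - cursor) / MINUTE_MS) + 1;
        List<Long> minutes = new ArrayList<>(expected); // pre-sized, so no regrowth
        while (cursor <= endMs) {
            minutes.add(cursor);
            cursor += MINUTE_MS;         // plain arithmetic instead of setTime()
        }
        return minutes;
    }

    public static void main(String[] args) {
        List<Long> m = minutesBetween(new Date(0L), new Date(180_000L));
        System.out.println(m); // [0, 60000, 120000, 180000]
    }
}
```

The loop compares two primitive long values per iteration, which removes both the getter calls and the Date.compareTo calls that showed up in the trace.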

Code

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
    HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScoresRecord)) {
        return response;
    }
    Set<Date> dateSet = new HashSet<>();
    boolean isDateSetInitial = false;
    int scoreSize = 16;
    List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
        HeliosScore heliosScore = scores.get(0);
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(heliosScore.getNamespace());
        score.setAppId(heliosScore.getAppId());
        score.setScores(new ArrayList<>(scoreSize));
        response.getValues().add(score);
        List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
        long indexDateMills = request.getStartTime().getTime();
        int index = 0;
        long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
        while (indexDateMills > heliosScoreTimeFromMills) {
            heliosScoreTimeFromMills += 60_000;
            index++;
        }
        heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);
        long requestEndTimeMills = request.getEndTime().getTime();
        long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
        while (indexDateMills <= requestEndTimeMills
                && indexDateMills <= heliosScoreTimeToMills
                && index < scoreIntList.size()) {
            score.getScores().add(scoreIntList.get(index++));
            if (!isDateSetInitial) {
                dateSet.add(new Date(indexDateMills));
            }
            indexDateMills += 60_000;
        }
        isDateSetInitial = true;
        scoreSize = (int) (score.getScores().size() * 1.1);
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted()
            .map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
    return response;
}

Arthas Trace

---ts=2021-08-17 15:20:41;thread_name=http-nio-8080-exec-7;id=aa;... [trace output showing ~80 ms improvement] ...

Analysis

This step reduces execution time by about 80 ms, leaving roughly 160 ms. The remaining hot spots are getScores, list.size(), and list.get(index), which do little work individually but still incur method-call overhead on every iteration.

Third Optimization

Optimization Direction

Reduce list property accesses.

Replace repeated list.add calls with a single subList and addAll.
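A minimal sketch of the bulk-copy idea, separate from the service code; `BulkCopy` and `slice` are hypothetical names, and clamping the indices is an addition made here so the example cannot throw on out-of-range input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class BulkCopy {
    /** Copies scores[from, to) in one bulk operation; indices are clamped to the list bounds. */
    static List<Integer> slice(List<Integer> scores, int from, int to) {
        int size = scores.size();                 // read the size once
        int lo = Math.max(0, Math.min(from, size));
        int hi = Math.max(lo, Math.min(to, size));
        List<Integer> out = new ArrayList<>(hi - lo);
        out.addAll(scores.subList(lo, hi));       // one bulk copy instead of (hi - lo) add() calls
        return out;
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(100, 98, 97, 100, 95);
        System.out.println(slice(scores, 1, 4)); // [98, 97, 100]
    }
}
```

Note that the end index of subList is exclusive, so the loop that computes the range has to finish with an index one past the last element to copy.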

Code

private HeliosGetScoreResponse queryScores(HeliosGetScoreRequest request) {
    HeliosGetScoreResponse response = new HeliosGetScoreResponse();
    List<HeliosScore> heliosScoresRecord = heliosService.queryScoresTimeBetween(
            request.getStartTime(), request.getEndTime(), request.getFilterByAppId());
    if (CollectionUtils.isEmpty(heliosScoresRecord)) {
        return response;
    }
    Set<Date> dateSet = new HashSet<>();
    boolean isDateSetInitial = false;
    int scoreSize = 16;
    List<HeliosScore> heliosScores = HeliosDataMergeJob.mergeData(heliosScoresRecord);
    Map<String, List<HeliosScore>> groupByAppIdHeliosScores =
            heliosScores.stream().collect(Collectors.groupingBy(HeliosScore::getAppId));
    for (List<HeliosScore> scores : groupByAppIdHeliosScores.values()) {
        HeliosScore heliosScore = scores.get(0);
        HeliosGetScoreResponse.Score score = new HeliosGetScoreResponse.Score();
        score.setNamespace(heliosScore.getNamespace());
        score.setAppId(heliosScore.getAppId());
        score.setScores(new ArrayList<>(scoreSize));
        response.getValues().add(score);
        List<Integer> scoreIntList = HeliosHelper.splitScores(heliosScore);
        long indexDateMills = request.getStartTime().getTime();
        int index = 0;
        long heliosScoreTimeFromMills = heliosScore.getTimeFrom().getTime();
        while (indexDateMills > heliosScoreTimeFromMills) {
            heliosScoreTimeFromMills += 60_000;
            index++;
        }
        heliosScore.getTimeFrom().setTime(heliosScoreTimeFromMills);
        long requestEndTimeMills = request.getEndTime().getTime();
        long heliosScoreTimeToMills = heliosScore.getTimeTo().getTime();
        int scoreIntListSize = scoreIntList.size();
        int indexStart = index;
        // index advances only inside the loop body, so on exit it is the exclusive end
        // of the range to copy, whichever condition stopped the loop.
        while (indexDateMills <= requestEndTimeMills
                && indexDateMills <= heliosScoreTimeToMills
                && index < scoreIntListSize) {
            if (!isDateSetInitial) {
                dateSet.add(new Date(indexDateMills));
            }
            indexDateMills += 60_000;
            index++;
        }
        // Single bulk copy of [indexStart, index) instead of one add() per minute.
        score.getScores().addAll(scoreIntList.subList(indexStart, index));
        isDateSetInitial = true;
        scoreSize = (int) (score.getScores().size() * 1.1);
    }
    response.setDates(new ArrayList<>(dateSet).stream().sorted()
            .map(DateUtils.yyyyMMddHHmm::formatDate).collect(Collectors.toList()));
    return response;
}

Arthas Trace

---ts=2021-08-17 15:33:40;thread_name=http-nio-8080-exec-11;id=f1;... [trace output showing ~100 ms improvement] ...

Analysis

Execution time drops another ~100 ms, leaving about 60 ms. The remaining costly operations are database query, data merge, and splitting the score string into an int array.

Fourth Optimization

Optimization Direction

Fix SQL to avoid fetching an extra row, and skip merge logic when only a single record is returned.
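The merge shortcut amounts to a guard in front of the merge step. The sketch below shows the shape of such a guard with a stand-in merge function; `MergeGuard` and `mergeIfNeeded` are hypothetical names, and the real HeliosDataMergeJob.mergeData logic is not reproduced here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.UnaryOperator;

final class MergeGuard {
    /** Applies the merge step only when there is more than one record to combine. */
    static <T> List<T> mergeIfNeeded(List<T> records, UnaryOperator<List<T>> merge) {
        if (records.size() <= 1) {
            return records;          // nothing to merge: return the input untouched
        }
        return merge.apply(records);
    }

    public static void main(String[] args) {
        // Stand-in merge that collapses all records into one joined string.
        UnaryOperator<List<String>> join =
                rs -> Collections.singletonList(String.join("+", rs));
        System.out.println(mergeIfNeeded(Collections.singletonList("a"), join)); // [a]
        System.out.println(mergeIfNeeded(Arrays.asList("a", "b"), join));        // [a+b]
    }
}
```

Once the SQL returns exactly one row per application per day, the common path skips the merge entirely.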

Code

(SQL changes are not shown; the Java code remains the same but now processes fewer rows.)

Arthas Trace

---ts=2021-08-17 16:03:24;thread_name=http-nio-8080-exec-13;id=f1;... [trace output showing total latency ~25‑40 ms] ...

Analysis

The whole call now takes only 25‑40 ms for a full day's data in the trace, and the database query dominates what remains.

Result

The final end‑to‑end latency is around 60 ms, with the remaining overhead mainly from converting score strings to int[] and minor date comparisons.
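The article does not show how the score string is stored or how HeliosHelper.splitScores parses it. Purely as an illustration, and assuming a comma-separated string of non-negative integers such as "100,98,97", a single-pass conversion that avoids intermediate substrings could look like the sketch below. `ScoreParser` is a hypothetical name, and whether this beats the existing conversion would have to be measured.

```java
import java.util.Arrays;

final class ScoreParser {
    /** Parses a comma-separated list of non-negative integers, e.g. "100,98,97". */
    static int[] parse(String csv) {
        if (csv.isEmpty()) {
            return new int[0];
        }
        int count = 1;                       // n commas separate n + 1 values
        for (int i = 0; i < csv.length(); i++) {
            if (csv.charAt(i) == ',') {
                count++;
            }
        }
        int[] out = new int[count];
        int idx = 0;
        int value = 0;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == ',') {
                out[idx++] = value;          // close the current number
                value = 0;
            } else if (c >= '0' && c <= '9') {
                value = value * 10 + (c - '0');
            } else {
                throw new NumberFormatException(
                        "unexpected character '" + c + "' at index " + i);
            }
        }
        out[idx] = value;                    // last number has no trailing comma
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("100,98,97"))); // [100, 98, 97]
    }
}
```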

Conclusion

Minimize object creation. SimpleDateFormat is expensive. Date.compareTo incurs noticeable cost.

Even trivial calls like list.size() and list.add() add up when executed millions of times.

Effective profiling tools such as Arthas are essential to identify real bottlenecks; initial assumptions about object creation were misleading.
