How We Cut a 30‑Second API Call to Under 1 Second on 2 Million Records

In a high‑concurrency transaction system, an API call took about 30 seconds because it loaded more than two million rows and built a Java Map for each one. The author applied SQL aggregation, moved the counting logic into PostgreSQL, and added a Caffeine cache, reducing the response time to under 0.8 seconds. The article also notes the limits of relational databases for massive analytical data.

IoT Full-Stack Technology

Problem Diagnosis

Initially the interface took about 30 seconds to respond. Network and server hardware were ruled out, and timing logs narrowed the bottleneck to the data‑access call that loaded more than 2 million rows.

The raw MyBatis query was:

List<Map<String, Object>> list = transhandleFlowMapper.selectDataTransHandleFlowAdd(selectSql);

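Running the same SQL directly against the database took only ~800 ms, so the query execution itself was not the main culprit. The rest of the time went to transferring the result set and mapping it in the application.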
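The kind of timing log used to separate query time from mapping time can be sketched as follows. This is a minimal illustration, not the author's code: StageTimer, Timed, and the label strings are hypothetical names introduced here.

```java
import java.util.function.Supplier;

// Hypothetical helper (not from the original article): times one stage of a
// request so that the SQL call and the Java-side processing can be compared.
final class StageTimer {
    private StageTimer() {}

    /** Holds a stage's result together with its elapsed wall-clock time. */
    static final class Timed<T> {
        final T value;
        final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the stage once, prints how long it took, and returns both. */
    static <T> Timed<T> time(String label, Supplier<T> stage) {
        long start = System.nanoTime();
        T value = stage.get();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
        System.out.println(label + " took " + elapsedMillis + " ms");
        return new Timed<>(value, elapsedMillis);
    }
}
```

Wrapping the mapper call and the post-processing in separate time(...) calls, and comparing both with the time the raw SQL takes in a database client, shows whether the cost sits in the database or in the application.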

SQL Layer Analysis

Using EXPLAIN ANALYZE on the query revealed that fetching the programhandleidlist column was the most time‑consuming step.

Code Layer Analysis

In Java each row was turned into a Map, creating over two million Map objects, which dramatically slowed processing.

Optimization Measures

1. SQL Optimization

The goal was to collapse the 2 million rows into a single result using PostgreSQL's array_agg and unnest functions:

SELECT array_agg(elem) AS concatenated_array
FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
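    -- Half-open range: BETWEEN is inclusive, so a row at exactly 2024-01-09 00:00:00 would be counted on both days
    WHERE create_time >= '2024-01-08 00:00:00' AND create_time < '2024-01-09 00:00:00'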
) sub;

This aggregates the array elements into one row, reducing the amount of data transferred to the application.
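For readers less familiar with unnest and array_agg, the query's effect can be modelled in memory. This is only a sketch of the semantics, assuming the array elements are strings (the article does not give the column's element type). UnnestAggModel is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// In-memory model of:
//   SELECT array_agg(elem) FROM (SELECT unnest(programhandleidlist) AS elem ...) sub
// Assumption: each row's array holds String ids.
final class UnnestAggModel {
    private UnnestAggModel() {}

    static List<String> flatten(List<String[]> rows) {
        return rows.stream()
                .filter(Objects::nonNull)        // unnest(NULL) produces no rows
                .flatMap(Arrays::stream)         // unnest: one output item per array element
                .collect(Collectors.toList());   // array_agg: gather everything into one list
    }
}
```

In PostgreSQL the order of array_agg output is unspecified without an ORDER BY, whereas this model keeps encounter order. The point is the shape: many rows of arrays go in, one flat collection comes out.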

2. Move Business Logic into the Database

To count occurrences of each ID, the query was rewritten to unnest the array and group by the element:

SELECT elem, COUNT(*) AS count
FROM (
    SELECT unnest(programhandleidlist) AS elem
    FROM anti_transhandle
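    -- Half-open range: BETWEEN is inclusive, so a row at exactly 2024-01-09 00:00:00 would be counted on both days
    WHERE create_time >= '2024-01-08 00:00:00' AND create_time < '2024-01-09 00:00:00'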
) sub
GROUP BY elem;

This let PostgreSQL perform the counting, cutting the Java‑side processing time.
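The GROUP BY query replaces counting work that would otherwise run in the application. The sketch below shows roughly what that Java-side counting looks like, again assuming String ids with no null elements. UnnestCountModel is a hypothetical name, not the author's original code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;
import java.util.stream.Collectors;

// In-memory model of:
//   SELECT elem, COUNT(*) FROM (SELECT unnest(programhandleidlist) AS elem ...) sub GROUP BY elem
// Assumption: arrays contain non-null String ids (groupingBy rejects null keys).
final class UnnestCountModel {
    private UnnestCountModel() {}

    static Map<String, Long> countElements(List<String[]> rows) {
        return rows.stream()
                .filter(Objects::nonNull)   // rows with a NULL array contribute nothing
                .flatMap(Arrays::stream)    // unnest
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }
}
```

Doing this in Java requires every row to cross the network and be materialized first. With the SQL version, only one row per distinct id is returned.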

3. Introduce Caching (Caffeine)

To avoid repeated database hits for historical dates, a local Caffeine cache was added. The configuration caps the cache at 500 entries, with size‑based eviction (Caffeine's policy is Window TinyLFU rather than plain LRU), and expires each entry 60 minutes after it is written.

<dependency>
    <groupId>com.github.ben-manes.caffeine</groupId>
    <artifactId>caffeine</artifactId>
    <version>3.1.8</version>
</dependency>

import com.github.benmanes.caffeine.cache.Caffeine;
import org.springframework.cache.CacheManager;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.cache.caffeine.CaffeineCacheManager;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.util.concurrent.TimeUnit;

@Configuration
@EnableCaching
public class CacheConfig {
    @Bean
    public CacheManager cacheManager() {
        CaffeineCacheManager cacheManager = new CaffeineCacheManager();
        cacheManager.setCaffeine(Caffeine.newBuilder()
            .maximumSize(500)
            .expireAfterWrite(60, TimeUnit.MINUTES));
        return cacheManager;
    }
}
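To make the two settings concrete, here is a small standard-library stand-in that mimics a size bound plus expire-after-write. It is not how Caffeine is implemented: this sketch evicts the oldest-written entry, while Caffeine uses Window TinyLFU and does far more. BoundedTtlCache, Stamped, and the injectable clock are hypothetical names for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Illustrative stand-in for maximumSize + expireAfterWrite semantics.
// Not Caffeine: eviction here simply drops the oldest-written entry.
final class BoundedTtlCache<K, V> {

    /** A cached value together with the time it was written. */
    private static final class Stamped<V> {
        final V value;
        final long writtenAtNanos;

        Stamped(V value, long writtenAtNanos) {
            this.value = value;
            this.writtenAtNanos = writtenAtNanos;
        }
    }

    private final long ttlNanos;
    private final LongSupplier nanoClock;
    private final Map<K, Stamped<V>> entries;

    BoundedTtlCache(int maximumSize, long ttlNanos, LongSupplier nanoClock) {
        this.ttlNanos = ttlNanos;
        this.nanoClock = nanoClock;
        // Insertion-ordered map that removes its eldest entry once the bound is exceeded.
        this.entries = new LinkedHashMap<K, Stamped<V>>(16, 0.75f, false) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Stamped<V>> eldest) {
                return size() > maximumSize;
            }
        };
    }

    synchronized void put(K key, V value) {
        entries.remove(key); // re-insert so a rewrite also refreshes the entry's position
        entries.put(key, new Stamped<>(value, nanoClock.getAsLong()));
    }

    /** Returns the value, or null if it is absent or older than the TTL. */
    synchronized V getIfPresent(K key) {
        Stamped<V> stamped = entries.get(key);
        if (stamped == null) {
            return null;
        }
        if (nanoClock.getAsLong() - stamped.writtenAtNanos >= ttlNanos) {
            entries.remove(key); // expired: measured from the write, not the last read
            return null;
        }
        return stamped.value;
    }
}
```

Passing System::nanoTime as the clock gives real-time behavior. Injecting the clock is a design choice that makes expiry testable without sleeping.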

The service method now checks the cache before querying the database for yesterday's hit rate:

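@Autowired
private CacheManager cacheManager;

private static final String YESTERDAY_HIT_RATE_CACHE = "hitRateCache";

@Override
public RuleHitRateResponse ruleHitRate(LocalDate currentDate) {
    LocalDate yesterday = currentDate.minusDays(1);
    // Cache.get(key, loader) runs the loader only on a cache miss and stores its result.
    double hitRate = cacheManager.getCache(YESTERDAY_HIT_RATE_CACHE)
        .get(yesterday, () -> {
            Map<String, String> hitRateList = dataTunnelClient.selectTransHandleFlowByTime(yesterday);
            // further processing
            return computeHitRate(hitRateList);
        });
    // Assumption: RuleHitRateResponse can be constructed from the hit rate.
    // The original snippet returned the raw double, which does not match the
    // declared return type; adapt this line to the real response class.
    return new RuleHitRateResponse(hitRate);
}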
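The load-on-miss behavior the service relies on can be demonstrated without Spring or Caffeine. The sketch below uses ConcurrentHashMap.computeIfAbsent as a stand-in, so it has no eviction or expiry. HitRateCacheSketch, the injected loader, and loadCount are hypothetical names, with the loader standing in for the database call.

```java
import java.time.LocalDate;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Stand-in for the cache-then-load pattern; no eviction or TTL, unlike Caffeine.
final class HitRateCacheSketch {
    private final ConcurrentMap<LocalDate, Double> cache = new ConcurrentHashMap<>();
    private final Function<LocalDate, Double> loader; // stands in for the database query
    private final AtomicInteger loads = new AtomicInteger();

    HitRateCacheSketch(Function<LocalDate, Double> loader) {
        this.loader = loader;
    }

    /** Returns yesterday's hit rate, invoking the loader only on a cache miss. */
    double yesterdayHitRate(LocalDate currentDate) {
        LocalDate yesterday = currentDate.minusDays(1);
        return cache.computeIfAbsent(yesterday, day -> {
            loads.incrementAndGet();
            return loader.apply(day);
        });
    }

    /** Number of times the loader actually ran. */
    int loadCount() {
        return loads.get();
    }
}
```

Keying on the date works well here because a finished day's data no longer changes, so a cached value for yesterday stays valid until it expires or is evicted.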

Result and Takeaways

After applying the SQL aggregation, moving counting logic into PostgreSQL, and adding Caffeine caching, the interface response time dropped from 30 seconds to under 0.8 seconds. The author notes that relational databases struggle with massive analytical workloads and suggests column‑oriented stores like ClickHouse or Hive for true millisecond‑level queries.

Relational databases excel at transactional workloads with strong consistency.

Columnar databases are better suited for large‑scale analytics and read‑heavy scenarios.

