Why Is MySQL count(*) So Slow and How to Supercharge It
This article explains why MySQL count(*) queries can become a performance bottleneck on InnoDB tables and presents practical optimization techniques such as Redis caching, secondary caches, multithreading, reducing joins, and even switching to ClickHouse for massive datasets.
Preface
Recently I optimized several slow‑query interfaces at work and gathered some insights to share, hoping they will be helpful.
We use
Mysql8with the
Innodbstorage engine. Besides index optimization, the main focus was improving the performance of
count(*).
Typically a paginated API runs two queries: one to fetch the data rows and another to get the total row count, then combines the results.
Example SQL for fetching data (no performance issue):
<code>select id, name from user limit 1,20;</code>But the
count(*)query often suffers from poor performance:
<code>select count(*) from user;</code>Why does this happen?
1 Why is count(*) slow?
In MySQL,
count(*)counts the total rows in a table, and its performance depends heavily on the storage engine.
MyISAM stores the total row count on disk, so
count(*)can return the value instantly. InnoDB, however, supports transactions and MVCC, which means the row count can vary between concurrent transactions. InnoDB must scan rows and accumulate the count, leading to low efficiency, especially on large tables.
2 How to optimize count(*) performance?
We can improve it from several angles.
2.1 Add Redis cache
For simple counters such as total page views, cache the count in Redis instead of querying the database each time.
When a user opens a page, increment the cached count; the first visit sets the count to 1, subsequent visits increment it, and the value is read from Redis for display.
This avoids real‑time
count(*)queries and greatly boosts performance, though it may introduce temporary inconsistency under high concurrency.
2.2 Use a secondary cache
When data changes rarely but many count queries occur with various filter conditions, a secondary in‑memory cache (e.g., Caffeine or Guava) can store query results.
In SpringBoot, annotate the method with
@Cacheableand provide a custom
CacheKeyGeneratorto build a cache key from the query parameters.
<code>@Cacheable(value = "brand", keyGenerator = "cacheKeyGenerator")
public BrandModel getBrand(Condition condition) {
return getBrandByCondition(condition);
}
</code> <code>public class CacheKeyGenerator implements KeyGenerator {
@Override
public Object generate(Object target, Method method, Object... params) {
return target.getClass().getSimpleName() + "_" + method.getName() + "," +
StringUtils.arrayToDelimitedString(params, ",");
}
}
</code>The cached result expires after a short period (e.g., 5 minutes), reducing the need for repeated
count(*)scans.
2.3 Execute queries in parallel
For scenarios like counting orders by status, run two
count(*)queries concurrently using
CompletableFutureand combine the results, which speeds up the overall response.
<code>select count(*) from order where status = 1;
select count(*) from order where status = 0;
</code>2.4 Reduce unnecessary joins
If the required filter columns already exist in the main table, avoid joining additional tables when counting.
<code>select count(*) from product where name='test' and unit_id=123 and brand_id=124 and category_id=125;
</code>Removing extra joins can significantly improve
count(*)performance.
2.5 Switch to ClickHouse for massive data
When joins cannot be eliminated and the dataset is huge, consider replicating the data into ClickHouse, a column‑store database optimized for fast aggregations.
Use Canal to capture MySQL binlog changes and insert them into ClickHouse. Queries against ClickHouse can return counts orders of magnitude faster.
Note: When using ClickHouse, batch inserts are recommended to avoid excessive write overhead.
If query conditions are extremely diverse, ClickHouse may not be ideal; an alternative is Elasticsearch, though it also suffers from deep‑pagination issues.
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.