Databases 37 min read

Understanding and Optimizing PostgreSQL VACUUM and AUTOVACUUM

This article explains the purpose and mechanics of PostgreSQL's VACUUM and AUTOVACUUM processes, demonstrates practical examples of table bloat, transaction ID handling, and provides step‑by‑step guidance for precise triggering and performance optimization of vacuum operations.

Qunar Tech Salon
Qunar Tech Salon
Qunar Tech Salon
Understanding and Optimizing PostgreSQL VACUUM and AUTOVACUUM

1. Introduction

VACUUM is a core mechanism of PostgreSQL's MVCC implementation that ensures the database runs correctly. The article demonstrates why VACUUM is needed, how to trigger AUTOVACUUM precisely, and how to tune related parameters.

2. Test Environment Preparation

All examples are run on PostgreSQL 11. The following objects are created:

postgres=# CREATE USER alvin WITH PASSWORD 'alvin';
CREATE ROLE
postgres=# CREATE DATABASE alvindb OWNER alvin;
CREATE DATABASE
postgres=# \c alvindb
You are now connected to database "alvindb" as user "postgres".
alvindb=# CREATE SCHEMA alvin AUTHORIZATION alvin;
CREATE SCHEMA
alvindb=# \c alvindb alvin
You are now connected to database "alvindb" as user "alvin".
alvindb=> SHOW search_path;
"$user", public

3. Why PostgreSQL Needs VACUUM

PostgreSQL uses MVCC, storing old row versions instead of physically deleting rows. Deleting a row only sets flag fields t_xmax , t_infomask , and t_infomask2 . This design avoids costly I/O but leaves dead tuples that can cause bloat.

Examples:

alvindb=> CREATE TABLE tb_test_vacuum (
    test_id BIGSERIAL PRIMARY KEY,
    test_num BIGINT
);
CREATE TABLE
alvindb=> INSERT INTO tb_test_vacuum(test_num) SELECT gid FROM generate_series(1,3,1) gid;
INSERT 0 3
alvindb=> DELETE FROM tb_test_vacuum WHERE test_id = 2;
DELETE 1
alvindb=> SELECT * FROM tb_test_vacuum ORDER BY 1 DESC LIMIT 5;
 test_id | test_num 
---------+----------
       1 |        1

Inspecting the underlying page with the pageinspect extension shows that the row is still present but marked as dead.

4. VACUUM Operation

Running VACUUM VERBOSE tb_test_vacuum; removes dead tuples and also cleans up index bloat.

alvindb=> VACUUM VERBOSE tb_test_vacuum;
INFO:  vacuuming "alvin.tb_test_vacuum"
INFO:  scanned index "tb_test_vacuum_pkey" to remove 2 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  "tb_test_vacuum": removed 2 row versions in 1 pages
...

After VACUUM, the freed space can be reused by new rows, though it is not returned to the OS unless VACUUM FULL is used.

5. AUTOVACUUM Overview

PostgreSQL runs an autovacuum launcher process (default autovacuum = on ) that periodically spawns autovacuum workers . The launcher sleeps for autovacuum_naptime (default 1 min) between checks.

Workers decide whether to VACUUM or ANALYZE based on thresholds:

vacuum threshold = vacuum_base_threshold + vacuum_scale_factor * number_of_tuples
analyze threshold = analyze_base_threshold + analyze_scale_factor * number_of_tuples

Relevant statistics are stored in pg_stat_user_tables ( n_dead_tup , n_mod_since_analyze ) and pg_class.reltuples .

6. Precise Triggering of AUTOVACUUM

By adjusting configuration (e.g., autovacuum_naptime = 5s , lower thresholds) and using a helper query, the exact number of rows needed to trigger AUTOVACUUM can be calculated:

SELECT ... autovacuum_vacuum_scale_factor, autovacuum_vacuum_threshold,
       floor(autovacuum_vacuum_scale_factor * reltuples) + 1 + autovacuum_vacuum_threshold AS autovacuum_vacuum_trigger,
       ...

Running inserts, updates, or deletes until the calculated rows_to_delete_before_auto_vacuum reaches zero causes an automatic vacuum. The same logic applies to auto_analyze .

7. Table‑Level AUTOVACUUM Settings

Parameters can be set per table to fine‑tune behavior, for example:

ALTER TABLE pgbench_accounts SET (autovacuum_vacuum_scale_factor = 0.1, autovacuum_vacuum_threshold = 2000);
ALTER TABLE pgbench_accounts SET (autovacuum_analyze_scale_factor = 0.05, autovacuum_analyze_threshold = 2000);

This allows large tables to vacuum more aggressively while keeping small tables at default settings.

8. VACUUM Parameter Optimization

Key parameters affecting performance:

maintenance_work_mem (manual VACUUM) / autovacuum_work_mem (AUTOVACUUM)

vacuum_cost_delay and vacuum_cost_limit (or their autovacuum equivalents)

Increasing memory and cost limits while reducing delay dramatically speeds up vacuuming, as shown by benchmark results (e.g., 304 s → 18 s for AUTOVACUUM on a 30 M‑row table).

9. Manual VACUUM ANALYZE

Manual runs can be tuned per session:

SET vacuum_cost_delay = 0;
VACUUM ANALYZE pgbench_accounts;

Increasing maintenance_work_mem from 64 MB to 128 MB reduced VACUUM time from ~23 s to ~18 s.

10. Preventing Transaction‑ID Wraparound

When autovacuum_freeze_max_age is reached, PostgreSQL forces an aggressive vacuum even if autovacuum is disabled. An idle transaction can block this process, so monitoring backend_xmin and terminating idle‑in‑transaction sessions is essential.

autovacuum_freeze_max_age = 100000  -- for testing
BEGIN;  -- creates idle transaction
SELECT txid_current();
-- later, when table age exceeds the limit, PostgreSQL logs:
LOG:  automatic aggressive vacuum of table "..." (to prevent wraparound)

11. Conclusion

The article combines theory and hands‑on experiments to explain why VACUUM is necessary, how MVCC works in PostgreSQL, and how to tune VACUUM, AUTOVACUUM, and related parameters for optimal performance while avoiding wraparound issues.

SQLPostgreSQLDatabase TuningAUTOVACUUMVACUUM
Qunar Tech Salon
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.