TL;DR — Quick Summary

Master PostgreSQL VACUUM and autovacuum tuning to eliminate table bloat, prevent transaction ID wraparound, and maximize database performance.

PostgreSQL VACUUM is one of the most misunderstood yet critical maintenance operations in any PostgreSQL deployment. Without proper tuning, tables silently accumulate dead tuples that bloat storage, degrade query performance, and — in the worst case — trigger a catastrophic transaction ID wraparound that brings the entire database offline. This guide covers everything you need to configure autovacuum correctly and keep production databases healthy.

Prerequisites

  • PostgreSQL 13 or later (most examples apply from PostgreSQL 10+)
  • psql or a SQL client with DBA-level access
  • Ability to edit postgresql.conf or use ALTER SYSTEM
  • Basic familiarity with PostgreSQL storage concepts

Why VACUUM Exists: MVCC and Dead Tuples

PostgreSQL uses Multi-Version Concurrency Control (MVCC) to handle concurrent reads and writes without locking. Every UPDATE or DELETE does not modify data in place — instead, it marks the old row version as dead and writes a new version. This makes reads non-blocking, but it leaves behind dead tuples: row versions that are no longer visible to any transaction but still occupy space on disk.

Over time, dead tuples cause three serious problems:

  1. Table bloat: The physical table file grows even if logical row count stays constant. A table with 10 million live rows might occupy the space of 50 million rows on disk.
  2. Index bloat: B-tree indexes accumulate pointers to dead tuples, increasing their size and scan cost.
  3. Transaction ID wraparound: PostgreSQL uses 32-bit transaction IDs. After roughly 2 billion transactions, IDs wrap around. PostgreSQL forces a database-wide emergency stop to prevent data corruption if VACUUM cannot freeze old transactions in time.

VACUUM solves all three by scanning table pages, marking dead tuples as reusable space, and advancing the freeze horizon for old transactions.

VACUUM Types Compared

CommandLockSpace ReclaimUse Case
VACUUMNone (shares with reads/writes)Marks space for reuse, does not shrink fileRoutine maintenance, autovacuum
VACUUM ANALYZENoneSame + updates planner statisticsAfter bulk inserts/deletes
VACUUM FULLExclusive (table locked)Rewrites table, returns space to OSOne-time bloat recovery
VACUUM FREEZENoneForces freeze of all old XIDsWraparound prevention

Use VACUUM FULL only as a one-time recovery step on severely bloated tables — it holds an exclusive lock that blocks all reads and writes for the duration. For online table rewrites, use pg_repack instead.

Autovacuum Daemon: Core Parameters

Autovacuum runs as a background daemon that launches worker processes to vacuum tables automatically. The key parameters in postgresql.conf:

# How often the launcher wakes up to check for work
autovacuum_naptime = 1min

# Maximum concurrent autovacuum workers
autovacuum_max_workers = 3

# Trigger vacuum when dead tuples > threshold + scale_factor * n_live_tup
autovacuum_vacuum_threshold = 50
autovacuum_vacuum_scale_factor = 0.2       # 20% of table

# Trigger ANALYZE when changed tuples exceed this
autovacuum_analyze_threshold = 50
autovacuum_analyze_scale_factor = 0.1      # 10% of table

# Freeze rows this many transactions old
autovacuum_freeze_max_age = 200000000

The critical tuning point is autovacuum_vacuum_scale_factor. The default of 0.2 means autovacuum triggers when 20% of a table’s rows are dead. On a 100-row table that is fine. On a 50-million-row table, autovacuum will not trigger until 10 million dead tuples have accumulated — a severe bloat problem.

Recommended global values for high-write systems:

autovacuum_vacuum_scale_factor = 0.05
autovacuum_analyze_scale_factor = 0.02
autovacuum_max_workers = 5
autovacuum_naptime = 30s

Per-Table Autovacuum Overrides

The most powerful tuning technique is setting autovacuum parameters per table, overriding global defaults only where needed:

-- For a very high-write table (e.g., an events/audit log)
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 500,
    autovacuum_analyze_scale_factor = 0.005,
    autovacuum_vacuum_cost_delay = 2
);

-- For a nearly-static lookup table (vacuum less aggressively)
ALTER TABLE country_codes SET (
    autovacuum_vacuum_scale_factor = 0.5,
    autovacuum_vacuum_threshold = 1000
);

Per-table settings are stored in pg_class.reloptions and survive restarts. View them with:

SELECT relname, reloptions
FROM pg_class
WHERE reloptions IS NOT NULL;

Monitoring Table Bloat

Using pg_stat_user_tables

-- Top 20 tables by dead tuple count
SELECT
    schemaname,
    relname AS table_name,
    n_live_tup,
    n_dead_tup,
    round(n_dead_tup::numeric / NULLIF(n_live_tup + n_dead_tup, 0) * 100, 1) AS dead_pct,
    last_vacuum,
    last_autovacuum,
    last_analyze,
    last_autoanalyze
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 20;

Detecting Tables Approaching Wraparound

-- Tables with the oldest unfrozen transaction IDs
SELECT
    schemaname,
    relname,
    age(relfrozenxid) AS xid_age,
    pg_size_pretty(pg_total_relation_size(oid)) AS table_size
FROM pg_class
JOIN pg_namespace ON pg_namespace.oid = relnamespace
WHERE relkind = 'r'
ORDER BY age(relfrozenxid) DESC
LIMIT 20;

Any table with xid_age above 1.5 billion needs immediate attention. PostgreSQL will force autovacuum on any table above autovacuum_freeze_max_age (default 200 million), but that safety net can fail if autovacuum is being blocked by long-running transactions.

Monitoring Active Vacuum Progress

SELECT
    p.pid,
    p.relid::regclass AS table_name,
    p.phase,
    p.heap_blks_total,
    p.heap_blks_scanned,
    p.heap_blks_vacuumed,
    p.num_dead_tuples,
    a.query_start
FROM pg_stat_progress_vacuum p
JOIN pg_stat_activity a ON a.pid = p.pid;

Measuring Actual Bloat with pgstattuple

The pgstattuple extension gives exact bloat measurements:

CREATE EXTENSION IF NOT EXISTS pgstattuple;

SELECT
    table_len,
    tuple_count,
    tuple_len,
    dead_tuple_count,
    dead_tuple_len,
    round(dead_tuple_percent, 1) AS dead_pct,
    free_space,
    round(free_percent, 1) AS free_pct
FROM pgstattuple('public.orders');

A dead_tuple_percent above 10–15% on a frequently queried table is a signal to investigate why autovacuum is not keeping up.

Transaction ID Wraparound Prevention

Wraparound is PostgreSQL’s most dangerous failure mode. The sequence of events:

  1. At autovacuum_freeze_max_age (200M XID), autovacuum starts aggressive freezing on the table.
  2. At vacuum_freeze_min_age (50M XID) from the oldest unfrozen XID, VACUUM starts freezing rows.
  3. At 1 billion XIDs before wraparound, PostgreSQL starts logging warnings in the server log.
  4. At 10 million XIDs remaining, PostgreSQL stops accepting new transactions and requires VACUUM FREEZE to recover.

Monitor database-level age:

SELECT
    datname,
    age(datfrozenxid) AS db_xid_age,
    pg_size_pretty(pg_database_size(datname)) AS db_size
FROM pg_database
ORDER BY age(datfrozenxid) DESC;

Emergency recovery if a database is near wraparound:

# Must run as superuser, expect it to take hours on large databases
vacuumdb --all --freeze --verbose --analyze

VACUUM FULL vs pg_repack

When a table is severely bloated, VACUUM FULL is the blunt instrument:

-- Blocks all reads and writes until complete
VACUUM FULL VERBOSE orders;

For production systems that cannot tolerate downtime, use pg_repack:

# Install
sudo apt install postgresql-16-repack

# Repack a single table online (no exclusive lock on the live table)
pg_repack -h localhost -U postgres -d mydb -t orders

# Repack only the indexes of a table
pg_repack -h localhost -U postgres -d mydb --only-indexes -t orders

pg_repack builds a new copy of the table in the background, keeps it in sync via triggers, then swaps it in with a brief lock at the end — typically milliseconds.

Index Bloat and REINDEX

Indexes accumulate bloat independently of the table. Check index bloat:

SELECT
    indexrelname AS index_name,
    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC
LIMIT 20;

Rebuild a bloated index without locking reads or writes:

-- Available from PostgreSQL 12+
REINDEX INDEX CONCURRENTLY idx_orders_customer_id;

-- Reindex all indexes on a table
REINDEX TABLE CONCURRENTLY orders;

Autovacuum Cost Throttling and I/O Tuning

Autovacuum deliberately throttles itself to reduce I/O impact using a cost-based delay system:

# Cost units per operation
vacuum_cost_page_hit = 1       # Page found in shared_buffers
vacuum_cost_page_miss = 10     # Page read from disk
vacuum_cost_page_dirty = 20    # Page dirtied by vacuum

# Autovacuum-specific throttle (overrides vacuum_cost_delay for autovacuum)
autovacuum_vacuum_cost_delay = 2ms   # Sleep when cost_limit is reached
autovacuum_vacuum_cost_limit = 200   # Accumulated cost before sleeping

On modern NVMe storage, autovacuum is often throttled too aggressively. Increase autovacuum_vacuum_cost_limit to 400–800 and reduce autovacuum_vacuum_cost_delay to 2ms to let it run faster without saturating I/O on spinning disks.

Common Issues and Troubleshooting

Autovacuum killed by long-running transactions: Autovacuum cannot remove dead tuples that are still visible to any open transaction. Check for stale transactions:

SELECT pid, now() - xact_start AS duration, query, state
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY duration DESC
LIMIT 10;

If you see transactions open for hours or days, terminate them with pg_terminate_backend(pid) after confirming with the application team.

“Autovacuum: preventing wraparound” in logs: This means autovacuum has switched to emergency mode, running with maximum aggression ignoring cost throttle. Do not block or kill these vacuum processes. If they fail to complete, a manual VACUUMDB --freeze is needed.

VACUUM runs forever on large tables: Break the work into smaller operations by running VACUUM on individual partitions, or schedule maintenance windows for VACUUM FULL on tables that cannot be partitioned.

Database Size Tuning Reference

Database Sizeautovacuum_max_workersautovacuum_vacuum_scale_factorautovacuum_vacuum_cost_limit
< 10 GB3 (default)0.05200 (default)
10–100 GB4–50.02400
100 GB – 1 TB6–80.01600
> 1 TB8–10 + per-table0.005 per hot table800

Cross-Database Comparison

FeaturePostgreSQL VACUUMMySQL OPTIMIZE TABLESQL Server Index RebuildOracle Segment Shrink
Locks requiredNone (regular), Exclusive (FULL)ExclusiveOnline (Enterprise)Row exclusive
Online rewrite toolpg_repackpt-online-schema-changeOnline rebuildSHRINK SPACE COMPACT
Automaticautovacuum daemonManual or event schedulerAuto index maintenanceNot automatic
Wraparound riskYes (XID)NoNoNo
Bloat visibilitypg_stat_user_tables, pgstattupleinformation_schemasys.dm_db_index_physical_statsDBA_SEGMENTS

Summary

  • VACUUM reclaims dead tuple space from MVCC updates and deletes; without it, tables bloat and transaction ID wraparound eventually halts the database.
  • Autovacuum is the automatic solution but requires tuning — the default autovacuum_vacuum_scale_factor = 0.2 is too high for large tables.
  • Use per-table overrides with ALTER TABLE SET for high-write tables rather than changing global defaults.
  • Monitor pg_stat_user_tables for n_dead_tup and last_autovacuum, and pg_database for age(datfrozenxid).
  • Use pg_repack instead of VACUUM FULL on production tables to avoid downtime.
  • Prevent wraparound emergencies by keeping age(datfrozenxid) well below 1.5 billion on all databases.
  • Tune autovacuum_vacuum_cost_limit upward on fast storage to let autovacuum run at full speed.