What is Vacuum
Vacuum is the PostgreSQL maintenance operation that reclaims disk space occupied by dead tuples -- row versions that are no longer visible to any active transaction. PostgreSQL implements MVCC by writing a new physical row version on every update and by leaving the old version in place on every update or delete, so obsolete versions accumulate over time. Without vacuum, a table keeps growing even if you only update existing rows.
How it works
When a row is updated, PostgreSQL does not overwrite the original. Instead, it stamps the old version with the ID of the updating transaction and inserts a new version. The old version must remain visible to any transaction whose snapshot was taken before the update committed (that is how snapshot isolation works). Once the update has committed and no active transaction can still see the old version, that version is dead and becomes eligible for cleanup.
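The visibility rule described above can be sketched as a toy model. This is a deliberate simplification, not PostgreSQL's actual logic: it assumes every transaction commits, that transactions commit in ID order, and that a snapshot is identified by a single transaction ID. The names `RowVersion`, `visible`, and `removable` are hypothetical; `xmin` and `xmax` mirror the names PostgreSQL uses for the creating and deleting transaction IDs stored on each row version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RowVersion:
    value: str
    xmin: int                    # ID of the transaction that created this version
    xmax: Optional[int] = None   # ID of the transaction that updated or deleted it


def visible(version: RowVersion, snapshot_xid: int) -> bool:
    """Toy rule: a snapshot sees only the work of transactions with smaller IDs."""
    created = version.xmin < snapshot_xid
    removed = version.xmax is not None and version.xmax < snapshot_xid
    return created and not removed


def removable(version: RowVersion, oldest_active_snapshot: int) -> bool:
    """A version is dead once even the oldest active snapshot cannot see it."""
    return version.xmax is not None and version.xmax < oldest_active_snapshot


# Transaction 10 inserts a row; transaction 20 updates it.
old = RowVersion("v1", xmin=10, xmax=20)
new = RowVersion("v2", xmin=20)

print(visible(old, 15), visible(new, 15))   # True False: snapshot 15 still needs v1
print(visible(old, 25), visible(new, 25))   # False True: snapshot 25 sees only v2
print(removable(old, 15))                   # False: an older snapshot is still active
print(removable(old, 25))                   # True: nobody can see v1 any more
```

The last two lines show why a single long-running transaction can hold back cleanup: as long as snapshot 15 is active, the old version cannot be removed.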
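Standard VACUUM scans the table, identifies dead tuples, and marks their space as reusable for future inserts and updates. It generally does not return space to the operating system -- the table file stays the same size, but the space inside it is available. The exception is completely empty pages at the end of the table, which VACUUM can truncate. VACUUM FULL rewrites the entire table into a new file and returns the freed space to the OS, but it holds an ACCESS EXCLUSIVE lock that blocks both reads and writes while it runs, and it needs enough free disk space for the new copy.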
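The difference between the two commands can be illustrated with a toy table in which the "file" is just a list of slots. This is a sketch under heavy assumptions: `ToyTable` and its methods are hypothetical, there are no pages, indexes, or locks, and it ignores the trailing-page truncation that a real VACUUM can perform.

```python
class ToyTable:
    """Toy heap: each slot holds a live row, a dead row, or free space."""

    def __init__(self):
        self.slots = []  # each entry is "live", "dead", or "free"

    def insert(self):
        # Reuse a free slot if one exists; otherwise the file has to grow.
        if "free" in self.slots:
            self.slots[self.slots.index("free")] = "live"
        else:
            self.slots.append("live")

    def update(self, i):
        # The old version is left behind as dead; the new version is inserted.
        self.slots[i] = "dead"
        self.insert()

    def vacuum(self):
        # Plain VACUUM: dead slots become free, the file length is unchanged.
        self.slots = ["free" if s == "dead" else s for s in self.slots]

    def vacuum_full(self):
        # VACUUM FULL: rewrite the table, keeping only live rows.
        self.slots = [s for s in self.slots if s == "live"]

    def file_size(self):
        return len(self.slots)


t = ToyTable()
for _ in range(4):
    t.insert()
for i in range(4):
    t.update(i)          # every row updated once
print(t.file_size())     # 8: four dead versions plus four live ones

t.vacuum()
print(t.file_size())     # 8: space is reusable, but the file did not shrink

t.insert()
print(t.file_size())     # 8: the new row went into a freed slot

t.vacuum_full()
print(t.file_size())     # 5: only live rows remain after the rewrite
```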
PostgreSQL runs autovacuum in the background. It monitors each table and triggers vacuum when the number of dead tuples exceeds a threshold (by default, 50 rows + 20% of the table's estimated row count, controlled by autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor). Vacuum also updates the visibility map, which tracks which pages contain only tuples visible to all transactions. Index-only scans rely on this map to avoid visiting the heap.
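The trigger condition is simple arithmetic, sketched below. The function names are hypothetical; the defaults mirror the PostgreSQL settings autovacuum_vacuum_threshold (50) and autovacuum_vacuum_scale_factor (0.2), and `reltuples` stands for the planner's estimate of the table's row count.

```python
def vacuum_trigger_point(reltuples: float,
                         base_threshold: int = 50,
                         scale_factor: float = 0.2) -> float:
    """Number of dead tuples above which autovacuum would vacuum the table."""
    return base_threshold + scale_factor * reltuples


def needs_vacuum(dead_tuples: int, reltuples: float,
                 base_threshold: int = 50,
                 scale_factor: float = 0.2) -> bool:
    """True once the dead-tuple count exceeds the trigger point."""
    return dead_tuples > vacuum_trigger_point(reltuples, base_threshold, scale_factor)


# With the defaults, a table of one million rows tolerates roughly
# 200,050 dead tuples before autovacuum steps in.
print(needs_vacuum(150_000, 1_000_000))                     # False
print(needs_vacuum(250_000, 1_000_000))                     # True

# Lowering the scale factor to 1% makes the same table qualify much sooner.
print(needs_vacuum(150_000, 1_000_000, scale_factor=0.01))  # True
```

Because the scale factor is a fraction of the row count, the absolute number of dead tuples tolerated grows with the table. That is one reason large, frequently updated tables are often given a lower scale factor than the default.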
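A critical secondary role of vacuum is preventing transaction ID wraparound. PostgreSQL uses 32-bit transaction IDs and compares them circularly, so relative to any given ID, about 2 billion IDs count as the past and about 2 billion as the future. If a row version were left unfrozen for more than about 2 billion transactions, its ID would flip from past to future and committed data would suddenly appear not to exist yet. Vacuum prevents this by freezing old row versions, which marks them as visible to all transactions regardless of their original ID. An aggressive autovacuum to prevent wraparound starts automatically, even when autovacuum is otherwise disabled, once a table's oldest unfrozen transaction ID is older than autovacuum_freeze_max_age (200 million transactions by default).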
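The wraparound problem comes down to modular arithmetic on 32-bit IDs, which the following sketch illustrates. It models the circular comparison only; `xid_precedes` and `advance` are hypothetical helper names, and the sketch ignores the handful of reserved special transaction IDs and the way frozen rows are actually flagged.

```python
XID_SPACE = 2**32    # 32-bit transaction IDs
HALF_SPACE = 2**31   # about 2 billion IDs on either side of any given ID


def xid_precedes(a: int, b: int) -> bool:
    """True if ID a counts as older than ID b under circular comparison.

    Equivalent to checking that the signed 32-bit difference (a - b) is negative.
    """
    return (a - b) % XID_SPACE >= HALF_SPACE


def advance(xid: int, n: int) -> int:
    """The transaction ID reached after n more transactions, with wraparound."""
    return (xid + n) % XID_SPACE


row_xmin = 1000  # ID of the transaction that wrote some committed row

# One million transactions later, the row is still in the past.
current = advance(row_xmin, 1_000_000)
print(xid_precedes(row_xmin, current))   # True

# Just over 2**31 transactions later, the comparison flips:
# the unfrozen row now looks as if it was written in the future.
current = advance(row_xmin, HALF_SPACE + 1)
print(xid_precedes(row_xmin, current))   # False
print(xid_precedes(current, row_xmin))   # True
```

Freezing sidesteps the comparison entirely: a frozen row is treated as older than every transaction, so its original ID no longer matters.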
Why it matters
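Vacuum is not optional. Neglecting autovacuum tuning leads to table bloat (tables can grow to many times, for example 10x, the size of their live data) and to degraded query performance from scanning pages full of dead rows. In the worst case, PostgreSQL protects itself against transaction ID wraparound by refusing to assign new transaction IDs, which blocks writes until the affected tables have been vacuumed. Understanding vacuum is essential for operating any PostgreSQL database.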
See How Transactions Work for the full walkthrough of MVCC, snapshots, and cleanup.