What is a Checkpoint

A checkpoint is a database operation that flushes dirty pages (pages modified in memory but not yet written to disk) from the buffer pool to the data files and records the corresponding position in the write-ahead log (WAL). Once a checkpoint completes, crash recovery no longer needs the WAL entries written before that position, so they can be recycled or removed. This is what keeps the WAL bounded in size rather than growing forever.

How it works

During normal operation, writes go to the WAL and modify pages in memory. The on-disk data files may lag behind -- they can hold stale data for any page modified since the last checkpoint. If the database crashes, recovery replays the WAL from the most recent checkpoint forward to bring the data files up to date.
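The replay step can be illustrated with a toy model. This is a minimal sketch, not how any real engine stores its log: the WAL is assumed to be a list of (lsn, page_id, value) tuples, the data files a plain dict, and the names recover and redo_lsn are hypothetical.

```python
# Toy model (assumption): the WAL is a list of (lsn, page_id, value) records
# and the data files are a dict mapping page_id -> value as last written to disk.

def recover(data_files, wal, redo_lsn):
    """Replay every WAL record at or after redo_lsn onto a copy of the data files."""
    recovered = dict(data_files)
    for lsn, page_id, value in wal:
        if lsn >= redo_lsn:
            recovered[page_id] = value
    return recovered

wal = [
    (1, "A", "a1"),
    (2, "B", "b1"),
    (3, "A", "a2"),  # written after the last checkpoint started
    (4, "C", "c1"),  # written after the last checkpoint started
]

# State on disk at crash time: the last checkpoint started at LSN 3, so the
# effects of records 1 and 2 are on disk; later changes are only in the WAL.
on_disk = {"A": "a1", "B": "b1"}

print(recover(on_disk, wal, redo_lsn=3))
# prints {'A': 'a2', 'B': 'b1', 'C': 'c1'}
```

Records 1 and 2 are skipped because the checkpoint already made their effects durable; only the tail of the log has to be replayed.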

A checkpoint proceeds in stages:

  1. Mark the start -- record the current WAL position as the checkpoint's starting point.
  2. Flush dirty pages -- write all dirty pages from the buffer pool to the data files. This is spread over time to avoid overwhelming disk I/O (called a "spread checkpoint" or "fuzzy checkpoint").
  3. Record completion -- write the checkpoint location to a control file or WAL record. Recovery will now start from the position recorded in step 1, because pages modified while the flush was running may not have reached disk.
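The stages above can be sketched as a toy in-memory model. Everything here is an assumption made for illustration: the ToyDatabase class and its fields are hypothetical, the flush happens in one go rather than being spread over time, and no writes arrive while the checkpoint runs.

```python
class ToyDatabase:
    """Hypothetical single-threaded model of a WAL, buffer pool, and data files."""

    def __init__(self):
        self.wal = []          # list of (lsn, page_id, value) records
        self.buffer_pool = {}  # page_id -> value held in memory
        self.dirty = set()     # page_ids modified since they were last flushed
        self.data_files = {}   # page_id -> value as stored on disk
        self.redo_lsn = 1      # LSN where crash recovery would start
        self.next_lsn = 1

    def write(self, page_id, value):
        # Log the change first, then modify the page in memory.
        self.wal.append((self.next_lsn, page_id, value))
        self.next_lsn += 1
        self.buffer_pool[page_id] = value
        self.dirty.add(page_id)

    def checkpoint(self):
        # 1. Mark the start: remember the current WAL position.
        start_lsn = self.next_lsn
        # 2. Flush dirty pages to the data files.
        for page_id in sorted(self.dirty):
            self.data_files[page_id] = self.buffer_pool[page_id]
        self.dirty.clear()
        # 3. Record completion: recovery now starts from the marked position.
        self.redo_lsn = start_lsn
        # WAL records before the recovery starting point are no longer needed.
        self.wal = [rec for rec in self.wal if rec[0] >= self.redo_lsn]


db = ToyDatabase()
db.write("A", "a1")
db.write("B", "b1")
db.checkpoint()
db.write("A", "a2")

print(db.data_files)  # prints {'A': 'a1', 'B': 'b1'}
print(db.wal)         # prints [(3, 'A', 'a2')]
```

After the checkpoint, the data files hold the first two writes and the WAL has been trimmed to the single record written afterwards, which is all recovery would need to replay.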

In PostgreSQL, checkpoints are triggered by any of three conditions: a time interval (checkpoint_timeout, default 5 minutes), a volume of WAL written (max_wal_size, default 1 GB), or an explicit CHECKPOINT command. The checkpoint_completion_target setting controls how gradually dirty pages are flushed: it gives the fraction of the interval between checkpoints over which the writes are spread, which smooths the I/O load.
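The trigger logic can be approximated in a few lines. This is a simplified sketch, not PostgreSQL's actual code: the function names are hypothetical, the WAL threshold is treated as exactly max_wal_size (the real trigger point is derived from it), and the completion target of 0.9 is an illustrative value.

```python
# Defaults mirror the values quoted above:
# checkpoint_timeout = 5 minutes, max_wal_size = 1 GB.
def checkpoint_reason(elapsed_s, wal_bytes, requested=False,
                      timeout_s=300, max_wal_bytes=1024**3):
    """Return why a checkpoint should start now, or None if none is due."""
    if requested:
        return "explicit CHECKPOINT"
    if elapsed_s >= timeout_s:
        return "checkpoint_timeout"
    if wal_bytes >= max_wal_bytes:
        return "max_wal_size"
    return None


def flush_window_s(timeout_s=300, completion_target=0.9):
    """Seconds over which a time-triggered checkpoint spreads its writes."""
    return timeout_s * completion_target


print(checkpoint_reason(elapsed_s=120, wal_bytes=200 * 1024**2))  # prints None
print(checkpoint_reason(elapsed_s=310, wal_bytes=200 * 1024**2))  # prints checkpoint_timeout
print(checkpoint_reason(elapsed_s=120, wal_bytes=2 * 1024**3))    # prints max_wal_size
print(flush_window_s())
```

With these illustrative numbers, a checkpoint started by the timer would aim to finish its writes in roughly 270 of the 300 seconds before the next one is due.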

In InnoDB (MySQL), the equivalent process is driven by the adaptive flushing algorithm, which monitors the rate of redo log generation and adjusts flushing speed to keep up. The checkpoint age -- the distance between the oldest unflushed change and the current position in the redo log (InnoDB's WAL) -- must stay within the redo log's capacity, or InnoDB stalls writes until enough dirty pages have been flushed.
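The checkpoint-age constraint can be expressed as simple arithmetic on log positions. This is a sketch under stated assumptions: the function names are hypothetical, and the 75% and 90% thresholds are made-up illustrative values, not InnoDB's actual flush points.

```python
def checkpoint_age(current_lsn, oldest_unflushed_lsn):
    """Bytes of log between the oldest unflushed change and the log head."""
    return current_lsn - oldest_unflushed_lsn


def flush_pressure(age, log_capacity, speed_up_at=0.75, stall_at=0.90):
    """Classify how urgently dirty pages must be flushed (thresholds are illustrative)."""
    ratio = age / log_capacity
    if ratio >= stall_at:
        return "stall writes"
    if ratio >= speed_up_at:
        return "flush faster"
    return "normal"


capacity = 1000  # hypothetical redo log capacity, in bytes
print(flush_pressure(checkpoint_age(5400, 5000), capacity))  # prints normal
print(flush_pressure(checkpoint_age(5800, 5000), capacity))  # prints flush faster
print(flush_pressure(checkpoint_age(5950, 5000), capacity))  # prints stall writes
```

The point of the graduated response is that flushing speeds up well before the log fills, so a hard stall is the last resort rather than the normal mode of operation.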

Why it matters

Checkpoints determine two critical things: how much WAL must be retained, and how long crash recovery takes. Infrequent checkpoints mean faster steady-state performance, since a page modified many times between checkpoints is written to disk only once, but they leave more WAL to retain and replay, so recovery after a crash takes longer. Frequent checkpoints mean shorter recovery but more background I/O. Tuning checkpoint intervals is a fundamental part of database administration.
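A back-of-the-envelope calculation makes the trade-off concrete. This is a rough sketch, assuming WAL is generated and replayed at constant rates; the function name and the rates of 10 MB/s and 50 MB/s are hypothetical figures chosen only for illustration.

```python
def worst_case_recovery(wal_rate_mb_s, checkpoint_interval_s, replay_rate_mb_s):
    """Estimate WAL to replay (MB) and replay time (s) for a crash just before a checkpoint."""
    wal_to_replay_mb = wal_rate_mb_s * checkpoint_interval_s
    recovery_s = wal_to_replay_mb / replay_rate_mb_s
    return wal_to_replay_mb, recovery_s


# Hypothetical workload: 10 MB/s of WAL generated, 50 MB/s replay speed.
for interval_s in (60, 300, 1800):
    wal_mb, recovery_s = worst_case_recovery(10, interval_s, 50)
    print(f"interval={interval_s}s  wal_to_replay={wal_mb} MB  recovery={recovery_s} s")
```

Under these assumptions both the WAL that must be kept and the worst-case recovery time grow linearly with the checkpoint interval: a 30-minute interval needs 30 times the replay of a 1-minute interval.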

See How WAL Works for the full walkthrough of write-ahead logging, checkpoints, and recovery.