How Transactions Work — ACID and Database Correctness

How Transactions Work — ACID and Database Correctness

2026-03-23

A bank transfer moves $100 from Alice to Bob. Two operations: deduct from Alice, add to Bob. If the system crashes between the two operations, Alice loses $100 and Bob gets nothing. This is why transactions exist.

A transaction groups multiple operations into a single logical unit: either all operations succeed, or none of them happen. No partial states. No in-between.

What Is ACID?

ACID is the set of guarantees a transaction provides:

Atomicity — all operations in the transaction succeed or all are rolled back. "Atomic" means indivisible — there's no state where half the operations completed. If the system crashes mid-transaction, the database recovers to the state before the transaction started.

Consistency — the transaction moves the database from one valid state to another. Constraints (foreign keys, unique constraints, check constraints) are enforced. A transaction that would violate a constraint is rejected.

Isolation — concurrent transactions don't interfere with each other. Transaction A doesn't see Transaction B's uncommitted changes. The result is the same as if transactions ran one after another — even though they actually run in parallel.

Durability — once a transaction is committed, it survives crashes. The data is written to disk (via WAL) before the commit returns. Power failure, kernel panic, hardware failure — committed data persists.

How Does Isolation Work?

Isolation is the hardest property to implement because databases serve many clients simultaneously. Without isolation, concurrent transactions can produce anomalies:

Dirty read — Transaction A reads data that Transaction B has written but not yet committed. If B rolls back, A read data that never existed.

Non-repeatable read — Transaction A reads a row, Transaction B modifies it and commits, then A reads it again and gets a different value. The same query returns different results within the same transaction.

Phantom read — Transaction A queries "all users where age > 18", Transaction B inserts a new user with age 20 and commits, then A runs the same query and gets an extra row that wasn't there before.

Lost update — Transactions A and B both read a value, both modify it, both write it back. One update overwrites the other. This is the race condition of databases.

What Are Isolation Levels?

SQL defines four isolation levels, each preventing more anomalies:

LevelDirty readsNon-repeatable readsPhantom readsPerformance
Read UncommittedPossiblePossiblePossibleFastest
Read CommittedPreventedPossiblePossibleFast
Repeatable ReadPreventedPreventedPossibleModerate
SerializablePreventedPreventedPreventedSlowest

Read Committed is the default in PostgreSQL. Each query sees only committed data, but consecutive queries within a transaction may see different committed data.

Repeatable Read is the default in MySQL InnoDB. A transaction sees a snapshot of the database as it was at the transaction start. Changes committed by other transactions after that point are invisible.

Serializable guarantees the result is identical to running all transactions one after another. The strongest guarantee, but also the most expensive — it may reject transactions that would conflict.

Most applications use Read Committed or Repeatable Read. Serializable is used when correctness is critical and the workload can tolerate retries.

How Does MVCC Work?

Multi-Version Concurrency Control (MVCC) is how PostgreSQL, MySQL, Oracle, and most modern databases implement isolation without locking every row.

Instead of blocking readers when a writer is active, MVCC keeps multiple versions of each row:

  1. When Transaction A updates a row, the old version is kept. Transaction B still sees the old version (based on B's snapshot).
  2. When A commits, new transactions see A's version. B continues seeing the old version until B ends.
  3. When no transaction needs the old version anymore, it's garbage collected (VACUUM in PostgreSQL).

The result: readers don't block writers, writers don't block readers. Only writers block other writers (when both modify the same row). This is how databases achieve high concurrency.

The cost: dead row versions accumulate until vacuumed. PostgreSQL's VACUUM process cleans them up. If VACUUM falls behind, table bloat increases and query performance degrades.

What Is a Deadlock in Databases?

The same deadlock problem from threads appears in databases:

Transaction A: UPDATE accounts SET balance = balance - 100 WHERE id = 1;
Transaction B: UPDATE accounts SET balance = balance - 50 WHERE id = 2;
Transaction A: UPDATE accounts SET balance = balance + 100 WHERE id = 2;  -- waits for B
Transaction B: UPDATE accounts SET balance = balance + 50 WHERE id = 1;   -- waits for A
-- Deadlock: A waits for B, B waits for A

Databases detect deadlocks automatically and kill one of the transactions (the victim), which gets rolled back. The application should retry the transaction.

Prevention: access rows in a consistent order (always lock account with the lower ID first), or use shorter transactions that hold locks briefly.

What Does BEGIN / COMMIT / ROLLBACK Do?

BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE id = 1;
  UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;

BEGIN — starts a transaction. All subsequent operations are part of it. COMMIT — makes all changes permanent. After commit, the changes are durable. ROLLBACK — undoes all changes since BEGIN. The database returns to the pre-transaction state.

If the application crashes between BEGIN and COMMIT, the database automatically rolls back the transaction on recovery. This is atomicity.

Transactions Beyond SQL

The concept extends beyond traditional databases:

Redis transactionsMULTI/EXEC groups commands. All execute atomically, but there's no rollback — if one fails, the rest still execute.

Message queues — consuming a message from a queue and writing to a database should be atomic. If the write fails, the message should not be acknowledged. Outbox patterns and exactly-once delivery address this.

Distributed transactions — when a transaction spans multiple databases or services, coordination is much harder. Two-phase commit (2PC) and saga patterns address this. Covered in distributed systems.

Next Steps