
How Query Execution Works — From SQL to Results
When you write SELECT * FROM orders WHERE user_id = 5 ORDER BY created_at, the database doesn't execute it literally. It parses the SQL, considers multiple execution strategies, estimates the cost of each, and picks the cheapest one. This process — query planning and optimization — is what makes databases fast despite a declarative query language.
What Are the Stages of Query Execution?
1. Parse — the SQL string is converted into a syntax tree. SELECT, FROM, WHERE, ORDER BY become tree nodes. Syntax errors are caught here.
2. Analyze — table names, column names, and types are resolved against the catalog (the database's metadata). "Does the table orders exist? Does it have a column user_id?" Missing tables and columns are caught here.
3. Plan — the query planner generates candidate execution plans and estimates their cost. Should it use an index or a sequential scan? Which join algorithm? What order to join tables?
4. Optimize — the planner picks the plan with the lowest estimated cost.
5. Execute — the chosen plan is executed, reading data from the storage engine and returning results.
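The first two stages are easy to observe from a client: a malformed string fails at parse, while a syntactically valid query against a missing table fails at analyze. A minimal sketch using Python's built-in sqlite3 (SQLite runs the same pipeline):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Parse stage: the string itself is malformed, so the syntax tree
# cannot be built.
try:
    conn.execute("SELEC * FROM orders")
except sqlite3.OperationalError as e:
    print(e)  # near "SELEC": syntax error

# Analyze stage: valid syntax, but the table is not in the catalog.
try:
    conn.execute("SELECT * FROM orders")
except sqlite3.OperationalError as e:
    print(e)  # no such table: orders
```

Both failures surface as errors before any data is read; only a query that survives both stages reaches the planner.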
How Does the Query Planner Choose?
The planner maintains statistics about each table:
- Row count (how many rows total)
- Column statistics (distinct values, most common values, histogram of value distribution)
- Correlation (are physically adjacent rows also logically adjacent?)
Using these statistics, the planner estimates how many rows each operation will produce and how many pages must be read. It then assigns a cost (in abstract units proportional to disk I/O and CPU time).
For WHERE user_id = 5:
- Sequential scan: read all pages, filter rows. Cost = total pages.
- Index scan: traverse the B-tree index on user_id, fetch matching rows. Cost = tree height + matching rows.
If user_id = 5 matches 10 rows out of 1 million, the index scan is vastly cheaper. If it matches 900,000 out of 1 million, the sequential scan is cheaper (reading all pages sequentially is faster than 900,000 random index lookups).
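The crossover can be sketched with a toy cost model. All the constants below are illustrative, not real planner values (they are loosely inspired by the idea that a random index fetch costs several times a sequential page read):

```python
# Toy cost model: sequential pages are cheap to read in bulk;
# each index match costs a random row fetch.
TOTAL_PAGES = 10_000       # pages in the orders table
SEQ_PAGE_COST = 1.0        # cost of one sequential page read
RANDOM_FETCH_COST = 4.0    # cost of one random fetch via the index
TREE_HEIGHT = 3            # B-tree levels to descend

def seq_scan_cost() -> float:
    return TOTAL_PAGES * SEQ_PAGE_COST

def index_scan_cost(matching_rows: int) -> float:
    return TREE_HEIGHT + matching_rows * RANDOM_FETCH_COST

for rows in (10, 900_000):
    cheaper = "index scan" if index_scan_cost(rows) < seq_scan_cost() else "seq scan"
    print(f"{rows:>7} matching rows -> {cheaper}")
```

With 10 matching rows the index scan wins by orders of magnitude; with 900,000 the random fetches dwarf the cost of reading every page once.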
The planner makes this decision automatically based on statistics. This is why ANALYZE (updating statistics) is important — stale statistics lead to bad plans.
How Do Joins Work?
Joining two tables is the most complex operation a database performs. Three algorithms:
Nested Loop Join — for each row in the outer table, scan the inner table for matches. O(n × m) without an index. With an index on the join column of the inner table, it's O(n × log m) — the index lookup replaces the full inner scan.
Best for: small outer table joined to a large indexed inner table.
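A sketch of the basic (unindexed) variant in Python, with rows as dicts and illustrative table data:

```python
def nested_loop_join(outer, inner, key):
    # For each outer row, scan the whole inner table: O(n * m).
    # A real database replaces the inner scan with an index lookup
    # when one exists, cutting this to O(n * log m).
    for o in outer:
        for i in inner:
            if o[key] == i[key]:
                yield o, i

orders = [{"user_id": 1, "total": 50}, {"user_id": 2, "total": 75}]
users = [{"user_id": 1, "name": "ada"}, {"user_id": 2, "name": "bob"}]
print(list(nested_loop_join(orders, users, "user_id")))
```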
Hash Join — build a hash map of the smaller table's join column. Then scan the larger table and probe the hash map. O(n + m) — one pass through each table.
Best for: large tables without useful indexes. Requires memory to hold the hash map.
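The build/probe structure can be sketched directly, again with rows as dicts (table and column names here are illustrative):

```python
def hash_join(build_side, probe_side, build_key, probe_key):
    # Build phase: hash the smaller table's join column into memory.
    table = {}
    for row in build_side:
        table.setdefault(row[build_key], []).append(row)
    # Probe phase: one pass over the larger table with O(1) lookups,
    # so the whole join is O(n + m).
    for row in probe_side:
        for match in table.get(row[probe_key], []):
            yield match, row

users = [{"id": 1, "name": "ada"}, {"id": 2, "name": "bob"}]
orders = [{"user_id": 1, "total": 50},
          {"user_id": 2, "total": 75},
          {"user_id": 1, "total": 20}]
print(len(list(hash_join(users, orders, "id", "user_id"))))  # 3
```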
Merge Join — sort both tables by the join column, then merge them (like the merge step of merge sort). O(n log n + m log m) for the sorts, O(n + m) for the merge. If both tables are already sorted (from indexes), the sort is free.
Best for: large tables that are already sorted on the join column.
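The merge step can be sketched as two cursors advancing in lockstep; both inputs must already be sorted on the join key (duplicate keys pair up as a small cross product):

```python
def merge_join(left, right, key):
    # Both inputs sorted on `key`; one forward pass over each.
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Equal keys: pair this left row with the whole run of
            # matching right rows, then advance the left side.
            k = j
            while k < len(right) and right[k][key] == lk:
                yield left[i], right[k]
                k += 1
            i += 1

left = [{"id": 1}, {"id": 2}, {"id": 2}, {"id": 4}]
right = [{"id": 2}, {"id": 2}, {"id": 3}]
print(len(list(merge_join(left, right, "id"))))  # 4 pairs: 2 x 2 for id=2
```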
| Algorithm | Time | Memory | Best when |
|---|---|---|---|
| Nested Loop | O(n × log m) with index | O(1) | Small outer, indexed inner |
| Hash Join | O(n + m) | O(smaller table) | Large tables, no useful index |
| Merge Join | O(n + m) if pre-sorted | O(1) if pre-sorted | Both tables sorted by join key |
The planner picks the join algorithm automatically based on table sizes, available indexes, and sort order.
What Does EXPLAIN Show?
EXPLAIN reveals the query plan without executing it. EXPLAIN ANALYZE executes the query and shows actual timing:
```sql
EXPLAIN ANALYZE
SELECT o.id, u.name
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending'
ORDER BY o.created_at;
```

```
Sort  (cost=450..455 rows=20)
  Sort Key: o.created_at
  ->  Hash Join  (cost=10..400 rows=20)
        Hash Cond: (o.user_id = u.id)
        ->  Bitmap Index Scan on orders_status_idx  (cost=0..50 rows=20)
              Index Cond: (status = 'pending')
        ->  Hash  (cost=5..5 rows=100)
              ->  Seq Scan on users  (cost=0..5 rows=100)
```
Reading bottom-up:
- Seq Scan on users — scan the entire users table (100 rows, small — no index needed).
- Hash — build a hash map of users by id.
- Bitmap Index Scan on orders_status_idx — use the index on status to find pending orders (20 rows).
- Hash Join — match pending orders to users via the hash map.
- Sort — sort by created_at (only 20 rows, fast in memory).
The cost numbers are estimates. EXPLAIN ANALYZE adds actual execution time. The gap between estimated and actual rows reveals stale statistics.
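You can experiment without a database server: SQLite ships an equivalent, EXPLAIN QUERY PLAN, reachable from Python's built-in sqlite3. A minimal sketch (the exact output text varies across SQLite versions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE INDEX orders_status_idx ON orders (status)")

# Ask the planner how it would run the query, without running it.
for row in conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM orders WHERE status = 'pending'"
):
    print(row)
# The plan's detail column names orders_status_idx, confirming
# the planner chose the index over a full table scan.
```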
Common Query Performance Problems
Missing index — Seq Scan on a large table when an Index Scan is expected. Fix: add an index on the WHERE column.
Wrong join order — the planner joins tables in a suboptimal order. Rare with good statistics, but possible with complex queries involving 5+ tables. Fix: update statistics with ANALYZE, or hint the join order.
Sorting large result sets — ORDER BY without an index requires an in-memory or disk sort. If the sort exceeds work_mem, it spills to disk (dramatically slower). Fix: add an index that matches the sort order, or increase work_mem.
N+1 queries — the application runs 1 query for the list, then N queries for each item's details. Each query is fast, but 1,000 queries add up. Fix: use a JOIN or batch the lookups.
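The pattern and its fix can be sketched with sqlite3 (table contents are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users  VALUES (1, 'ada'), (2, 'bob'), (3, 'cyd');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 3);
""")

# N+1 pattern: one query for the list, then one per order for the name.
orders = conn.execute("SELECT id, user_id FROM orders").fetchall()
round_trips = 1
for _, user_id in orders:
    conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    round_trips += 1
print(round_trips)  # 4 round trips for 3 orders

# Fix: a single JOIN returns the same data in one round trip.
rows = conn.execute(
    "SELECT o.id, u.name FROM orders o JOIN users u ON o.user_id = u.id"
).fetchall()
print(len(rows))  # 3
```

With 1,000 orders the loop version costs 1,001 round trips; the JOIN still costs one.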
Index not used despite existing — the planner estimates a sequential scan is cheaper. This happens when the query matches too many rows (the index isn't selective enough) or when statistics are stale. Check EXPLAIN to confirm.
Next Steps
This completes the databases learning path. You now understand:
- How Storage Engines Work — how data lives on disk
- How Indexes Work — B-trees that make lookups O(log n)
- How Transactions Work — ACID guarantees
- How WAL Works — crash recovery and durability
- How Query Execution Works — from SQL to results
For related topics: How Sorting Works (merge join uses merge sort), How Hash Maps Work (hash join builds a hash map), and How Trees Work (B-trees power indexes).