What is a Join

A join is a SQL operation that combines rows from two or more tables based on a related column -- typically a foreign key. When you write SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id, the database finds every pair of rows where the condition matches and returns them as a single result. Joins are fundamental to relational databases because data is normalized across multiple tables.

How it works

The database's query planner chooses a physical join algorithm based on table sizes, available indexes, and sort order. The three main algorithms are:

Nested loop join. For each row in the outer table, scan the inner table for matching rows. With an index on the join column of the inner table, this is efficient for small result sets -- the index turns each inner lookup into a B-tree traversal instead of a full scan. Without an index, performance degrades to O(n * m).

Hash join. Build a hash table from the smaller table (the build side), then scan the larger table (the probe side) and look up each row in the hash table. This is the fastest algorithm for large unsorted tables without indexes. It requires enough memory to hold the hash table; if the build side exceeds available memory, the database spills to disk.

Merge join (sort-merge join). Sort both tables on the join column, then walk through both sorted lists in parallel, matching rows as they align. This is efficient when both inputs are already sorted -- for example, when reading from an index that provides the correct order.

The join type also matters: INNER JOIN returns only matching rows, LEFT JOIN returns all rows from the left table plus matches from the right, and FULL OUTER JOIN returns all rows from both tables. The physical algorithm is independent of the join type.

Why it matters

Joins are the most expensive operations in most SQL queries. Understanding which join algorithm the database chose -- and why -- is how you diagnose slow queries and decide which indexes to create. A missing index on a join column can turn a millisecond query into a minutes-long table scan.

See How Query Execution Works for the full pipeline from SQL to result set, including join ordering and optimization.