What is a Primary Key
A primary key is a constraint that guarantees every row in a table has a unique, non-null identifier. It can be a single column (id) or a composite of multiple columns ((tenant_id, order_id)). Every relational database enforces primary key uniqueness, and most use the primary key to determine how data is physically organized on disk.
How it works
When you define a primary key, the database automatically creates a unique index on that column. In InnoDB (MySQL), the primary key index is the clustered index -- the table data itself is stored in primary key order within the B-tree's leaf pages. This means range scans on the primary key are extremely fast because rows with adjacent keys are stored on the same or nearby pages.
In PostgreSQL, the table is stored as a heap (rows are not ordered), and the primary key is a regular B-tree index that points back to heap locations. There is no clustered index by default, though CLUSTER can reorder the heap once.
There are two schools of primary key design. Surrogate keys are values with no business meaning -- auto-incrementing integers or UUIDs. They never change and are compact. Natural keys use real data -- an email address, an ISBN, or a composite like (country_code, phone_number). Natural keys avoid an extra lookup but can be wider and occasionally need updating.
Sequential keys (auto-increment, ULIDs) insert at the end of the B-tree, which avoids page splits and is friendly to the storage engine. Random keys (UUIDv4) scatter inserts across the tree, causing frequent page splits and worse cache utilization in the buffer pool.
Why it matters
The primary key determines the physical layout of your data in clustered-index databases. A poor choice -- random UUIDs in InnoDB, overly wide composite keys -- leads to write amplification, bloated secondary indexes, and degraded performance that is expensive to fix after the fact.
See How Indexes Work for the full walkthrough of B-trees, clustered indexes, and secondary indexes.