What is a B-Tree
A B-tree is a self-balancing tree that stores many keys per node and stays shallow — typically 3 to 4 levels deep even with millions of entries. It was designed for systems where reading from disk is expensive, making it the standard data structure behind database indexes and file systems.
How it works
A binary search tree stores one key per node and has two children. A B-tree generalizes this: each node can hold hundreds or thousands of keys in a sorted array, with one child pointer between each pair of keys.
[30 | 70]
/ | \
[5|10|20] [40|50|60] [80|90|95]
To search for 50: start at the root, compare against 30 and 70 (50 is between them, follow the middle pointer), scan the child node and find 50. Two node reads — two disk I/O operations.
The key insight is disk alignment. A disk reads data in pages (typically 4KB-16KB). A B-tree node is sized to fit exactly one page, so reading a node means reading one page. A binary tree with a million entries is 20 levels deep (20 disk reads). A B-tree with a branching factor of 1000 is just 2-3 levels deep (2-3 disk reads). This is the difference between a query taking 20ms and 2ms.
B-trees are always balanced — insertions that would overflow a node cause it to split, and the split propagates upward if needed. Deletions that underflow a node cause merging or redistribution with siblings. The tree grows and shrinks from the root, not the leaves, so every leaf is always at the same depth.
B+ trees (the variant used by most databases) store all actual data in the leaves and keep only keys in internal nodes, allowing more keys per internal node and sequential scans across linked leaf nodes.
Why it matters
B-trees are the backbone of persistent storage. SQLite, PostgreSQL, MySQL, ext4, NTFS, and HFS+ all use B-trees or B+ trees. Every time a database query uses an index, it is traversing a B-tree. Understanding B-trees explains why database indexes make queries fast and why inserting into an indexed table has overhead.
See How Trees Work for the full progression from binary trees to B-trees.