What is Consistent Hashing
Consistent hashing is a key distribution technique that maps both keys and nodes onto a circular hash space (a "ring"). When a node is added or removed, only a fraction of keys need to move -- roughly K/N where K is the number of keys and N is the number of nodes. With naive hash(key) % N partitioning, adding a single node remaps nearly every key. Consistent hashing solves this problem.
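The remapping problem with mod-N partitioning is easy to demonstrate. This is an illustrative sketch (the key names and node counts are arbitrary choices, not from the text): it assigns keys with `hash(key) % N` and counts how many keys change owner when the cluster grows from 10 to 11 nodes.

```python
# Sketch: with hash(key) % N partitioning, growing the cluster from
# N=10 to N=11 changes the owner of almost every key.
import hashlib

def owner(key: str, n_nodes: int) -> int:
    # Use a stable hash (md5) so the result is the same across runs,
    # unlike Python's built-in hash() which is salted per process.
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return h % n_nodes

keys = [f"key-{i}" for i in range(10_000)]
moved = sum(owner(k, 10) != owner(k, 11) for k in keys)
print(f"{moved / len(keys):.0%} of keys changed owner")
```

In expectation only 1/11 of keys keep the same owner by coincidence, so roughly 90% move. Consistent hashing exists to avoid exactly this.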
How it works
Imagine a ring of values from 0 to 2^32 - 1, where positions wrap around. Each node is hashed to a position on the ring. Each key is also hashed to a position on the ring, and it is assigned to the first node encountered when walking clockwise from the key's position. This is the key's owner.
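The clockwise walk can be sketched with a sorted list of node positions and a binary search. This is a minimal illustration, not a production implementation; the hash function and node names are assumptions for the example.

```python
# Minimal consistent-hash ring (no virtual nodes yet): nodes and keys
# share one hash space; a key belongs to the first node clockwise.
import bisect
import hashlib

def ring_hash(value: str) -> int:
    """Hash a string to a position on the 0 .. 2**32 - 1 ring."""
    return int(hashlib.md5(value.encode()).hexdigest(), 16) % 2**32

class Ring:
    def __init__(self, nodes):
        # Sorted (position, node) pairs represent the ring.
        self.points = sorted((ring_hash(n), n) for n in nodes)

    def owner(self, key: str) -> str:
        pos = ring_hash(key)
        # First node at or after the key's position; the modulo
        # wraps past the end of the list back to the ring's start.
        i = bisect.bisect_left(self.points, (pos, ""))
        return self.points[i % len(self.points)][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.owner("user:42"))
```

The sorted-list-plus-bisect structure makes each lookup O(log N); real implementations often use a tree map or a precomputed table for the same effect.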
When a new node joins, it takes over a portion of the ring from its clockwise neighbor. Only the keys in that portion move. All other keys stay where they are. When a node leaves, its keys move to the next clockwise node. In both cases, only about 1/N of the keys are affected on average.
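The join behavior above can be checked directly: every key that moves must move to the new node, and the rest stay put. This sketch (node and key names are arbitrary) compares ownership before and after a fourth node joins a three-node ring.

```python
# Sketch: adding a fourth node moves only the keys in the arc the new
# node takes over; every moved key lands on the new node.
import bisect
import hashlib

def ring_hash(v: str) -> int:
    return int(hashlib.md5(v.encode()).hexdigest(), 16) % 2**32

def owner(key: str, nodes) -> str:
    points = sorted((ring_hash(n), n) for n in nodes)
    i = bisect.bisect_left(points, (ring_hash(key), ""))
    return points[i % len(points)][1]

keys = [f"key-{i}" for i in range(10_000)]
before = {k: owner(k, ["a", "b", "c"]) for k in keys}
after = {k: owner(k, ["a", "b", "c", "d"]) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.0%} of keys moved, all to the new node")
```

With a single hash point per node the moved fraction varies around 1/N from run to run of the node names, which is exactly the imbalance that virtual nodes (below) smooth out.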
A basic consistent hashing ring has a problem: nodes are not evenly distributed. One node might own a huge arc of the ring while another owns a sliver. The solution is virtual nodes (vnodes). Each physical node is hashed to multiple positions on the ring -- 100 or more. This spreads each node's responsibility across many small arcs, and the law of large numbers yields an approximately even distribution. When a node leaves, its load is distributed across many other nodes rather than dumped onto a single neighbor.
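Virtual nodes are a small change to the basic ring: hash each node many times, under distinct labels, and record which physical node each point belongs to. A sketch, assuming 100 vnodes per node and a `node#index` labeling scheme (both are illustrative choices):

```python
# Sketch of virtual nodes: each physical node claims many small arcs,
# which evens out the load across nodes.
import bisect
import hashlib
from collections import Counter

def ring_hash(v: str) -> int:
    return int(hashlib.md5(v.encode()).hexdigest(), 16) % 2**32

class VNodeRing:
    def __init__(self, nodes, vnodes=100):
        # Each physical node appears at `vnodes` positions, labeled
        # "node#0", "node#1", ... so the positions differ.
        self.points = sorted(
            (ring_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )

    def owner(self, key: str) -> str:
        i = bisect.bisect_left(self.points, (ring_hash(key), ""))
        return self.points[i % len(self.points)][1]

ring = VNodeRing(["node-a", "node-b", "node-c"])
load = Counter(ring.owner(f"key-{i}") for i in range(30_000))
print(load)
```

With 100 vnodes per node, each node's share of 30,000 keys lands close to the ideal 10,000; with one point per node it can easily be off by a factor of two or more. The vnode count trades metadata size and rebalancing granularity against evenness.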
Amazon's Dynamo, Apache Cassandra, Riak, and many other distributed key-value stores use consistent hashing for partitioning. Memcached clients use it to distribute cache keys across servers. Load balancers use it to route requests from the same client to the same backend, maintaining session affinity without shared state.
Why it matters
Consistent hashing is what allows distributed systems to scale horizontally without massive data reshuffling. Adding a node to a 100-node cluster moves roughly 1% of the data instead of nearly all of it. This is the difference between a cluster that can grow seamlessly under load and one that must take downtime to rebalance.
See How Partitioning Works for the full walkthrough of hash partitioning, range partitioning, and rebalancing strategies.