What is a Hash Map
A hash map (also called a hash table or dictionary) stores key-value pairs and retrieves any value in O(1) average time. You give it a key, it gives you the value — no scanning, no sorting, no tree traversal.
How it works
A hash map combines three components:
- An array of buckets — the underlying storage.
- A hash function — converts any key into an integer.
- A collision resolution strategy — handles cases where two keys map to the same bucket.
To store a key-value pair, the hash map passes the key through the hash function, which produces an integer. That integer is reduced to an array index (typically via modulo), and the pair is stored in that bucket.
hash("name") → 7392841 → 7392841 % 16 → bucket 9
hash("city") → 2048153 → 2048153 % 16 → bucket 9 ← collision!
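The hash-to-index reduction can be sketched in a few lines of Python. Python's built-in `hash` for strings is randomized per process, so this sketch uses FNV-1a, a simple deterministic string hash, purely for illustration; the table size of 16 matches the example above. (`fnv1a` and `bucket_index` are names invented here, not library functions.)

```python
def fnv1a(key: str) -> int:
    """FNV-1a: a small, deterministic string hash (64-bit variant)."""
    h = 0xCBF29CE484222325                 # FNV offset basis
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * 0x100000001B3) % 2**64    # multiply by FNV prime, keep 64 bits
    return h

NUM_BUCKETS = 16

def bucket_index(key: str) -> int:
    # Reduce the large hash value to an array index via modulo.
    return fnv1a(key) % NUM_BUCKETS

print(bucket_index("name"), bucket_index("city"))
```

Whatever hash function is used, the modulo step is what maps an arbitrarily large integer onto a valid array index.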
Retrieval works the same way: hash the key, jump to the bucket, return the value. This is O(1) on average because computing the hash and indexing into the array are both constant-time operations; collisions can add a short scan within a bucket, which is why the worst case is slower.
When two keys land in the same bucket (a collision), the hash map needs a fallback strategy. The two most common approaches are chaining (each bucket holds a linked list of entries) and open addressing (probe nearby buckets until an empty one is found).
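The chaining approach can be sketched as a class whose buckets are lists of key-value pairs; `ChainedMap` and its methods are illustrative names, not any particular library's API, and resizing is omitted to keep the sketch short.

```python
class ChainedMap:
    """Minimal hash map using chaining: each bucket is a list of (key, value) pairs."""

    def __init__(self, num_buckets: int = 16):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key) -> int:
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:                    # key already present: overwrite in place
                bucket[i] = (key, value)
                return
        bucket.append((key, value))         # new key: append to this bucket's chain

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:  # scan only one bucket's chain
            if k == key:
                return v
        raise KeyError(key)

m = ChainedMap()
m.put("name", "Ada")
m.put("city", "London")
print(m.get("name"))   # → Ada
```

Even when two keys collide into the same bucket, `get` only has to scan that one short chain, not the whole table, which is what keeps average lookups fast.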
As the hash map fills up, collisions become more frequent and performance degrades. Most implementations resize (double the array and rehash everything) when the load factor exceeds a threshold, typically around 0.75.
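The resize step above can be sketched as a standalone function: when the load factor (entries divided by buckets) crosses 0.75, allocate a table twice as large and rehash every entry, since each entry's index depends on the table size. The function and variable names here are illustrative.

```python
LOAD_FACTOR_THRESHOLD = 0.75

def maybe_resize(buckets: list, num_entries: int) -> list:
    """Double the table and rehash all entries when the load factor is exceeded."""
    if num_entries / len(buckets) <= LOAD_FACTOR_THRESHOLD:
        return buckets                          # still under the threshold: no work
    new_buckets = [[] for _ in range(2 * len(buckets))]
    for bucket in buckets:
        for key, value in bucket:
            # Re-place every entry: its bucket index depends on the new table size.
            new_buckets[hash(key) % len(new_buckets)].append((key, value))
    return new_buckets

# 13 entries in 16 buckets → load factor 0.8125 > 0.75, so the table doubles.
buckets = [[] for _ in range(16)]
for i in range(13):
    key = f"key{i}"
    buckets[hash(key) % 16].append((key, i))
buckets = maybe_resize(buckets, 13)
print(len(buckets))   # → 32
```

Rehashing every entry makes a single resize O(n), but because the table doubles each time, resizes are rare enough that inserts stay O(1) amortized.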
Why it matters
Hash maps are arguably the most-used data structure in software. Configuration lookups, caching layers, symbol tables in compilers, counting word frequencies, removing duplicates — all are hash map problems. The inverted index that powers keyword search is fundamentally a hash map from words to document lists.
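The word-frequency use case mentioned above, for example, is a few lines with Python's built-in `dict`, which is itself a hash map:

```python
def word_counts(text: str) -> dict[str, int]:
    counts: dict[str, int] = {}
    for word in text.lower().split():
        # Each lookup and update is an O(1) average hash map operation.
        counts[word] = counts.get(word, 0) + 1
    return counts

print(word_counts("the cat and the hat"))
# → {'the': 2, 'cat': 1, 'and': 1, 'hat': 1}
```

With n words, the whole count runs in O(n) on average; a list-based lookup would make it O(n²).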
Understanding how hash maps achieve O(1) and when they degrade (poor hash functions, high load factor, pathological collisions) is essential for writing performant code.
See How Hash Maps Work for the full walkthrough.