Data Structures FAQ

Common questions about arrays, hash maps, trees, and graphs. Each answer is short. Links go to the full explanation.

When should I use an array vs a linked list?

Arrays for random access and sequential traversal — they're cache-friendly because elements are contiguous in memory. Linked lists for frequent insertion and deletion at known positions.

In practice, dynamic arrays (Vec in Rust, ArrayList in Java, list in Python) are the default. They combine array cache performance with dynamic sizing.

See How Arrays and Linked Lists Work for the full comparison with complexity analysis.

Why is hash map lookup O(1)?

A hash function converts the key to an array index in constant time. One computation, one array access. Collisions degrade to O(n) in the worst case, but with a good hash function and reasonable load factor, most buckets have one entry.

See How Hash Maps Work for the full explanation of hashing, collision resolution, and load factor.

What is the difference between a hash map and a tree?

Hash maps give O(1) exact-match lookup but can't do range queries or sorted iteration. Trees give O(log n) lookup but support ranges (WHERE age BETWEEN 18 AND 30) and ordered traversal.

Databases support both: hash indexes for exact match, B-tree indexes for range queries.

See How Trees Work for the full comparison table.

What is a B-tree and why do databases use it?

A B-tree stores hundreds of keys per node instead of one. Each node fills a disk block (4 KB), so one disk read gives hundreds of keys. With branching factor 100 and 100 million keys, the tree has height ~4. Four disk reads to find any key.

PostgreSQL, MySQL, SQLite, and MongoDB all use B-tree variants (B+ trees) for their default indexes. File systems like ext4 and Btrfs use them too.

See How Trees Work for the full B-tree and B+ tree explanation.

What is the difference between a stack and a queue?

A stack is LIFO (last in, first out) — push and pop from the same end. The function call stack, undo history, and DFS all use stacks.

A queue is FIFO (first in, first out) — enqueue at back, dequeue from front. Job queues, message queues, and BFS all use queues.

Both are O(1) for their primary operations.

See How Arrays and Linked Lists Work for how stacks and queues are built from arrays and linked lists.

What is a graph and when do I need one?

A graph is nodes and edges representing arbitrary relationships. Use it when your data has connections that aren't hierarchical: social networks, dependencies, routing, state machines.

Trees are a special case of graphs (connected, acyclic, undirected). The internet is a graph. Git commit history is a directed acyclic graph. HNSW vector indexes are layered graphs.

See How Graphs Work for graph representation, traversal, and algorithms.