
How Recursion Works — Solving Problems by Breaking Them Down
A function that calls itself. That's recursion. It sounds circular, but it works because each call makes the problem smaller, and at some point the problem is small enough to solve directly. That's the base case — the exit condition that stops the recursion.
Every recursive solution has two parts:
- Base case — the smallest problem that can be answered directly.
- Recursive case — break the problem into smaller pieces and call yourself.
How Does Recursion Work?
Factorial: n! = n × (n-1) × (n-2) × ... × 1
function factorial(n):
    if n <= 1: return 1        // base case
    return n * factorial(n-1)  // recursive case
factorial(4) unfolds as:
factorial(4) = 4 * factorial(3)
             = 4 * 3 * factorial(2)
             = 4 * 3 * 2 * factorial(1)
             = 4 * 3 * 2 * 1
             = 24
Each call pushes a new frame onto the call stack. When the base case returns, frames pop off and results flow back up. This is why deep recursion can cause stack overflow — each frame uses stack memory, and the stack is limited (typically 1-8 MB).
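The factorial pseudocode above translates directly to Python, and Python's recursion limit makes the stack-memory point concrete. A minimal sketch (the depth 100_000 is just an arbitrarily large value chosen to exceed the limit):

```python
import sys

def factorial(n):
    if n <= 1:                       # base case
        return 1
    return n * factorial(n - 1)      # recursive case: one new stack frame per call

print(factorial(4))  # 24

# Python caps recursion depth (default around 1000) to protect the stack;
# exceeding it raises RecursionError instead of crashing the process.
print(sys.getrecursionlimit())
try:
    factorial(100_000)
except RecursionError:
    print("recursion depth exceeded")
```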
Recursion vs Iteration
Any recursive algorithm can be rewritten iteratively (using a loop and an explicit stack). Any iterative algorithm can be rewritten recursively. They're equivalent in computational power.
// Iterative factorial
function factorial(n):
    result = 1
    for i from 2 to n:
        result = result * i
    return result
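The "loop and an explicit stack" conversion mentioned above can be sketched in Python. For factorial the stack simply holds the multiplications each recursive call would have deferred (`factorial_stack` is a hypothetical name):

```python
def factorial_stack(n):
    # Replace the call stack with an explicit one: push the work
    # each recursive call would defer, then unwind it.
    stack = []
    while n > 1:
        stack.append(n)  # what the recursive case would multiply by
        n -= 1
    result = 1           # base case
    while stack:
        result *= stack.pop()  # "frames" return in reverse order
    return result

print(factorial_stack(4))  # 24
```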
When to use which:
Recursion is natural for: tree traversal, graph traversal (DFS), divide and conquer (merge sort, quicksort, binary search), parsing nested structures (JSON, HTML, mathematical expressions).
Iteration is better for: simple loops, performance-critical code (avoids function call overhead), situations where recursion depth could be very large.
In practice, most developers use iteration for simple problems and recursion when the problem has a naturally recursive structure — like trees, which are defined recursively (a tree is a node with child trees).
What Is Divide and Conquer?
Divide and conquer is recursion applied to algorithm design:
- Divide the problem into smaller subproblems.
- Conquer each subproblem recursively.
- Combine the results.
Merge sort is the canonical example: divide the array in half, sort each half, merge. Binary search divides the search space in half each step, discarding the half that can't contain the target. HNSW vector search narrows a layered graph from coarse to fine layers — a related shrink-the-search-space idea, though its greedy navigation isn't divide and conquer in the strict sense.
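Merge sort's three steps — divide, conquer, combine — map one-to-one onto code. A Python sketch, assuming lists of comparable items:

```python
def merge_sort(xs):
    if len(xs) <= 1:                 # base case: already sorted
        return xs
    mid = len(xs) // 2               # divide
    left = merge_sort(xs[:mid])      # conquer each half recursively
    right = merge_sort(xs[mid:])
    return merge(left, right)        # combine

def merge(left, right):
    # Merge two sorted lists in O(n) by repeatedly taking the smaller head.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```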
The complexity of divide and conquer depends on:
- How many subproblems (a)
- How much each subproblem shrinks (b)
- How much work the combine step takes
The Master Theorem turns these parameters into a complexity for recurrences of the form T(n) = a·T(n/b) + f(n). Split into 2 subproblems of half the size with O(n) merge (a = 2, b = 2, f(n) = O(n)) and you get O(n log n) — merge sort.
When Does Recursion Go Wrong?
Fibonacci — the classic trap:
function fib(n):
    if n <= 1: return n
    return fib(n-1) + fib(n-2)
This works but is O(2^n). fib(50) makes over 40 billion calls. The problem: the call to fib(n-1) itself computes fib(n-2) along the way, and then the second branch computes fib(n-2) all over again. Massive redundant work.
The call tree for fib(5):
fib(5)
├── fib(4)
│   ├── fib(3)
│   │   ├── fib(2)  ← computed here
│   │   └── fib(1)
│   └── fib(2)      ← and here again
└── fib(3)          ← and this whole subtree again
    ├── fib(2)      ← and here
    └── fib(1)
The fix: memoization — cache results so each subproblem is computed once.
What Is Memoization?
Store the result of each function call and return the cached result if called again with the same arguments:
cache = {}
function fib(n):
    if n in cache: return cache[n]
    if n <= 1: return n
    result = fib(n-1) + fib(n-2)
    cache[n] = result
    return result
Now fib(50) computes each value once: O(n) time, O(n) space. From tens of billions of operations to about 50.
Memoization works when:
- The function is pure — same inputs always give the same output.
- Subproblems overlap — the same subproblem appears multiple times.
- The number of distinct subproblems is manageable.
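In Python this pattern is built in: `functools.lru_cache` memoizes any pure function without hand-rolling the cache. A sketch of the same fib:

```python
from functools import lru_cache

@lru_cache(maxsize=None)   # cache the result for every distinct argument
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025 — each subproblem computed exactly once
```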
What Is Dynamic Programming?
Dynamic programming (DP) is memoization done bottom-up instead of top-down. Instead of starting from the big problem and recursing down, start from the smallest subproblems and build up:
function fib(n):
    dp = [0, 1]
    for i from 2 to n:
        dp[i] = dp[i-1] + dp[i-2]
    return dp[n]
Same O(n) complexity, but no recursion (no stack overflow risk) and the computation order is explicit. The cache becomes an array you fill in order.
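Since each step only reads the previous two entries, the array can shrink to two variables — a common refinement of bottom-up DP. A Python sketch:

```python
def fib(n):
    # Bottom-up with only the last two values: O(n) time, O(1) space.
    a, b = 0, 1          # fib(0), fib(1)
    for _ in range(n):
        a, b = b, a + b  # slide the window forward one step
    return a

print(fib(50))  # 12586269025
```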
DP is used when a problem has:
- Optimal substructure — the optimal solution uses optimal solutions to subproblems.
- Overlapping subproblems — the same subproblems appear repeatedly.
Classic DP problems: shortest path (Bellman-Ford; Dijkstra's is usually classified as greedy rather than DP), edit distance (spell checking), knapsack (resource allocation), longest common subsequence (diff tools like git diff).
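Edit distance shows both DP properties at once: the distance between two prefixes depends only on shorter prefixes (optimal substructure), and those prefix pairs recur constantly (overlapping subproblems). A minimal bottom-up sketch of the classic table:

```python
def edit_distance(a, b):
    # dp[i][j] = minimum edits to turn a[:i] into b[:j]
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i                 # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j                 # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # delete
                           dp[i][j - 1] + 1,         # insert
                           dp[i - 1][j - 1] + cost)  # substitute (or match)
    return dp[m][n]

print(edit_distance("kitten", "sitting"))  # 3
```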
Where Does Recursion Appear?
Tree traversal — visiting every node in a tree is naturally recursive. In-order traversal of a BST gives sorted output. Pre-order traversal serializes the tree. Post-order processes children before parents (useful for deletion).
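A minimal sketch of in-order traversal, assuming a simple hypothetical `Node` class:

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def inorder(node):
    # In-order: left subtree, then node, then right subtree.
    # On a BST this yields the values in sorted order.
    if node is None:                 # base case: empty subtree
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(inorder(root))  # [1, 2, 3, 4, 5, 6, 7]
```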
Graph DFS — depth-first search is recursion on a graph. Visit a node, then recursively visit all unvisited neighbors.
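Recursive DFS over an adjacency-list graph is only a few lines. A sketch, assuming `graph` maps each node to a list of neighbors:

```python
def dfs(graph, node, visited=None):
    # Visit the node, then recurse into each unvisited neighbor.
    if visited is None:
        visited = set()
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
    return visited

graph = {"a": ["b", "c"], "b": ["d"], "c": [], "d": ["a"]}
dfs(graph, "a")  # reaches all four nodes; the cycle d -> a is handled by visited
```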
Parsing — JSON, HTML, YAML, and programming languages are all defined by recursive grammars. A JSON value is either a string, number, boolean, null, array (of JSON values), or object (of key-JSON-value pairs). Parsers mirror this recursive definition.
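Code that consumes parsed JSON often mirrors the same recursive shape as the grammar. A sketch that measures nesting depth (`depth` is a hypothetical helper, one recursive case per container type):

```python
import json

def depth(value):
    # Recursively measure nesting depth of a parsed JSON value.
    if isinstance(value, dict):      # object: 1 + deepest member value
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):      # array: 1 + deepest element
        return 1 + max((depth(v) for v in value), default=0)
    return 0                         # base case: string/number/bool/null

print(depth(json.loads('{"a": [1, [2, [3]]]}')))  # 4
```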
File system operations — rm -r recursively deletes a directory and its contents. find recursively walks a directory tree. Git recursively diffs directory trees.
Merge sort — divide the array, sort each half recursively, merge.
The Fibonacci example is everywhere in disguise — any problem where the next state depends on the previous two states (climbing stairs, tiling, counting paths) has the same structure.
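For instance, climbing stairs — "how many ways to climb n steps, taking 1 or 2 at a time" — is the Fibonacci recurrence in disguise, since every route ends with a 1-step or a 2-step. A bottom-up sketch:

```python
def climb(n):
    # ways(n) = ways(n-1) + ways(n-2): the Fibonacci recurrence
    # with different starting values.
    a, b = 1, 1          # ways(0) = 1 (do nothing), ways(1) = 1
    for _ in range(n):
        a, b = b, a + b
    return a

print(climb(4))  # 5: 1111, 112, 121, 211, 22
```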
Next Steps
- How Graph Algorithms Work — BFS, DFS, Dijkstra — algorithms that combine recursion with graph traversal.
- How Memory Works — the call stack that makes recursion possible (and limits it).