How Processes Work — Programs in Motion

2026-03-22

A program is a file on disk: compiled instructions and data. A process is that program running: the instructions loaded into memory, a stack for function calls, a heap for dynamic data, open file descriptors, network sockets, environment variables, and a set of CPU registers. One program can have many processes (browsers such as Chrome, for example, run most tabs as separate processes of the same program).

The kernel manages every process on the system. It decides which process runs on which CPU core, when to switch between them, and how to keep them isolated from each other.

What Makes Up a Process?

A process is more than just code. The kernel tracks:

Component	What it holds
Memory	Text (code), data, heap, stack, memory-mapped files
PID	Process ID — a unique integer identifying the process
PPID	Parent process ID — who created this process
File descriptors	Open files, pipes, sockets (stdin=0, stdout=1, stderr=2)
CPU state	Program counter, registers, flags — saved/restored on context switch
Credentials	User ID (UID), group ID (GID) — determines permissions
Environment	Environment variables (PATH, HOME, etc.)
Signals	Pending and blocked signals
Exit status	Return code when the process terminates (0 = success)

All of this is stored in a kernel data structure (task_struct in Linux, which has hundreds of fields). When you run ps or top on Linux, the tool reads these values through the /proc filesystem, which the kernel fills in from those structures.
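A process can ask the kernel for several of these attributes about itself. The sketch below is a minimal illustration using Python's standard os module on a Unix system; the function name describe_current_process is made up for this example.

```python
import os


def describe_current_process():
    """Collect a few of the per-process attributes the kernel tracks."""
    return {
        "pid": os.getpid(),                  # process ID
        "ppid": os.getppid(),                # parent process ID
        "uid": os.getuid(),                  # real user ID
        "gid": os.getgid(),                  # real group ID
        "path": os.environ.get("PATH", ""),  # one environment variable
    }


if __name__ == "__main__":
    for name, value in describe_current_process().items():
        print(f"{name}: {value}")
```

The values differ on every run and every machine, which is the point: they belong to the running process, not to the program file.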

How Are Processes Created?

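On Unix systems (Linux, macOS), new processes are traditionally created with two system calls: fork() and exec() (exec is really a family of calls built on execve()).

fork() creates a near-exact copy of the current process. The child gets a copy of the parent's memory, file descriptors, and state, but its own PID. The two processes tell themselves apart by the return value: fork() returns 0 to the child and the child's PID to the parent (or -1 to the parent if no child could be created).

exec() replaces the current process's program with a new one. The PID stays the same, and open file descriptors stay open unless they are marked close-on-exec, but the code, data, stack, and heap are replaced with the new program's.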

Together:

import os

pid = os.fork()                         # create child (copy of parent)
if pid == 0:                            # in the child
    os.execv("./server", ["./server"])  # replace child's program with server
else:                                   # in the parent
    os.waitpid(pid, 0)                  # wait for child to finish

This is how your shell works. When you type ls, the shell forks a child, the child execs /bin/ls, and the shell waits for it to finish.
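The shell's loop body can be sketched in a few lines. This is a simplified illustration, not how any particular shell is implemented: the function name run_command is hypothetical, and the exit code 127 for a failed exec follows the common shell convention for "command not found".

```python
import os
import sys


def run_command(argv):
    """Run argv the way a shell does: fork, exec in the child, wait in the parent."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested program.
        try:
            os.execvp(argv[0], argv)  # searches PATH, as a shell would
        except OSError:
            os._exit(127)             # exec failed; exit without running parent code
    # Parent: block until the child exits, then decode its status.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)       # negative value: killed by a signal


if __name__ == "__main__":
    code = run_command([sys.executable, "-c", "raise SystemExit(3)"])
    print("child exited with", code)
```

Note that the child never returns from run_command: either exec succeeds and the Python interpreter is replaced, or the child exits immediately.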

The fork-then-exec pattern seems wasteful — why copy all the parent's memory just to replace it? Modern kernels use copy-on-write (COW): the child shares the parent's physical memory pages. Only when either process writes to a page does the kernel copy it. If the child immediately calls exec(), almost no copying happens.
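Copy-on-write itself is invisible to user code, but its visible consequence is easy to check: after fork(), a write in the child never shows up in the parent. The sketch below assumes a Unix system; fork_and_modify is a hypothetical name. It demonstrates the isolation that copy-on-write preserves, not the page sharing itself.

```python
import os


def fork_and_modify():
    """Show that a child's writes after fork() do not reach the parent."""
    data = ["original"]
    pid = os.fork()
    if pid == 0:
        # Child: this write lands on the child's private copy of the page.
        data[0] = "changed in child"
        os._exit(0)
    os.waitpid(pid, 0)   # make sure the child has finished writing
    return data[0]       # the parent still sees its own, unmodified value


if __name__ == "__main__":
    print(fork_and_modify())
```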

How Does the Kernel Schedule Processes?

A machine with 8 CPU cores can run 8 processes truly simultaneously. But a typical system has hundreds of processes. The kernel's scheduler decides which processes run, on which cores, and for how long.

The scheduler solves a multi-objective optimization problem:

Fairness — every process should get CPU time proportional to its priority.
Responsiveness — interactive processes (your terminal, your browser) should respond within milliseconds.
Throughput — batch processes (compilation, data processing) should use CPU efficiently.
Energy — idle cores should sleep to save power.

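For many years Linux's default scheduler was the Completely Fair Scheduler (CFS); since kernel 6.6 its successor, EEVDF, builds on the same idea. Each process accumulates "virtual runtime" as it runs, and under CFS the process with the least virtual runtime runs next. Higher-priority processes accumulate virtual runtime more slowly, so they receive a larger share of CPU time.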
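The virtual-runtime idea can be tried out with a toy model. The sketch below is not the kernel's algorithm: it uses a heap instead of the kernel's tree, fixed one-tick slices, and made-up weights, and the function name simulate is hypothetical. It only shows how "least virtual runtime runs next" turns weights into proportional CPU shares.

```python
import heapq


def simulate(weights, ticks):
    """Toy virtual-runtime scheduler.

    weights maps task name -> weight (higher weight = higher priority).
    Returns a dict of task name -> number of ticks the task ran.
    """
    # Heap of (vruntime, name): the task with the smallest vruntime runs next.
    queue = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(queue)
    ran = {name: 0 for name in weights}
    for _ in range(ticks):
        vruntime, name = heapq.heappop(queue)
        ran[name] += 1
        # One tick of real time adds 1/weight of virtual time, so heavier
        # tasks age more slowly and are picked more often.
        heapq.heappush(queue, (vruntime + 1.0 / weights[name], name))
    return ran


if __name__ == "__main__":
    print(simulate({"hi": 2, "lo": 1}, 300))  # {'hi': 200, 'lo': 100}
```

With weights 2 and 1, the heavier task ends up with twice the ticks, which is the proportional fairness described above.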

A typical time slice is 1-10 milliseconds. After each slice, the scheduler checks if another process should run. This switching — saving one process's CPU state and loading another's — is called a context switch.

What Is a Context Switch?

When the scheduler switches from Process A to Process B:

Save A's state: CPU registers, program counter, and stack pointer are stored in A's task_struct.
Switch page tables: the MMU now translates addresses using B's virtual memory mappings (this step is skipped when A and B share an address space, as threads of one process do).
Restore B's state: load B's registers, program counter, and stack pointer from B's task_struct.
Resume B: the CPU continues executing B's code as if it had never been interrupted.

A context switch typically costs on the order of 1-10 microseconds of direct work. That sounds fast, but at thousands of switches per second, it adds up. More importantly, switching disturbs the CPU's caches and TLB: the new process's data is usually not in L1/L2 cache, so the first memory accesses after a switch are slow (cache misses). This indirect cost is often larger than the switch itself.
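To see how the direct cost adds up, it helps to do the arithmetic. The numbers below (10,000 switches per second at 5 microseconds each) are illustrative assumptions, not measurements, and the function name is hypothetical. Cache effects are deliberately left out.

```python
def switch_overhead_fraction(switches_per_second, cost_microseconds):
    """Fraction of one core's time spent on direct context-switch cost."""
    return switches_per_second * cost_microseconds / 1_000_000


if __name__ == "__main__":
    # 10,000 switches/s at 5 microseconds each is 50 ms of every second.
    print(switch_overhead_fraction(10_000, 5))  # 0.05
```

Under these assumptions a core loses 5% of its time to switching before any cache misses are counted.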

How Do Processes Communicate?

Processes are isolated by default — one process can't read another's memory. But they need to communicate. Common mechanisms:

Pipes — a byte stream from one process to another. ls | grep "txt" connects ls's stdout to grep's stdin through a pipe. Unidirectional, in-memory, fast.
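A pipe between a parent and a forked child can be sketched as follows. This is a minimal example for small messages on a Unix system; child_to_parent is a hypothetical name.

```python
import os


def child_to_parent(message: bytes) -> bytes:
    """Send bytes from a forked child to the parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)            # the child only writes
        os.write(write_fd, message)
        os._exit(0)                  # exiting closes the child's write end
    os.close(write_fd)               # the parent only reads; required to see EOF
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:                # empty read: every writer has closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)               # reap the child
    return b"".join(chunks)


if __name__ == "__main__":
    print(child_to_parent(b"hello from the child"))
```

Each side closes the end it does not use. If the parent kept its copy of the write end open, its read loop would never see end-of-file.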

Signals — asynchronous notifications. SIGTERM asks a process to terminate gracefully. SIGKILL forces termination (can't be caught). SIGINT is what Ctrl+C sends. SIGCHLD tells a parent its child exited.
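Catching a signal means registering a handler before the signal arrives. The sketch below installs a SIGTERM handler and then signals its own process, so it is safe to run; catch_sigterm is a hypothetical name, and the short polling loop is just one simple way to wait for the Python-level handler to run.

```python
import os
import signal
import time


def catch_sigterm(timeout=1.0):
    """Install a SIGTERM handler, signal ourselves, and report what arrived."""
    received = []

    def handler(signum, frame):
        received.append(signal.Signals(signum).name)

    previous = signal.signal(signal.SIGTERM, handler)
    try:
        os.kill(os.getpid(), signal.SIGTERM)  # what the kill command does
        deadline = time.monotonic() + timeout
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)                  # let the interpreter run the handler
    finally:
        # Restore whatever disposition was in place before.
        restore = previous if previous is not None else signal.SIG_DFL
        signal.signal(signal.SIGTERM, restore)
    return received


if __name__ == "__main__":
    print(catch_sigterm())
```

Without the handler, the same os.kill call would terminate the process, because termination is SIGTERM's default action.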

Shared memory — two processes map the same physical memory into their address spaces. The fastest IPC mechanism — no copying, just direct memory access. Requires synchronization (mutexes, semaphores) to avoid race conditions.
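One way to try shared memory is an anonymous mapping created before fork(). The sketch below relies on Python's mmap defaulting to a shared mapping on Unix; the function name and the value 42 are arbitrary. Waiting for the child to exit stands in for real synchronization, which concurrent writers would need.

```python
import mmap
import os
import struct


def shared_counter_demo():
    """Child writes an integer into a shared mapping; parent reads it back."""
    # Anonymous mapping (fileno -1). On Unix the default flag is MAP_SHARED,
    # so after fork() both processes see the same physical pages.
    region = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        struct.pack_into("i", region, 0, 42)  # child writes directly to memory
        os._exit(0)
    os.waitpid(pid, 0)   # the only synchronization in this sketch
    (value,) = struct.unpack_from("i", region, 0)
    region.close()
    return value


if __name__ == "__main__":
    print(shared_counter_demo())
```

Compare this with ordinary variables after fork(): a write to private memory stays in the child, while a write to the shared mapping is visible to the parent.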

Sockets: the socket API works between processes on the same machine too, either over the loopback address (127.0.0.1) or through Unix domain sockets, which skip the network stack. More overhead than shared memory but well-understood and language-agnostic.
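Unlike a pipe, a socket carries data in both directions. The sketch below uses socket.socketpair(), which on Unix returns two connected Unix domain sockets, and a forked child that upper-cases whatever it receives; echo_upper is a hypothetical name and the example assumes short messages.

```python
import os
import socket


def echo_upper(message: bytes) -> bytes:
    """Send a message to a forked child over a socket and read its reply."""
    parent_sock, child_sock = socket.socketpair()  # connected pair
    pid = os.fork()
    if pid == 0:
        parent_sock.close()
        data = child_sock.recv(4096)
        child_sock.sendall(data.upper())  # reply on the same socket
        child_sock.close()
        os._exit(0)
    child_sock.close()
    parent_sock.sendall(message)
    reply = parent_sock.recv(4096)
    parent_sock.close()
    os.waitpid(pid, 0)                    # reap the child
    return reply


if __name__ == "__main__":
    print(echo_upper(b"ping"))
```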

Files — the simplest mechanism. One process writes, another reads. Works across reboots. Slow for high-frequency communication.

What Happens When a Process Exits?

When a process terminates (by returning from main, calling exit(), or receiving a fatal signal):

Open file descriptors are closed: files, sockets, pipes.
Memory is released: the kernel reclaims the process's private pages.
Children are reparented: orphaned children are adopted by PID 1 (init/systemd) or, on Linux, by a closer ancestor registered as a subreaper.
Exit status is stored: the process becomes a zombie. It still occupies an entry in the process table (so the parent can read the exit status with wait()) but uses no CPU and almost no memory.
Parent is notified: the kernel sends SIGCHLD to the parent.
Parent calls wait(): this retrieves the exit status and removes the zombie.

If the parent never calls wait(), the zombie stays in the process table for as long as the parent lives. A zombie uses no CPU and almost no memory, just a process table entry, but each one holds a PID, so enough of them can exhaust the PID space and prevent new processes from being created. This is why long-running processes that spawn children (servers, background services) must reap them, either by calling wait() or by handling SIGCHLD.
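The exit-and-reap sequence can be reproduced directly. In the sketch below (reap_child is a hypothetical name) the child exits immediately with a chosen code; it remains a zombie until the parent's waitpid() call collects that code.

```python
import os


def reap_child(exit_code):
    """Fork a child that exits at once, then reap it and return its exit code."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child terminates; the kernel keeps its exit status
    # Between the child's exit and this call, the child is a zombie.
    reaped_pid, status = os.waitpid(pid, 0)
    assert reaped_pid == pid
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(reap_child(7))
```

A second waitpid() on the same PID would raise ChildProcessError, because the process table entry is gone once the status has been collected.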

What Is PID 1?

PID 1 is the first user-space process the kernel starts. On most modern Linux distributions, it's systemd. On macOS, it's launchd. In a container, it's whatever command the image's ENTRYPOINT or CMD starts.

PID 1 has special responsibilities:

Adopts orphans: when a process's parent exits, PID 1 becomes the new parent.
Reaps zombies: PID 1 must call wait() on adopted children, or zombies accumulate.
Receives signals differently: the kernel does not apply default signal actions to PID 1, so SIGTERM and SIGINT are ignored unless the process has installed handlers for them. This is why Ctrl+C sometimes doesn't stop a Docker container: the application inside is PID 1, has no handler, and the signal is dropped.

Understanding PID 1 behavior matters for containers, where the distinction between a proper init system and a raw application as PID 1 affects signal handling, zombie reaping, and graceful shutdown.
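The responsibilities above suggest what a minimal container init has to do: start the real application as a child, forward termination signals to it, and reap whatever exits. The sketch below is an illustration of that idea under simplifying assumptions, not a replacement for a real init; run_as_init is a hypothetical name, and the 127 and 128+signal exit codes follow shell conventions.

```python
import os
import signal
import sys


def run_as_init(argv):
    """Start argv as a child, forward SIGTERM/SIGINT to it, and reap children."""
    child = os.fork()
    if child == 0:
        try:
            os.execvp(argv[0], argv)
        except OSError:
            os._exit(127)             # exec failed

    def forward(signum, frame):
        try:
            os.kill(child, signum)    # pass the signal on to the application
        except ProcessLookupError:
            pass                      # the child has already exited

    watched = (signal.SIGTERM, signal.SIGINT)
    old = {s: signal.signal(s, forward) for s in watched}
    try:
        while True:
            pid, status = os.wait()   # as PID 1 this also reaps adopted orphans
            if pid == child:
                if os.WIFEXITED(status):
                    return os.WEXITSTATUS(status)
                return 128 + os.WTERMSIG(status)
    finally:
        for s, handler in old.items():
            signal.signal(s, handler if handler is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(run_as_init([sys.executable, "-c", "raise SystemExit(5)"]))
```

Because the wrapper installs handlers, SIGTERM is no longer dropped when it runs as PID 1, and the wait loop keeps adopted orphans from piling up as zombies.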

Next Steps

Processes run in user space. The kernel is the layer beneath them that makes it all work:

How the Kernel Works — the boundary between your code and the hardware.
How Threads Work — lightweight execution within a process.
How Memory Works — revisit memory with a deeper understanding of process isolation.

Prerequisites

How Memory Works

How the Kernel Works
