
How Congestion Control Works — Why Networks Slow Down
Congestion control prevents senders from overwhelming the network. Without it, every TCP connection would blast data as fast as possible, routers would overflow, packets would be dropped, and the network would collapse. This nearly happened in 1986.
Van Jacobson saved the internet. In October 1986, throughput between UC Berkeley and the Lawrence Berkeley Laboratory — sites separated by about 400 yards — collapsed from 32 Kbps to 40 bps. The network was in congestion collapse: every sender was retransmitting lost packets, which caused more congestion, which caused more loss, which caused more retransmissions. Jacobson designed the algorithms that fixed it. Every TCP implementation still uses descendants of his 1988 design.
What Is the Difference Between Flow Control and Congestion Control?
These are two different problems that are easy to confuse:
Flow control protects the receiver. The receiver advertises a window size: "I can accept N more bytes." If the receiver's buffer is full, the window goes to zero and the sender pauses. This prevents a fast sender from overwhelming a slow receiver.
Congestion control protects the network. The sender estimates how much data the network can handle and limits itself accordingly. No single device tells the sender to slow down — the sender infers congestion from signals like packet loss and delay.
The amount of unacknowledged data the sender may have in flight is the minimum of the two: min(flow_control_window, congestion_window). Whichever window is smaller is the bottleneck at any given moment.
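The relationship is simple enough to state directly in code. This is a sketch, not kernel code; the byte values are hypothetical:

```python
def effective_window(rwnd: int, cwnd: int) -> int:
    """The sender may have at most min(rwnd, cwnd) unacknowledged bytes in flight.

    rwnd: receive window advertised by the peer (flow control).
    cwnd: congestion window estimated by the sender (congestion control).
    """
    return min(rwnd, cwnd)

# A slow receiver caps the sender even on an uncongested network:
print(effective_window(rwnd=8_192, cwnd=65_535))   # 8192
# A congested network caps the sender even with a roomy receiver:
print(effective_window(rwnd=65_535, cwnd=14_600))  # 14600
```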
How Does Slow Start Work?
When a new TCP connection opens, the sender doesn't know the network's capacity. Sending too much risks congestion. Sending too little wastes capacity. TCP starts conservatively and ramps up.
Slow start begins with a small congestion window (typically 10 packets, or about 14 KB). Each ACK grows the window by one packet, so the window doubles every round trip. This is exponential growth — 14 KB, 28 KB, 56 KB, 112 KB — reaching full speed within a few round trips.
When the window reaches a threshold called ssthresh (slow start threshold), TCP switches to congestion avoidance — the window grows linearly instead of exponentially. One packet per round trip instead of doubling.
This two-phase approach is how every TCP connection starts. It's why the first few hundred milliseconds of a new connection feel slower than an established one — the sender is still probing the network's capacity.
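The two-phase ramp can be modeled in a few lines. This is a toy simulation in packets, one step per round trip; the threshold and target values are made up for illustration:

```python
def ramp_up(ssthresh_pkts: int, target_pkts: int, init_cwnd: int = 10):
    """Toy model of congestion-window growth, one step per round trip.

    Slow start doubles cwnd each RTT until it reaches ssthresh; then
    congestion avoidance adds one packet per RTT. Returns the cwnd
    after each round trip until target_pkts is reached.
    """
    cwnd, history = init_cwnd, []
    while cwnd < target_pkts:
        if cwnd < ssthresh_pkts:
            cwnd = min(cwnd * 2, ssthresh_pkts)  # exponential phase
        else:
            cwnd += 1                            # linear phase
        history.append(cwnd)
    return history

print(ramp_up(ssthresh_pkts=80, target_pkts=84))
# [20, 40, 80, 81, 82, 83, 84]
```

Note how three round trips cover the exponential climb to 80 packets, while each additional packet beyond the threshold costs a full round trip.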
What Happens When a Packet Is Lost?
Packet loss is the signal that the network is congested. When TCP detects loss, it reduces the congestion window.
There are two types of loss detection:
Timeout (severe) — the sender waited too long for an ACK. TCP assumes the network is severely congested and resets the congestion window to 1 packet. It re-enters slow start. This is the most aggressive response.
Triple duplicate ACK (mild) — the receiver keeps ACKing the same sequence number, signaling a gap. TCP halves the congestion window and enters a recovery state called fast recovery. This is less aggressive because ACKs are still arriving: the network is still delivering packets, just not at full capacity.
The pattern is: grow aggressively until loss, cut back, grow again. Over time, the congestion window oscillates around the network's capacity — the sawtooth pattern.
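The grow-until-loss, cut-back cycle is the classic additive-increase/multiplicative-decrease (AIMD) loop. A toy simulation makes the sawtooth visible; the capacity here is a hypothetical fixed limit standing in for the network:

```python
def aimd_sawtooth(capacity_pkts: int, rounds: int, cwnd: int = 10):
    """Toy AIMD loop: add one packet per round trip; on exceeding the
    (hypothetical) network capacity, treat it as loss and halve the
    window. Returns the cwnd trace, which oscillates around capacity."""
    trace = []
    for _ in range(rounds):
        if cwnd > capacity_pkts:      # loss detected (e.g. duplicate ACKs)
            cwnd = max(cwnd // 2, 1)  # multiplicative decrease
        else:
            cwnd += 1                 # additive increase
        trace.append(cwnd)
    return trace

trace = aimd_sawtooth(capacity_pkts=20, rounds=40)
print(trace)  # climbs to 21, halves to 10, climbs again: the sawtooth
```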
How Does CUBIC Work?
CUBIC has been the default congestion control algorithm in Linux since 2006. It replaced TCP Reno's linear growth with a cubic function that's better suited to high-bandwidth, high-latency links (like intercontinental connections).
The key difference from Reno: after a loss event, CUBIC remembers the window size where loss occurred (W_max). Its cubic function grows slowly near W_max (where congestion previously happened) and faster away from it (where there's likely capacity to use).
This means CUBIC:
- Recovers quickly after minor congestion (fast growth toward W_max)
- Probes cautiously near the known congestion point (slow growth around W_max)
- Explores aggressively above W_max (searching for new capacity)
CUBIC's growth depends on time since the last loss, not on round-trip time. This makes it fair across connections with different latencies — a problem that plagued earlier algorithms.
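The three behaviors above all fall out of one formula: W(t) = C(t − K)³ + W_max, where K is the time at which the curve returns to W_max. A sketch with the standard constants (C = 0.4, multiplicative decrease β = 0.7, as in RFC 9438):

```python
def cubic_window(t: float, w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """CUBIC's window (in packets) as a function of time since the last loss.

    t      : seconds since the loss event
    w_max  : window size where the loss occurred
    c, beta: standard constants (scaling factor, multiplicative decrease)
    """
    # K: time at which the cubic curve climbs back to w_max.
    k = ((w_max * (1 - beta)) / c) ** (1 / 3)
    return c * (t - k) ** 3 + w_max

w_max = 100.0
print(cubic_window(0.0, w_max))  # ~70: just after loss, the window is beta * w_max
print(cubic_window(4.2, w_max))  # ~100: the plateau near w_max (K is about 4.22 s)
print(cubic_window(8.0, w_max))  # above 100: probing past the old congestion point
```

The flat middle of the cubic curve is the cautious probing near W_max; the steep tails are the fast recovery below it and the aggressive exploration above it.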
How Does BBR Work?
BBR (Bottleneck Bandwidth and Round-trip propagation time) takes a fundamentally different approach. Where Reno and CUBIC treat packet loss as the signal for congestion, BBR measures the actual bandwidth and latency of the path.
BBR maintains two estimates:
- BtlBw — bottleneck bandwidth (the maximum delivery rate observed)
- RTprop — round-trip propagation time (the minimum RTT observed)
The ideal operating point keeps BtlBw × RTprop bytes in flight (the bandwidth-delay product) while pacing transmission at BtlBw. That is the point where the pipe is full but the buffer is empty — maximum throughput with minimum latency.
BBR cycles through four phases:
- Startup — like slow start, probe for bandwidth by sending faster
- Drain — after startup, drain the queue that was created during probing
- Probe Bandwidth — periodically send a little faster to check for increased capacity
- Probe RTT — periodically reduce sending rate to measure the true minimum RTT
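The bandwidth-delay product that anchors BBR's model is a one-line computation. The path parameters below are hypothetical:

```python
def bdp_bytes(btlbw_bps: float, rtprop_s: float) -> float:
    """Bandwidth-delay product: the amount of data that fills the pipe
    without building a queue. BBR caps data in flight near this value
    while pacing transmission at the bottleneck bandwidth."""
    return btlbw_bps / 8 * rtprop_s  # bits/s -> bytes/s, times seconds

# Hypothetical path: 100 Mbit/s bottleneck, 40 ms propagation delay.
print(bdp_bytes(100e6, 0.040))  # about 500,000 bytes (~500 KB) in flight
```

Sending less than this leaves the pipe partly empty; sending more only lengthens the queue at the bottleneck, adding latency without adding throughput.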
The result: BBR achieves 2-25x higher throughput than CUBIC on lossy networks (like mobile connections). Google reported a 4% improvement in YouTube throughput and 14% reduction in rebuffering after deploying BBR.
How Do the Algorithms Compare?
| | Reno | CUBIC | BBR |
|---|---|---|---|
| Congestion signal | Packet loss | Packet loss | Measured bandwidth/RTT |
| Recovery | Halve window | Cubic function | Model-based pacing |
| Best for | Low-latency LAN | High-bandwidth WAN | Lossy networks, mobile |
| Default in | Historical | Linux since 2006 | Google servers |
| Weakness | Slow on long links | Still loss-based | Can be unfair to loss-based flows |
In practice, most internet traffic uses CUBIC. Google's infrastructure uses BBR. QUIC connections often use BBR because QUIC controls its own congestion algorithm in userspace, independent of the OS kernel's TCP stack.
Why Does This Matter for Application Developers?
You rarely implement congestion control yourself, but understanding it explains behaviors you see every day:
Why new connections are slow — slow start means the first few hundred kilobytes are sent at a fraction of the available bandwidth. Keep-alive connections avoid this by reusing an established congestion window.
Why small files feel slow on distant servers — a 50 KB file might complete before slow start reaches full speed. The connection never gets fast. CDNs solve this by putting servers close to users.
Why loss spikes kill throughput — a single loss event halves the congestion window. On a 100 Mbps link with 100ms RTT, regrowing the window one packet per round trip can take tens of seconds (CUBIC recovers faster, but the cut still hurts). This is why stable networks feel faster than nominally faster but lossy networks.
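The recovery cost is worth working out once. This back-of-envelope sketch assumes Reno-style growth of one maximum-segment-size (MSS) packet per round trip; real stacks (CUBIC, BBR) recover faster, but the shape of the problem is the same:

```python
def reno_recovery_s(link_bps: float, rtt_s: float, mss: int = 1460) -> float:
    """Rough time for Reno-style growth to regrow from half the
    bandwidth-delay product back to a full pipe, at one MSS per RTT.
    A back-of-envelope estimate, not a protocol simulation."""
    bdp_pkts = link_bps / 8 * rtt_s / mss   # packets needed to fill the pipe
    return (bdp_pkts / 2) * rtt_s           # half the pipe, one packet per RTT

# 100 Mbps link, 100 ms RTT: one loss costs tens of seconds of ramp-up.
print(round(reno_recovery_s(100e6, 0.100), 1))  # 42.8
```

The same arithmetic on a 1 ms LAN gives milliseconds of recovery, which is why loss-based algorithms were perfectly adequate before long fat pipes became common.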
Why mobile networks feel inconsistent — mobile links have variable latency and frequent loss. Loss-based algorithms (CUBIC) interpret loss as congestion and slow down, even when the link has capacity. BBR handles this better by measuring actual bandwidth.
Next Steps
Congestion control is about managing the pipe. Certificates are about proving who's on the other end:
- How Certificates Work — the trust system that makes HTTPS possible.
- How TCP Works — revisit TCP with a deeper understanding of its flow and congestion control.
- How QUIC Works — how QUIC implements congestion control in userspace with BBR.