What is a Load Balancer

A load balancer is a hardware device or software component that distributes incoming traffic across multiple backend servers. Clients send requests to the load balancer, and the load balancer forwards each request to one of the available servers.

How it works

The load balancer uses an algorithm such as round-robin, least connections, or consistent hashing to choose which server receives each request. It also runs health checks against each server, removes servers that fail those checks from the pool, and adds them back once they pass again.
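
As a rough illustration of the selection and health-check behavior described above, the following Python sketch implements round-robin and least-connections selection over a pool that skips unhealthy servers. The `Backend` and `Pool` names, their fields, and the error raised when no server is healthy are hypothetical choices for this example, not the API of any particular load balancer.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical record for one server in the pool."""
    name: str
    healthy: bool = True          # would be updated by a health checker
    active_connections: int = 0   # would be updated as connections open and close


class Pool:
    """Hypothetical pool that only hands out healthy backends."""

    def __init__(self, backends):
        self.backends = list(backends)
        self._next = 0  # round-robin cursor into self.backends

    def round_robin(self):
        # Walk the list at most once, skipping servers marked unhealthy.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend.healthy:
                return backend
        raise RuntimeError("no healthy backends")

    def least_connections(self):
        # Pick the healthy server with the fewest open connections.
        candidates = [b for b in self.backends if b.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends")
        return min(candidates, key=lambda b: b.active_connections)


if __name__ == "__main__":
    pool = Pool([Backend("a"), Backend("b"), Backend("c")])
    pool.backends[1].healthy = False   # simulate a failed health check on "b"
    print([pool.round_robin().name for _ in range(4)])
```

Setting a backend's `healthy` flag to false takes it out of rotation on the next call, and setting it back to true returns it to the pool. A real load balancer would drive that flag from periodic probes; that part is left out of this sketch.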

Load balancers operate at Layer 4 (transport: forwarding TCP or UDP connections based on IP address and port) or Layer 7 (application: routing HTTP requests based on URLs, headers, and cookies).
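
The difference between the two layers comes down to what information the balancer can see when it decides. The Python sketch below is one possible way to picture it: `l4_pick` sees only the client address and port, while `l7_pick` inspects the request path. Both function names, the route table format, and the pool names are assumptions made for illustration. The simple modulo hash used in `l4_pick` is not consistent hashing, and real Layer 4 balancers may use other selection methods.

```python
import hashlib


def l4_pick(src_ip, src_port, backends):
    """Layer 4 style: choose using only connection-level data (IP and port)."""
    key = f"{src_ip}:{src_port}".encode()
    digest = hashlib.sha256(key).digest()
    # Plain modulo hashing: the same client address and port map to the
    # same backend for as long as the backend list does not change.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


def l7_pick(path, routes, default_pool):
    """Layer 7 style: choose a pool by inspecting the HTTP request path."""
    for prefix, pool in routes:
        if path.startswith(prefix):
            return pool
    return default_pool


if __name__ == "__main__":
    # Hypothetical addresses and pool names.
    backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
    print(l4_pick("203.0.113.7", 51234, backends))

    routes = [("/api/", "api-pool"), ("/static/", "static-pool")]
    print(l7_pick("/api/users", routes, "web-pool"))
```

Matching on headers or cookies would follow the same pattern as the path check in `l7_pick`, using a different field of the request.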

Where it is used

nginx, HAProxy, AWS ALB/NLB, GCP Cloud Load Balancing, and Envoy are common implementations. Nearly every production web application sits behind a load balancer.

Why it matters

Without a load balancer, a single server is a single point of failure and a capacity bottleneck. Load balancers enable horizontal scaling (add more servers to handle more traffic) and high availability (traffic is routed away from failed servers automatically).

For algorithms, health checks, and implementation details, see How Load Balancing Works.

This concept appears in

How Load Balancing Works — Distributing Traffic Across Servers
