What is a Load Balancer
A load balancer is a device or software that distributes incoming traffic across multiple backend servers. Clients send requests to the load balancer, and the load balancer forwards each request to one of the available servers.
How it works
The load balancer uses an algorithm such as round-robin, least connections, or consistent hashing to choose which server receives each request. It also runs health checks against each server, removes servers that fail them from the pool, and adds them back once they recover.
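The selection and health-tracking behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer is implemented: the `Backend` and `Pool` classes, their field names, and the `mark` method are all hypothetical, and consistent hashing is left out to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    # Hypothetical record for one server in the pool.
    name: str
    healthy: bool = True
    active_connections: int = 0


@dataclass
class Pool:
    backends: list[Backend]
    _next: int = field(default=0, repr=False)  # round-robin counter

    def healthy_backends(self) -> list[Backend]:
        # Only servers currently marked healthy are eligible for traffic.
        return [b for b in self.backends if b.healthy]

    def pick_round_robin(self) -> Backend:
        # Cycle through the healthy servers in order.
        candidates = self.healthy_backends()
        if not candidates:
            raise RuntimeError("no healthy backends")
        chosen = candidates[self._next % len(candidates)]
        self._next += 1
        return chosen

    def pick_least_connections(self) -> Backend:
        # Choose the healthy server with the fewest open connections.
        candidates = self.healthy_backends()
        if not candidates:
            raise RuntimeError("no healthy backends")
        return min(candidates, key=lambda b: b.active_connections)

    def mark(self, name: str, healthy: bool) -> None:
        # Called with the result of a health check: a failing server
        # leaves the rotation, a recovered one rejoins it.
        for b in self.backends:
            if b.name == name:
                b.healthy = healthy


pool = Pool([Backend("a"), Backend("b"), Backend("c")])
first_four = [pool.pick_round_robin().name for _ in range(4)]
print(first_four)  # ['a', 'b', 'c', 'a']
```

In a real deployment the health flag would be set by a separate checker probing each server on a timer. Here `mark` stands in for that process so that the effect on server selection is easy to see.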
Load balancers operate at Layer 4 (transport, typically TCP: forwarding connections based on IP address and port) or Layer 7 (application, typically HTTP: routing requests based on URLs, headers, and cookies).
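The difference between the two layers comes down to what information the balancer can see when it makes its decision. The sketch below contrasts them under stated assumptions: the function names, the backend addresses, the pool names, and the `x-canary` header are invented for illustration, and hashing the connection tuple is only one of several ways a Layer 4 balancer might choose a server.

```python
import hashlib


def pick_l4(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
            backends: list[str]) -> str:
    # Layer 4: only addresses and ports are visible, so the decision
    # is made per connection. Hashing the tuple keeps every packet of
    # one connection on the same backend.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def pick_l7(path: str, headers: dict[str, str],
            routes: list[tuple[str, str]],
            default: str = "web-pool") -> str:
    # Layer 7: the HTTP request has been parsed, so the decision can
    # depend on headers and on the URL path.
    if headers.get("x-canary") == "1":
        return "canary-pool"
    for prefix, pool_name in routes:
        if path.startswith(prefix):
            return pool_name
    return default


# Hypothetical configuration.
BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080"]
ROUTES = [("/api/", "api-pool"), ("/static/", "static-pool")]

print(pick_l4("203.0.113.7", 51000, "198.51.100.1", 443, BACKENDS))
print(pick_l7("/api/users", {}, ROUTES))  # api-pool
```

A Layer 7 balancer has to terminate and parse the HTTP request before it can route it, which costs more per request than forwarding at Layer 4 but allows routing decisions based on content.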
Where it is used
Common implementations include nginx, HAProxy, Envoy, AWS ALB/NLB, and GCP Cloud Load Balancing. Most production web applications that run on more than one server sit behind a load balancer.
Why it matters
Without a load balancer, a single server is a single point of failure and a capacity bottleneck. Load balancers enable horizontal scaling (add more servers to handle more traffic) and high availability (traffic is routed away from failed servers automatically).
For algorithms, health checks, and implementation details, see How Load Balancing Works.