What is a service mesh?
A service mesh is a dedicated infrastructure layer for managing communication between microservices. It handles routing, load balancing, authentication, encryption, observability, and retries — all without requiring changes to application code.
How it works
A service mesh deploys a sidecar proxy next to each service instance. All inbound and outbound traffic flows through this proxy. The proxy handles TLS encryption (mutual TLS), retries, timeouts, circuit breaking, and metrics collection.
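The sidecar's role can be sketched with a toy proxy that wraps every outbound call in a timeout and retry policy. This is an illustrative sketch, not a real Envoy or mesh API; the class name, policy values, and backoff scheme are all assumptions made up for the example.

```python
import time

class SidecarProxy:
    """Toy stand-in for a sidecar proxy: wraps outbound calls with a
    timeout and retry policy so the service itself stays unaware.
    Names and policy values are illustrative, not from any real mesh."""

    def __init__(self, max_retries=3, timeout_s=1.0, backoff_s=0.1):
        self.max_retries = max_retries
        self.timeout_s = timeout_s
        self.backoff_s = backoff_s

    def call(self, upstream, request):
        last_err = None
        for attempt in range(self.max_retries):
            start = time.monotonic()
            try:
                response = upstream(request)  # the actual network hop
                if time.monotonic() - start > self.timeout_s:
                    raise TimeoutError("upstream exceeded timeout")
                return response
            except Exception as err:
                last_err = err
                time.sleep(self.backoff_s * (2 ** attempt))  # exponential backoff
        raise RuntimeError(f"all {self.max_retries} attempts failed") from last_err

# A flaky upstream that fails twice, then succeeds.
calls = {"n": 0}
def flaky_upstream(req):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok: " + req

proxy = SidecarProxy()
print(proxy.call(flaky_upstream, "GET /orders"))  # → ok: GET /orders
```

The point of the sketch is that `flaky_upstream` (standing in for the application) contains no retry logic at all; the policy lives entirely in the proxy wrapping it.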
A control plane (e.g., Istio, Linkerd) configures the proxies and collects telemetry. The data plane (e.g., Envoy proxies) handles the actual traffic.
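The split between the two planes can be modeled in a few lines: the control plane holds desired policy and pushes it to registered proxies, which apply it to traffic. This is a hand-rolled sketch of the pattern, not how Istio or Linkerd actually distribute configuration; every name and field here is hypothetical.

```python
# Toy control-plane / data-plane split. The control plane holds desired
# routing policy per service and pushes updates to registered sidecars.
# All names and policy fields are illustrative.

class DataPlaneProxy:
    def __init__(self, service):
        self.service = service
        self.policy = {}

    def apply(self, policy):
        self.policy = dict(policy)  # proxy applies whatever it is told

class ControlPlane:
    def __init__(self):
        self.policies = {}  # service name -> policy dict
        self.proxies = []

    def register(self, proxy):
        self.proxies.append(proxy)
        proxy.apply(self.policies.get(proxy.service, {}))

    def set_policy(self, service, policy):
        self.policies[service] = policy
        # Push the update to every sidecar fronting that service.
        for proxy in self.proxies:
            if proxy.service == service:
                proxy.apply(policy)

cp = ControlPlane()
sidecar = DataPlaneProxy("checkout")
cp.register(sidecar)
cp.set_policy("checkout", {"timeout_ms": 500, "retries": 2})
print(sidecar.policy)  # → {'timeout_ms': 500, 'retries': 2}
```

Note the direction of flow: operators talk only to the control plane, and proxies never need restarting to pick up new policy.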
When Service A calls Service B, the request goes through A's sidecar proxy, which encrypts it, adds tracing headers, and routes it to B's sidecar proxy. B's proxy decrypts the request and forwards it to B. Neither service implements any of this logic.
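That request path can be simulated end to end. In this sketch, base64 encoding stands in for the mutual-TLS wire encryption purely to mark which segment is "on the wire"; the header name, class shapes, and JSON framing are all assumptions for illustration, not a real mesh protocol.

```python
import base64
import json
import uuid

def service_b(request):
    # B receives a plain request; it never touches TLS or tracing itself.
    return {"status": 200, "body": "handled " + request["path"]}

class Sidecar:
    def __init__(self, app=None):
        self.app = app

    def outbound(self, request, peer):
        # Attach a tracing header, then "encrypt" for the wire segment.
        request.setdefault("headers", {})["x-request-id"] = str(uuid.uuid4())
        wire = base64.b64encode(json.dumps(request).encode())  # stand-in for mTLS
        return peer.inbound(wire)

    def inbound(self, wire):
        # "Decrypt" and hand the plain request to the local service.
        request = json.loads(base64.b64decode(wire))
        return self.app(request)

b_sidecar = Sidecar(app=service_b)
a_sidecar = Sidecar()  # A only makes an outbound call in this example
resp = a_sidecar.outbound({"path": "/orders/42"}, peer=b_sidecar)
print(resp["body"])  # → handled /orders/42
```

Neither `service_b` nor the caller ever sees the encoded wire form or generates the tracing header, which is exactly the property the paragraph above describes.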
Why it matters
In a microservices architecture with dozens of services, implementing retries, circuit breakers, mutual TLS, and distributed tracing in every service is duplicated effort. A service mesh centralizes these cross-cutting concerns in the infrastructure layer.
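To make one of those cross-cutting concerns concrete, here is a minimal circuit breaker of the kind a mesh provides for every service at once. It is a simplified sketch under assumed parameters: after a threshold of consecutive failures the circuit opens and calls fail fast until a reset window elapses, then a single probe is allowed through.

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker: after `threshold` consecutive failures
    the circuit opens and calls fail fast until `reset_s` has elapsed,
    at which point one probe call is allowed through. Illustrative only;
    real implementations track half-open state more carefully."""

    def __init__(self, threshold=3, reset_s=30.0):
        self.threshold = threshold
        self.reset_s = reset_s
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_s:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: let one probe through
            self.failures = 0
        try:
            result = fn(*args)
            self.failures = 0  # success closes the circuit again
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise

cb = CircuitBreaker(threshold=2)
def down():
    raise ConnectionError("unreachable")

for _ in range(2):  # two real failures trip the breaker
    try:
        cb.call(down)
    except ConnectionError:
        pass
try:
    cb.call(down)  # circuit is now open: fails fast, no call to down()
except RuntimeError as err:
    print(err)  # → circuit open: failing fast
```

Written once in the mesh layer, this logic protects every service; written per service, it is the duplicated effort the paragraph above describes.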
The tradeoff is operational complexity. Running a service mesh adds latency (traffic goes through two proxies per hop), consumes resources (every service gets a sidecar), and requires expertise to configure and debug.
Service meshes make the most sense at scale: organizations with many services, strict security requirements (mutual TLS everywhere), and an operations team able to run the mesh infrastructure.