
How Microservices Work — From Monoliths to Independent Services
Every system starts as one thing. A single codebase, a single deployment, a single database. That is a monolith, and it is a perfectly good starting point. The question is what happens when the system grows.
What Is a Monolith?
A monolith is a single deployable unit. All the code — user authentication, order processing, inventory management, email sending — lives in one codebase, compiles into one binary (or one deployment artifact), and runs as one process. The database is shared. A change to any part requires deploying the entire system.
Monoliths are simple. One repository to clone, one build to run, one thing to deploy. Function calls between components are fast — no network, no serialization. Transactions span the entire database because there is only one.
The problems emerge at scale. When 40 engineers work in the same codebase, every merge is a coordination exercise. When one team needs to deploy a fix, they wait for everyone else's changes to be ready. When one component needs more capacity, you scale the entire monolith because you cannot scale a single function.
What Are Microservices?
Microservices decompose the system into small, independent services. Each service:
- Owns its data. The order service has its own database. The user service has its own database. No shared tables.
- Deploys independently. Shipping a fix to the order service does not require deploying the user service.
- Communicates over the network. Services call each other via REST, gRPC, or asynchronous messaging.
- Has a focused responsibility. One service does one thing well.
The word "micro" is misleading. The size of a service is not the point. What matters is independence — each service can be developed, deployed, and scaled without coordinating with the others.
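The ownership rule above can be sketched in a few lines. This is an illustrative in-process model, not a real deployment: `UserService` and `OrderService` are hypothetical names, and each keeps a private dict standing in for its own database.

```python
class UserService:
    def __init__(self):
        self._db = {}  # private store; no other service reads it directly

    def create_user(self, user_id, email):
        self._db[user_id] = {"email": email}

    def get_email(self, user_id):
        return self._db[user_id]["email"]


class OrderService:
    def __init__(self, users):
        self._db = {}        # its own store, separate from the user data
        self._users = users  # depends only on the user service's public API

    def place_order(self, order_id, user_id, total_cents):
        # Stores only the user ID, never a copy of the user's profile
        self._db[order_id] = {"user_id": user_id, "total_cents": total_cents}

    def receipt_email(self, order_id):
        # Asks the owning service instead of duplicating the data
        order = self._db[order_id]
        return self._users.get_email(order["user_id"])


users = UserService()
users.create_user("u1", "ada@example.com")
orders = OrderService(users)
orders.place_order("o1", "u1", 2500)
```

The only coupling between the two is the `get_email` call; if the user service changed its storage format, the order service would not notice.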
When to Split
Splitting too early creates distributed system complexity before you need it. Splitting too late creates a monolith that is painful to untangle. Here are signals that a split is overdue:
Team size. When two teams change the same code frequently and step on each other, they need separate codebases. Conway's Law is real — system architecture follows team structure.
Deploy frequency. When one team wants to deploy five times a day and another deploys weekly, coupling their deployments slows the fast team.
Scaling needs. When the search component needs 20 instances but the admin panel needs one, scaling them together wastes resources.
Failure isolation. When a bug in the reporting module crashes the entire system including the checkout flow, separation contains the blast radius.
How to Draw Boundaries
The hardest part of microservices is deciding where one service ends and another begins. The principle comes from Domain-Driven Design: bounded contexts.
A bounded context is a boundary within which a term has a specific meaning and a model is internally consistent. In an e-commerce system:
- Orders — knows about line items, totals, and fulfillment status.
- Inventory — knows about stock levels, warehouses, and reorder points.
- Users — knows about profiles, authentication, and preferences.
Each context has its own model of the world. The order service knows the user ID but not the user's email — it asks the user service when it needs it. This prevents tight coupling. If the user service changes how it stores emails, the order service is unaffected.
Bad boundaries create services that cannot function without calling five other services on every request. Good boundaries minimize cross-service calls for the most common operations.
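The "own model of the world" idea can be made concrete with two small data models, one per context. Field names here are illustrative: the orders context holds a `user_id` as a reference, while the email address lives only in the users context.

```python
from dataclasses import dataclass, field


@dataclass
class LineItem:  # Orders context
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass
class Order:  # Orders context
    order_id: str
    user_id: str  # a reference into the users context, not a copy
    items: list = field(default_factory=list)

    def total_cents(self):
        return sum(i.quantity * i.unit_price_cents for i in self.items)


@dataclass
class UserProfile:  # Users context: owns email and preferences
    user_id: str
    email: str
    marketing_opt_in: bool = False


order = Order("o1", "u1", [LineItem("widget", 2, 500), LineItem("gizmo", 1, 300)])
```

Because `Order` never carries an email field, a change to how `UserProfile` stores contact details cannot ripple into the orders context.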
How Services Communicate
Microservices communicate in two ways:
Synchronous — one service calls another and waits for the response. REST and gRPC are the standard choices. Simple, easy to reason about, but creates runtime dependencies. If the user service is down, the order service cannot look up user details.
Asynchronous — one service publishes an event or message, and other services consume it later. The order service emits an "order placed" event, and the inventory service processes it when ready. Services are decoupled in time. See How Event-Driven Architecture Works.
Most systems use both. Synchronous for queries that need immediate answers. Asynchronous for operations that can tolerate delay.
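Both styles can be seen side by side with a toy in-process event bus. In production the synchronous path would be an HTTP or gRPC call and the asynchronous path a real message broker; the class and event names here are hypothetical.

```python
class EventBus:
    """Toy stand-in for a message broker."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_type, handler):
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event_type, payload):
        # Fire-and-forget: the publisher never waits on consumers' results
        for handler in self._subscribers.get(event_type, []):
            handler(payload)


class InventoryService:
    def __init__(self, bus):
        self.stock = {"widget": 10}
        bus.subscribe("order_placed", self.on_order_placed)

    def on_order_placed(self, event):
        # Asynchronous consumer: reacts whenever the event arrives
        self.stock[event["sku"]] -= event["quantity"]


class OrderService:
    def __init__(self, bus):
        self._bus = bus

    def place_order(self, sku, quantity):
        # Synchronous work (validation, persistence) would happen here,
        # then an event is emitted for anything that tolerates delay.
        self._bus.publish("order_placed", {"sku": sku, "quantity": quantity})


bus = EventBus()
inventory = InventoryService(bus)
orders = OrderService(bus)
orders.place_order("widget", 3)
```

The order service never calls the inventory service directly, so the inventory service could be down at publish time in a real broker setup and simply catch up later.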
The Tradeoffs
Microservices solve organizational scaling problems, but they create technical ones:
Distributed transactions. A monolith wraps an operation in a database transaction. Across services, there is no single database to provide ACID. You need sagas or eventual consistency — both are harder to implement and debug.
Network latency. A function call in a monolith takes nanoseconds; a network call between services takes milliseconds. An operation that calls five services sequentially, at even 2 ms per hop, pays 10 ms in transit before any real work is done.
Operational complexity. Instead of deploying one thing, you deploy dozens. Each needs monitoring, logging, alerting, and health checks. You need service discovery, load balancing, and potentially a service mesh.
Data consistency. Each service owns its data. Getting a consistent view across services requires careful design. The order service might show an order as "placed" while the inventory service hasn't decremented stock yet.
Debugging. A request flows through multiple services. Distributed tracing (OpenTelemetry, Jaeger) becomes necessary to follow a single operation across the system.
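The saga approach mentioned above can be sketched as a list of steps, each paired with a compensating action that undoes it: if a later step fails, completed steps are rolled back in reverse order. This is a minimal in-memory illustration of the pattern; a real saga must persist its progress so it survives process crashes.

```python
def run_saga(steps):
    """steps: list of (action, compensation) pairs. Returns True on success."""
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        # Roll back everything that succeeded, newest first (best effort)
        for compensation in reversed(completed):
            compensation()
        return False
    return True


# Illustrative two-step saga: charge a payment, then reserve stock.
state = {"charged": 0, "reserved": 0}

def charge():
    state["charged"] += 1

def refund():
    state["charged"] -= 1

def reserve():
    raise RuntimeError("out of stock")  # simulated failure in step two

def release():
    state["reserved"] -= 1

ok = run_saga([(charge, refund), (reserve, release)])
```

Here the reservation fails, so the earlier charge is compensated with a refund; the system ends consistent, but only eventually, which is exactly the tradeoff described above.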
The Right Starting Point
Start with a well-structured monolith. Use clean module boundaries internally — the same bounded contexts you would use for microservices. When a specific boundary needs independent deployment, scaling, or a different technology, extract that module into a service.
This approach is sometimes called the "modular monolith" or "monolith-first." You get the simplicity of a single deployment with the option to split later. Most teams that start with microservices on day one regret it.
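One way to keep the split option open is to make internal modules talk through interfaces, so extracting a module into a service means swapping an implementation without touching call sites. A sketch under that assumption, with hypothetical names:

```python
from abc import ABC, abstractmethod


class InventoryAPI(ABC):
    """The boundary callers depend on, whether local or remote."""

    @abstractmethod
    def reserve(self, sku: str, quantity: int) -> bool: ...


class LocalInventory(InventoryAPI):
    """In-process module inside the modular monolith."""

    def __init__(self):
        self._stock = {"widget": 5}

    def reserve(self, sku, quantity):
        if self._stock.get(sku, 0) >= quantity:
            self._stock[sku] -= quantity
            return True
        return False


class RemoteInventory(InventoryAPI):
    """Drop-in replacement once inventory is extracted into a service.
    The network call is elided; only the shape matters here."""

    def __init__(self, base_url):
        self.base_url = base_url

    def reserve(self, sku, quantity):
        raise NotImplementedError("would call the remote inventory service")


def checkout(inventory: InventoryAPI, sku, quantity):
    # Identical caller code whether inventory is in-process or remote
    return inventory.reserve(sku, quantity)


inv = LocalInventory()
first = checkout(inv, "widget", 2)
second = checkout(inv, "widget", 6)
```

The `checkout` function is the payoff: extraction day changes which class is constructed, not how the rest of the system is written.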
Next Steps
- How Event-Driven Architecture Works — decouple services with events instead of synchronous calls.
- How Consistency Works — the consistency challenges that arise when data is split across services.
- How gRPC Works — the most common protocol for synchronous service-to-service communication.
Referenced by
- How Event-Driven Architecture Works — Reacting Instead of Polling
- How Load Balancing Works — Distributing Traffic Across Servers
- Software Architecture FAQ
- How CQRS Works — Separating Reads from Writes
- What is a Bounded Context
- What is a Microservice
- What is Service Discovery
- What is a Service Mesh
- What is a Saga
- What is a Sidecar Pattern
- What is a Monolith
- What is an API Gateway
- What is a Circuit Breaker
- How Pub/Sub Works — Decoupling Publishers from Subscribers