How gRPC Works — Protocol Buffers, HTTP/2, and Streaming

2026-03-24

gRPC is a remote procedure call framework created by Google. It uses Protocol Buffers for binary serialization, HTTP/2 for transport, and code generation for type-safe clients and servers in any language. Where REST sends JSON over HTTP/1.1, gRPC sends compact binary messages over multiplexed HTTP/2 streams.

gRPC dominates internal service-to-service communication — microservices, mobile backends, and any place where latency and bandwidth matter more than browser compatibility.

Protocol Buffers — Schema-First, Binary-Fast

Protocol Buffers (protobuf) is a serialization format. You define your data structures in a .proto file:

syntax = "proto3";

service UserService {
  rpc GetUser (GetUserRequest) returns (User);
  rpc ListUsers (ListUsersRequest) returns (stream User);
  rpc CreateUser (CreateUserRequest) returns (User);
}

message GetUserRequest {
  int32 id = 1;
}

message User {
  int32 id = 1;
  string name = 2;
  string email = 3;
  repeated string roles = 4;
}

Each field has a number (the field tag) that identifies it in the binary encoding. Fields are encoded by tag and type, not by name. The string "name" never appears in the serialized data — only the tag 2 and the value.

This gives protobuf two advantages over JSON:

  • Size — 3-10x smaller. No field names, no quotes, no braces. Numbers are varint-encoded.
  • Speed — decoding is cheap. The decoder walks flat tag/length/value records; there is no text tokenizing, no string-to-number conversion, no key lookup.

Code Generation

The protoc compiler generates client and server code from the .proto file. In Go, you get a typed client:

user, err := client.GetUser(ctx, &pb.GetUserRequest{Id: 42})
fmt.Println(user.Name) // typed — no JSON unmarshaling

The generated code handles serialization, deserialization, HTTP/2 framing, and error mapping. You write the .proto file once and generate clients in Go, Rust, Java, Python, TypeScript, C++, and dozens of other languages. The schema is the contract — both sides agree at compile time.
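For Go, a typical invocation looks like this sketch; it assumes the official protoc-gen-go and protoc-gen-go-grpc plugins are on your PATH, and user.proto is a hypothetical file name.

```shell
# Install the Go code generators (one-time setup):
#   go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
#   go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest
protoc --go_out=. --go_opt=paths=source_relative \
       --go-grpc_out=. --go-grpc_opt=paths=source_relative \
       user.proto
```

This produces one file with the message types and another with the client and server stubs.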

HTTP/2 — Multiplexing and Header Compression

gRPC runs on HTTP/2, which provides:

  • Multiplexing — multiple RPC calls share a single TCP connection. No head-of-line blocking at the HTTP layer. In HTTP/1.1, each request blocks the connection until it completes. In HTTP/2, requests and responses are interleaved as frames on independent streams.
  • Header compression (HPACK) — HTTP headers are compressed with a shared dictionary. Repeated headers (like Content-Type, Authorization) are sent as small index references after the first request.
  • Flow control — per-stream and per-connection flow control prevents a fast sender from overwhelming a slow receiver.

The combination of binary protobuf payloads and HTTP/2 multiplexing makes gRPC significantly faster than REST+JSON for high-throughput internal communication.

Four Communication Patterns

gRPC supports four patterns, all defined in the .proto file:

1. Unary RPC — one request, one response. Like a normal function call. Most common pattern.

rpc GetUser (GetUserRequest) returns (User);

2. Server streaming — one request, stream of responses. The server sends multiple messages back. Useful for large result sets, event feeds, or log tailing.

rpc ListUsers (ListUsersRequest) returns (stream User);

3. Client streaming — stream of requests, one response. The client sends multiple messages. Useful for file uploads or batch data ingestion.

rpc UploadLogs (stream LogEntry) returns (UploadSummary);

4. Bidirectional streaming — stream of requests, stream of responses. Both sides send messages independently. Useful for chat, collaborative editing, or real-time sync.

rpc Chat (stream ChatMessage) returns (stream ChatMessage);

[Diagram — gRPC: protobuf encode → HTTP/2 stream → decode]

JSON vs Protobuf on the wire:

  JSON   {"id": 42, "name": "Alice"}   → 27 bytes, text, parsed
  Proto  08 2a 12 05 41 6c 69 63 65    → 9 bytes, binary, mapped

[Diagram — the four communication patterns]

  Unary           1 req → 1 res   most common
  Server stream   1 req → N res   feeds, logs
  Client stream   N req → 1 res   uploads, batch
  Bidirectional   N req → N res   chat, sync

Error Handling

gRPC uses its own status codes, separate from HTTP:

  gRPC Code           Meaning                  HTTP Equivalent
  OK                  Success                  200
  NOT_FOUND           Resource doesn't exist   404
  INVALID_ARGUMENT    Bad request data         400
  UNAUTHENTICATED     Missing credentials      401
  PERMISSION_DENIED   Not authorized           403
  INTERNAL            Server error             500
  UNAVAILABLE         Service is down          503
  DEADLINE_EXCEEDED   Timeout                  504

Every gRPC call has a deadline — a maximum time the client is willing to wait. Deadlines propagate through call chains: if service A calls B calls C, and A's deadline is 5 seconds, C inherits the remaining time. This stops work from continuing deep in a call chain after the original caller has already given up.

When to Use gRPC vs REST

Use gRPC when:

  • Services talk to services (not browsers)
  • Low latency and high throughput matter
  • You want strong typing and code generation across multiple languages
  • You need streaming (server push, bidirectional communication)
  • The schema changes frequently and you want compile-time safety

Use REST when:

  • Browsers are the primary client (browsers can't speak native gRPC; gRPC-Web plus a translating proxy such as Envoy is required)
  • You need HTTP caching (CDNs, browser cache)
  • The API is public and needs broad tooling support (curl, Postman, any HTTP library)
  • Simplicity matters more than performance

Many architectures use both: REST for public-facing APIs and gRPC for internal microservice communication.

Schema Evolution

Protocol Buffers handle schema evolution through field tags. You can add new fields (old clients ignore them), and deprecate old fields (new clients ignore them), as long as you never reuse a field tag number. This is why field tags are numbers, not names — the binary format stays backward-compatible without versioning.

Next Steps