What is a Cache?

A cache is a temporary storage layer that holds copies of frequently accessed data in a faster medium — typically memory — to reduce the time and cost of fetching it from the original, slower source.

How it works

When data is first requested, the system fetches it from the source (database, API, disk), stores a copy in the cache, and returns it. Subsequent requests for the same data are served from the cache directly — a cache hit. If the data is not in the cache, it's a cache miss, and the system falls back to the source.

Caches exist at every layer: browser caches, CDN caches, reverse proxy caches, application caches (Redis, Memcached), and database buffer pools.

Why it matters

Caches dramatically reduce latency and load. A database query that takes 50 ms becomes a 0.5 ms Redis lookup. A CDN serves static assets from an edge server 10 ms away instead of an origin server 200 ms away.

The fundamental tradeoff is freshness vs speed. Cached data may be stale — the source has been updated, but the cache still holds the old value. Managing this tradeoff through TTL, invalidation, and cache patterns is the core challenge of caching.

For the full explanation of cache layers, patterns, and invalidation strategies, see How Caching Works.