Production systems use multiple cache layers:
L1: Local cache (in-process memory)
- Fastest: nanoseconds
- Limited by server RAM
- Not shared between servers
L2: Distributed cache (Redis)
- Fast: milliseconds (one network round trip)
- Shared across servers
- Survives application server restarts
L3: CDN (edge cache)
- For static content
- Closest to users
Request flow (full miss): Check L1 → Check L2 → Query database → Populate L2 → Populate L1
L1 reduces load on L2. L2 reduces load on the database. Each layer catches what the previous missed.
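The request flow above can be sketched roughly as follows. This is a minimal illustration, not a production implementation: the `LayeredCache` class, the `fake_db` loader, and the key names are all hypothetical, and a plain dict stands in for the shared Redis layer so the example runs without a server. TTLs, eviction, and invalidation are left out.

```python
from typing import Callable, Dict, List, Optional


class LayeredCache:
    """Read-through lookup across a local (L1) and a shared (L2) cache."""

    def __init__(self, l2: Dict[str, str],
                 load_from_db: Callable[[str], Optional[str]]):
        self.l1: Dict[str, str] = {}      # in-process, private to this server
        self.l2 = l2                      # dict standing in for a shared store such as Redis
        self.load_from_db = load_from_db  # hypothetical database accessor

    def get(self, key: str) -> Optional[str]:
        # Check L1 first: no network hop.
        if key in self.l1:
            return self.l1[key]
        # Check L2 next.
        value = self.l2.get(key)
        if value is None:
            # Both layers missed: query the database, then populate L2.
            value = self.load_from_db(key)
            if value is None:
                return None  # unknown key; nothing to cache in this sketch
            self.l2[key] = value
        # Populate L1 so the next lookup on this server stays local.
        self.l1[key] = value
        return value


db_calls: List[str] = []

def fake_db(key: str) -> Optional[str]:
    """Hypothetical database lookup that records each call."""
    db_calls.append(key)
    return f"row-for-{key}"

shared_l2: Dict[str, str] = {}
server_a = LayeredCache(shared_l2, fake_db)
server_b = LayeredCache(shared_l2, fake_db)

server_a.get("user:1")  # misses L1 and L2, queries the database
server_a.get("user:1")  # served from server_a's L1
server_b.get("user:1")  # misses server_b's L1, served from the shared L2
print(len(db_calls))    # prints 1: the database was queried only once
```

Two `LayeredCache` instances share one L2 here to mimic two application servers in front of the same distributed cache: the second server never reaches the database because L2 already holds the value.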