Sharding makes single-key lookups fast. But what about queries that span shards?
"Find all orders over " must query ALL shards and merge results. Slow.
"Join users and orders" is even harder if they're on different shards.
Strategies:
- Denormalize to avoid joins
- Keep related data on the same shard (user and their orders)
- Accept that some queries will be slow
- Use a data warehouse for analytical queries
When designing a sharded schema, think about your most common queries and ensure they hit single shards.