Design a rate limiter that restricts how many requests a client can make within a given time window.
Functional requirements:
- Limit requests per client (by IP, user ID, or API key)
- Return HTTP 429 (Too Many Requests) when the limit is exceeded
- Configurable limits (e.g., requests per minute)
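One way to picture the functional requirements above is a fixed-window counter, sketched below. This is only a minimal single-process illustration, not a prescribed design: the class name `FixedWindowLimiter`, the methods `allow` and `check`, and the injectable `clock` parameter are all hypothetical names chosen for the example, and the fixed-window algorithm itself is an assumption (the requirements do not name an algorithm).

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

HTTP_OK = 200
HTTP_TOO_MANY_REQUESTS = 429


@dataclass
class FixedWindowLimiter:
    """Hypothetical in-memory limiter: at most `limit` requests per window per client."""

    limit: int                                   # configurable: max requests per window
    window_seconds: float = 60.0                 # configurable: window length (default one minute)
    clock: Callable[[], float] = time.monotonic  # injectable so the sketch can be tested
    # client key -> (window index, requests counted in that window)
    _counters: Dict[str, Tuple[int, int]] = field(default_factory=dict, init=False)

    def allow(self, client_key: str) -> bool:
        """Return True if the request fits in the current window, else False."""
        window = int(self.clock() // self.window_seconds)
        stored_window, count = self._counters.get(client_key, (window, 0))
        if stored_window != window:
            count = 0  # a new window has started, so the counter resets
        if count >= self.limit:
            self._counters[client_key] = (window, count)
            return False
        self._counters[client_key] = (window, count + 1)
        return True

    def check(self, client_key: str) -> int:
        """Map the decision to an HTTP status code."""
        return HTTP_OK if self.allow(client_key) else HTTP_TOO_MANY_REQUESTS


if __name__ == "__main__":
    # The client key can be an IP, a user ID, or an API key; here it is a made-up API key.
    limiter = FixedWindowLimiter(limit=2, window_seconds=60, clock=lambda: 0.0)
    for _ in range(3):
        print(limiter.check("api-key-123"))
```

The sketch keeps counters in a local dictionary, so it does not by itself meet a multi-server requirement; it also allows short bursts around window boundaries, which is a known trade-off of fixed windows.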
Non-functional requirements:
- Low latency (rate checking shouldn't slow down requests)
- Distributed (works across multiple servers)
- Accurate (within a small margin of error)
Scale:
- million active users
- requests per second