Single-node rate limiting is straightforward. Distributed rate limiting is hard:
Problem: A user hits Server A, then Server B. Each server keeps its own counter, so neither sees the user's total and the limit can be exceeded across the fleet.
Solution 1: Centralized store. Use Redis. All servers check the same counter. Adds a network round trip per request but keeps the count accurate.
Solution 2: Sticky sessions. Route the same user to the same server. Works until that server fails and the user's counter is lost.
Solution 3: Approximate. Each server tracks counts locally and syncs periodically. Faster, but allows some overage between syncs.
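To make the overage in the approximate approach concrete, here is a minimal sketch. The names SharedCounter and LocalLimiter are hypothetical, the shared counter is an in-memory stand-in for whatever central store is used, and window resets are left out to keep it short.

```python
class SharedCounter:
    """In-memory stand-in for a central counter (hypothetical; assumed for illustration)."""

    def __init__(self):
        self.total = 0

    def add(self, delta):
        # Apply a server's unreported requests and return the new global total.
        self.total += delta
        return self.total


class LocalLimiter:
    """Per-server limiter that decides locally and reports to the shared counter on sync()."""

    def __init__(self, shared, limit):
        self.shared = shared
        self.limit = limit
        self.known_total = 0  # global count as of the last sync
        self.pending = 0      # requests admitted locally since the last sync

    def allow(self):
        # Decide using only local knowledge: no network round trip.
        if self.known_total + self.pending >= self.limit:
            return False
        self.pending += 1
        return True

    def sync(self):
        # Push the local delta and pull the fresh global total.
        self.known_total = self.shared.add(self.pending)
        self.pending = 0
```

With a limit of 10 and two servers that have not synced yet, each server admits 10 requests on its own, so 20 get through. After both call sync(), both reject. The sync interval bounds how long that overage can last.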
A common choice is Redis with a sliding window algorithm. For most workloads the latency cost is worth the accuracy.
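A sliding window can be sketched as a per-user log of request timestamps held in the shared store. The version below is an illustration only: CentralStore is a hypothetical in-memory stand-in, not a Redis client, and the current time is passed in so the behaviour is easy to follow. In Redis the same steps are usually done on a sorted set (ZREMRANGEBYSCORE, ZCARD, ZADD), run atomically so two servers cannot both pass the check at once.

```python
from collections import defaultdict, deque


class CentralStore:
    """In-memory stand-in for the shared store (hypothetical; assumed for illustration)."""

    def __init__(self):
        # One timestamp log per user, oldest first.
        self.logs = defaultdict(deque)


def allow(store, user, limit, window_s, now):
    """Return True if the user may make a request at time `now` (seconds)."""
    log = store.logs[user]
    # Drop timestamps that have slid out of the window.
    while log and log[0] <= now - window_s:
        log.popleft()
    # Reject if the window is already full.
    if len(log) >= limit:
        return False
    # Record this request.
    log.append(now)
    return True
```

Because every server calls allow() against the same store, a user who alternates between Server A and Server B is still counted once per request.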