Rate limiting can happen at multiple layers:
Client-side: Prevents accidental floods and reduces unnecessary requests. Easily bypassed by malicious users, so it cannot be relied on for enforcement.
API Gateway: Centralized enforcement. Handles authentication and rate limiting together. Most common approach.
Application layer: Business-logic aware. Can rate limit specific actions differently from general API calls.
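Database layer: Last line of defense. Prevents resource exhaustion (for example, through connection limits or query timeouts) when limits at the upper layers are missing or fail.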
Typically you want the API gateway to enforce general limits, plus application-layer limits for expensive or abuse-prone operations like password resets or payment processing.
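
As a rough illustration of the application-layer part, here is a minimal Python sketch of per-action limits using a token bucket. Everything in it is an assumption made for the example: the names TokenBucket, ActionRateLimiter, and ACTION_LIMITS, the action names, and the numeric limits are hypothetical and would need tuning for a real service. It keeps state in process memory, so it only fits a single-instance deployment.

```python
import time
from typing import Callable, Dict, Tuple


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity: float, refill_per_sec: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        # Refill based on the time elapsed since the last call, capped at capacity.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limits: action -> (burst capacity, refill rate in tokens per second).
ACTION_LIMITS: Dict[str, Tuple[float, float]] = {
    "default": (100, 100 / 60),        # general calls: about 100 per minute
    "password_reset": (3, 3 / 3600),   # sensitive: about 3 per hour
    "payment": (10, 10 / 60),          # expensive: about 10 per minute
}


class ActionRateLimiter:
    """Keeps one bucket per (user, rule); unlisted actions share the default rule."""

    def __init__(self, limits: Dict[str, Tuple[float, float]] = ACTION_LIMITS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limits = limits
        self.clock = clock
        self.buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, user_id: str, action: str) -> bool:
        rule = action if action in self.limits else "default"
        key = (user_id, rule)
        bucket = self.buckets.get(key)
        if bucket is None:
            capacity, rate = self.limits[rule]
            bucket = TokenBucket(capacity, rate, self.clock)
            self.buckets[key] = bucket
        return bucket.allow()
```

A request handler would call something like limiter.allow(user_id, "password_reset") before doing the work and reject the request (for HTTP, usually with status 429) when it returns False. With several application instances, the buckets would have to live in a shared store rather than in a local dictionary.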