Reactive scaling adds or removes capacity when metrics cross thresholds.
Common triggers:
- CPU utilization above a set percentage
- Memory usage above a set percentage
- Request queue depth above a set length
- P99 response latency above a set number of milliseconds
Configuration:
- Scale up threshold: When to add capacity
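- Scale down threshold: When to remove capacity (usually set lower than the scale up threshold, so the system does not flip back and forth around a single value)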
- Cooldown period: Wait after scaling before evaluating again
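The three settings above can be sketched as a small evaluation loop. This is a minimal illustration, not any particular autoscaler's API: the class `ReactiveScaler`, its field names, the one-instance step size, and the instance limits are all hypothetical, and the caller supplies the metric sample and the current time.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReactiveScaler:
    """Hypothetical threshold-based scaler; names and defaults are illustrative."""
    scale_up_threshold: float    # add capacity when the metric rises above this
    scale_down_threshold: float  # remove capacity when the metric falls below this
    cooldown_seconds: float      # wait this long after a scaling action
    min_instances: int = 1       # assumed lower bound, not from the text
    max_instances: int = 10      # assumed upper bound, not from the text
    instances: int = 1
    last_scaled_at: Optional[float] = None

    def evaluate(self, metric: float, now: float) -> int:
        """Evaluate one metric sample and return the resulting instance count."""
        # Skip evaluation entirely while the cooldown is still running.
        in_cooldown = (
            self.last_scaled_at is not None
            and now - self.last_scaled_at < self.cooldown_seconds
        )
        if in_cooldown:
            return self.instances

        if metric > self.scale_up_threshold and self.instances < self.max_instances:
            self.instances += 1
            self.last_scaled_at = now
        elif metric < self.scale_down_threshold and self.instances > self.min_instances:
            self.instances -= 1
            self.last_scaled_at = now
        # Between the two thresholds nothing changes: that gap is the hysteresis band.
        return self.instances
```

For example, with placeholder values `ReactiveScaler(scale_up_threshold=0.75, scale_down_threshold=0.30, cooldown_seconds=300)`, a sample of 0.9 adds one instance, further high samples are ignored until 300 seconds have passed, and a sample of 0.5 changes nothing because it falls between the two thresholds.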
Limitation: Reactive scaling responds only after a metric has crossed its threshold, so new capacity arrives after the load has already increased. That is not fast enough for sudden spikes. Combine it with predictive scaling.
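The lag can be made concrete with a toy simulation. This sketch assumes a fixed provisioning delay measured in ticks and a fixed capacity step; the function `simulate_spike` and all of its parameters are hypothetical and only illustrate the shape of the problem.

```python
def simulate_spike(load, start_capacity, step, provision_delay):
    """Return the per-tick shortfall (demand minus capacity, floored at 0)."""
    capacity = start_capacity
    pending = []    # ticks at which requested capacity becomes available
    shortfall = []
    for tick, demand in enumerate(load):
        # Capacity requested earlier comes online once its delay has elapsed.
        arrived = [t for t in pending if t <= tick]
        capacity += step * len(arrived)
        pending = [t for t in pending if t > tick]

        shortfall.append(max(0, demand - capacity))

        # React only after observing the overload, and only if no request is in flight.
        if demand > capacity and not pending:
            pending.append(tick + provision_delay)
    return shortfall
```

With a load that jumps from 50 to 200 at tick 2, a starting capacity of 100, a step of 100, and a delay of 2 ticks, the shortfall is nonzero for exactly the two ticks it takes the new capacity to arrive. Those ticks are the window that predictive scaling is meant to cover.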