Testing at pipeline runtime catches some issues. Continuous monitoring catches the rest.
What to monitor:
- Row count trends (sudden drops or spikes)
- Null rate changes
- Distribution shifts (for example, average order value moving by more than a set percentage)
- Freshness (last update timestamp)
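The four signals above can be captured as one snapshot per table per run. The following is a minimal sketch, not a definitive implementation: it assumes rows arrive as a list of dictionaries, and the function name `collect_metrics` and the column names `amount` and `updated_at` are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timezone
from statistics import mean


def collect_metrics(rows, value_col, updated_col, now=None):
    """Return one snapshot of row count, null rate, mean value, and staleness.

    rows is assumed to be a list of dicts; timestamps are assumed to be
    timezone-aware datetime objects.
    """
    now = now or datetime.now(timezone.utc)
    row_count = len(rows)

    # Non-null values of the monitored column (distribution and null rate).
    values = [r.get(value_col) for r in rows if r.get(value_col) is not None]

    # Non-null update timestamps (freshness).
    stamps = [r.get(updated_col) for r in rows if r.get(updated_col) is not None]

    return {
        "row_count": row_count,
        "null_rate": (row_count - len(values)) / row_count if row_count else 0.0,
        "mean_value": mean(values) if values else None,
        # Seconds since the most recent update; None if no timestamps exist.
        "staleness_seconds": (now - max(stamps)).total_seconds() if stamps else None,
    }
```

Storing each snapshot with its run time gives the history needed to spot trends rather than single bad values.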
Alert thresholds:
Set thresholds based on historical patterns. Thresholds that are too sensitive trigger false alarms; thresholds that are too loose miss real problems.
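One common way to derive a threshold from history is a band of k standard deviations around the historical mean. The sketch below assumes that approach; it is one option among several, and the function names `threshold_band` and `is_anomalous` and the default `k=3.0` are illustrative assumptions rather than recommended values.

```python
from statistics import mean, stdev


def threshold_band(history, k=3.0):
    """Return (low, high) bounds as mean +/- k sample standard deviations.

    history must contain at least two past observations of the metric.
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma


def is_anomalous(value, history, k=3.0):
    """Return True if value falls outside the band derived from history."""
    low, high = threshold_band(history, k)
    return value < low or value > high


# Hypothetical daily row counts for one table.
daily_row_counts = [100, 102, 98, 101, 99]
alert = is_anomalous(60, daily_row_counts)
```

Here `k` is the sensitivity knob: a smaller `k` narrows the band and raises more alerts, while a larger `k` widens it and lets more deviations through. Metrics with strong weekly or seasonal cycles may need the history restricted to comparable periods.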
Start with critical tables. Expand coverage as you learn what breaks. Document runbooks for each alert type.