A centralized controller creates a single point of failure. You need clustering for production deployments.
HA strategies:
Active-standby. One controller handles traffic. Standby takes over if primary fails.
Active-active. Multiple controllers share the load.
Distributed consensus. Controllers form a cluster using Raft or Paxos. Any node can handle requests.
During controller failure:
- Existing flow rules remain active
- No new rules can be installed
- Switch reconnects to backup controller
Plan for rolling upgrades to update without downtime.