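Ford-Fulkerson works because of the structure of the residual graph. Each augmenting path raises the total flow by its bottleneck, the smallest residual capacity along the path. Backward edges let the algorithm undo bad decisions: pushing flow along a backward edge cancels flow that an earlier path sent the other way. With integer capacities each augmentation adds at least one unit of flow, so the search must eventually run out of paths. When no augmenting path exists, the algorithm terminates, and at that point the flow is maximum.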
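To make the role of backward edges concrete, here is a minimal sketch in Python. It is not a reference implementation. The function name `max_flow`, the dict-of-edges input, the depth-first path search, and the five-edge example graph are all assumptions chosen for illustration. It also assumes integer capacities and a source that differs from the sink.

```python
from collections import defaultdict


def max_flow(capacity, source, sink):
    """Ford-Fulkerson with a depth-first search for augmenting paths.

    capacity: dict mapping (u, v) -> non-negative integer capacity
              (an illustrative representation, not a standard API).
    Returns the value of a maximum flow from source to sink.
    """
    # residual[u][v] is how much more flow can still be pushed from u to v.
    residual = defaultdict(lambda: defaultdict(int))
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += 0  # create the backward edge with zero residual

    def dfs(u, limit, visited):
        # Try to push up to `limit` units from u to the sink.
        if u == sink:
            return limit
        visited.add(u)
        for v, r in list(residual[u].items()):
            if r > 0 and v not in visited:
                pushed = dfs(v, min(limit, r), visited)
                if pushed > 0:
                    residual[u][v] -= pushed  # use up forward capacity
                    residual[v][u] += pushed  # allow this flow to be undone later
                    return pushed
        return 0

    total = 0
    while True:
        pushed = dfs(source, float("inf"), set())
        if pushed == 0:  # no augmenting path left
            return total
        total += pushed


# Hypothetical example graph: the first path found is s -> a -> b -> t,
# which blocks both direct routes. The second path, s -> b -> a -> t,
# uses the backward edge b -> a to cancel the flow on a -> b.
capacity = {
    ("s", "a"): 1,
    ("a", "b"): 1,
    ("b", "t"): 1,
    ("s", "b"): 1,
    ("a", "t"): 1,
}
print(max_flow(capacity, "s", "t"))  # prints 2
```

Without the backward edge the search would stop at a flow of 1, because the first path saturates `a -> b` and `b -> t` and leaves no forward route to the sink.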
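The proof comes from the max-flow min-cut theorem, which I'll cover soon. The core idea: if you cannot find a path from source to sink in the residual graph, then the vertices still reachable from the source form a cut in which every edge leaving it is full and every edge entering it carries no flow, so no more flow can be added. The current flow is therefore a maximum flow.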
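The stopping condition can be checked by hand on a small case. The sketch below assumes a hand-written flow on a hypothetical five-edge graph (it does not compute the flow itself), and the helper name `residual_reachable` is made up for this example. It finds the vertices reachable from the source in the residual graph and compares the capacity of the resulting cut with the value of the flow.

```python
from collections import deque


def residual_reachable(capacity, flow, source):
    """Vertices reachable from source along edges with residual capacity > 0."""
    # Forward residual is capacity minus flow; backward residual is the flow sent.
    residual = {}
    for (u, v), c in capacity.items():
        f = flow.get((u, v), 0)
        residual[(u, v)] = residual.get((u, v), 0) + c - f
        residual[(v, u)] = residual.get((v, u), 0) + f

    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for (a, b), r in residual.items():
            if a == u and r > 0 and b not in seen:
                seen.add(b)
                queue.append(b)
    return seen


# Hypothetical graph and a hand-written flow of value 2 on it.
capacity = {
    ("s", "a"): 1,
    ("a", "b"): 1,
    ("b", "t"): 1,
    ("s", "b"): 1,
    ("a", "t"): 1,
}
flow = {("s", "a"): 1, ("a", "t"): 1, ("s", "b"): 1, ("b", "t"): 1}

side = residual_reachable(capacity, flow, "s")

# Capacity of the cut: original edges that leave the reachable side.
cut_capacity = sum(c for (u, v), c in capacity.items() if u in side and v not in side)
# Value of the flow: total flow leaving the source (nothing flows into it here).
flow_value = sum(f for (u, v), f in flow.items() if u == "s")

print("t" in side)               # prints False
print(cut_capacity, flow_value)  # prints 2 2
```

The sink is unreachable and the cut capacity equals the flow value. Since no flow can exceed the capacity of any cut, this flow cannot be improved.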