Backpropagation computes the gradient of the loss with respect to every weight in the network using the chain rule.
Forward pass: Compute each layer's activations and cache the intermediate values (pre-activations and activations) needed later.
Backward pass: Starting from the loss, propagate gradients backward through each layer, applying the chain rule at every operation.
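In standard notation (assuming a fully connected network with elementwise activation $\sigma$ and pre-activations $z^{l} = W^{l} a^{l-1} + b^{l}$), the backward recursion can be written as:

```latex
\delta^{L} = \nabla_{a^{L}} \mathcal{L} \odot \sigma'(z^{L})
\qquad\text{(output-layer error)}
\]
\[
\delta^{l} = \bigl( (W^{l+1})^{\top} \delta^{l+1} \bigr) \odot \sigma'(z^{l})
\qquad\text{(propagate error one layer back)}
\]
\[
\frac{\partial \mathcal{L}}{\partial W^{l}} = \delta^{l} \, (a^{l-1})^{\top},
\qquad
\frac{\partial \mathcal{L}}{\partial b^{l}} = \delta^{l}
```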
Computational graph: Each operation is a node. Gradients flow backward along edges.
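The graph view can be made concrete on a tiny scalar example: each line below is one node, and the backward section multiplies the incoming gradient by that node's local derivative (the values and variable names here are illustrative, not from the original).

```python
# Tiny computational graph: z = x*w, y = z + b, L = (y - t)^2
x, w, b, t = 2.0, 3.0, 1.0, 10.0

# Forward pass: one value per node.
z = x * w          # multiply node -> 6.0
y = z + b          # add node      -> 7.0
L = (y - t) ** 2   # loss node     -> 9.0

# Backward pass: gradients flow backward along the edges.
dL_dy = 2 * (y - t)   # local derivative of (y - t)^2 -> -6.0
dL_dz = dL_dy * 1.0   # add node passes the gradient through unchanged
dL_db = dL_dy * 1.0
dL_dw = dL_dz * x     # multiply node: local derivative is the other input -> -12.0
dL_dx = dL_dz * w     # -> -18.0
```

Every reverse-mode autodiff system (PyTorch, JAX) is this same pattern applied mechanically over a recorded graph.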
Interview question: "Implement backprop for an N-layer network."
Answer outline: compute the gradient of the loss at the output, propagate it through the layer's activation function, compute the weight and bias gradients from the cached forward values, then repeat for the previous layer.
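The outline above can be sketched in plain Python for the 2-layer case: a sigmoid hidden layer, a linear output, and MSE loss. This is a minimal illustration under those assumptions (the function names and shapes are mine, not from the original), not a general implementation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W1, b1, W2, b2):
    """Forward pass: compute and cache z1, a1, and the output y."""
    z1 = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    a1 = [sigmoid(z) for z in z1]
    # Linear output layer.
    y = [sum(w * ai for w, ai in zip(row, a1)) + b for row, b in zip(W2, b2)]
    return z1, a1, y

def backward(x, t, W1, b1, W2, b2):
    """Backward pass for MSE loss L = 0.5 * sum((y - t)^2)."""
    z1, a1, y = forward(x, W1, b1, W2, b2)
    # 1. Output gradient: dL/dy = y - t.
    dy = [yi - ti for yi, ti in zip(y, t)]
    # 2. Layer-2 weight/bias gradients: dL/dW2[i][j] = dy[i] * a1[j].
    dW2 = [[dyi * aj for aj in a1] for dyi in dy]
    db2 = dy[:]
    # 3. Propagate to hidden activations: da1[j] = sum_i dy[i] * W2[i][j].
    da1 = [sum(dy[i] * W2[i][j] for i in range(len(dy)))
           for j in range(len(a1))]
    # 4. Through the sigmoid: sigmoid'(z) = a * (1 - a).
    dz1 = [d * a * (1.0 - a) for d, a in zip(da1, a1)]
    # 5. Layer-1 gradients (repeat of steps 2-3 for the previous layer).
    dW1 = [[dz * xj for xj in x] for dz in dz1]
    db1 = dz1[:]
    return dW1, db1, dW2, db2
```

A good interview follow-up is to verify the result against a finite-difference estimate of one weight's gradient, which catches most indexing and sign mistakes.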