Deep Learning Medium
Walk me through how backpropagation works.
The short answer
Backpropagation computes the gradient of the loss with respect to every parameter by applying the chain rule backward through the network, reusing intermediate results from the forward pass. These gradients are then used by an optimizer to update the weights via gradient descent.
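To make the chain rule concrete, here is one common way to write it for a fully connected network. The notation is an assumption chosen for illustration, not taken from the answer above: layer $l$ computes $z^{(l)} = W^{(l)} a^{(l-1)} + b^{(l)}$ and $a^{(l)} = f(z^{(l)})$, the loss is $\mathcal{L}$, and $\delta^{(l)}$ denotes $\partial \mathcal{L} / \partial z^{(l)}$.

```latex
\begin{aligned}
\delta^{(l)} &= \left( W^{(l+1)} \right)^{\top} \delta^{(l+1)} \odot f'\!\left( z^{(l)} \right)
  && \text{(gradient passed back from layer } l+1 \text{)} \\
\frac{\partial \mathcal{L}}{\partial W^{(l)}} &= \delta^{(l)} \left( a^{(l-1)} \right)^{\top}
  && \text{(reuses the stored activation } a^{(l-1)} \text{)} \\
\frac{\partial \mathcal{L}}{\partial b^{(l)}} &= \delta^{(l)}
\end{aligned}
```

Each $\delta^{(l)}$ is computed once from $\delta^{(l+1)}$ and then reused for both parameter gradients of that layer, which is the "reusing intermediate results" part of the answer.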
How to think about it
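Think of training as two passes over the same computation. The forward pass runs the input through the network layer by layer and stores each layer's intermediate results (its inputs and activations). The backward pass starts from the loss and walks the layers in reverse: at each layer, the chain rule multiplies the gradient arriving from the layer above by that layer's local derivative, which yields both the gradient for that layer's parameters and the gradient to pass further back. Reusing the stored forward-pass results is what lets a single backward sweep produce every parameter's gradient, instead of differentiating with respect to each parameter separately. An optimizer then uses these gradients to update the weights, for example by gradient descent.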
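The idea can be sketched by hand for a tiny network. The example below is a minimal illustration under stated assumptions, not a reference implementation: the two-layer architecture, the tanh activation, the squared-error loss, and all names (`forward`, `backward`, `numerical_grad`, `params`) are hypothetical choices made for this sketch. It computes gradients by backpropagation, compares them against finite differences, and takes one gradient descent step.

```python
import numpy as np

def forward(params, x):
    """Forward pass; returns the prediction and the values cached for backprop."""
    W1, b1, W2, b2 = params
    a1 = np.tanh(W1 @ x + b1)   # hidden activation (assumed tanh)
    y_hat = W2 @ a1 + b2        # linear output layer
    return y_hat, a1

def loss_fn(params, x, y):
    """Squared-error loss (an assumption made for this sketch)."""
    y_hat, _ = forward(params, x)
    return 0.5 * np.sum((y_hat - y) ** 2)

def backward(params, x, y):
    """Backward pass: apply the chain rule from the loss back to each parameter."""
    W1, b1, W2, b2 = params
    y_hat, a1 = forward(params, x)       # reuse forward-pass intermediates
    d_yhat = y_hat - y                   # dL/dy_hat
    dW2 = np.outer(d_yhat, a1)           # dL/dW2
    db2 = d_yhat                         # dL/db2
    d_a1 = W2.T @ d_yhat                 # gradient passed back to the hidden layer
    d_z1 = d_a1 * (1.0 - a1 ** 2)        # tanh'(z) = 1 - tanh(z)^2
    dW1 = np.outer(d_z1, x)              # dL/dW1
    db1 = d_z1                           # dL/db1
    return [dW1, db1, dW2, db2]

def numerical_grad(params, x, y, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            loss_plus = loss_fn(params, x, y)
            p[idx] = old - eps
            loss_minus = loss_fn(params, x, y)
            p[idx] = old                 # restore the original value
            g[idx] = (loss_plus - loss_minus) / (2 * eps)
        grads.append(g)
    return grads

rng = np.random.default_rng(0)
params = [rng.normal(size=(4, 3)), rng.normal(size=4),
          rng.normal(size=(2, 4)), rng.normal(size=2)]
x, y = rng.normal(size=3), rng.normal(size=2)

analytic = backward(params, x, y)
numeric = numerical_grad(params, x, y)
max_err = max(np.max(np.abs(a - n)) for a, n in zip(analytic, numeric))
print("max |analytic - numeric| =", max_err)

# One plain gradient descent step with a small, arbitrarily chosen learning rate.
lr = 1e-3
loss_before = loss_fn(params, x, y)
new_params = [p - lr * g for p, g in zip(params, analytic)]
loss_after = loss_fn(new_params, x, y)
print("loss before:", loss_before, "loss after:", loss_after)
```

The finite-difference check needs two loss evaluations per parameter, while `backward` produces every gradient in one sweep. That cost difference is the practical reason frameworks use backpropagation rather than numerical differentiation.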