What is backpropagation really doing? | Chapter 3, Deep learning - Summary

Summary

Backpropagation is the core algorithm behind how neural networks learn. It determines how a single training example would like to nudge the weights and biases: not just whether each should go up or down, but in what relative proportions those changes cause the most rapid decrease in the cost. A true gradient descent step would involve doing this for all of the tens of thousands of training examples and averaging the desired changes. Since that is computationally expensive, stochastic gradient descent is commonly used instead: the data is subdivided into mini-batches, and a step is computed for each mini-batch. Backpropagation is the algorithm for computing the gradient, and for the algorithm to work, a lot of training data is required.
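
To make the mini-batch averaging concrete, here is a minimal sketch of one epoch of stochastic gradient descent. The names (sgd_epoch, grad_single) and hyperparameters (lr, batch_size) are illustrative assumptions, not details from the video; grad_single is presumed to return the gradient of the cost for a single training example.

    import numpy as np

    # Hypothetical helper assumed here: grad_single(params, x, y) returns
    # dC/dparams for one training example, as a flat vector like params.
    def sgd_epoch(params, X, Y, grad_single, lr=0.1, batch_size=32):
        """One epoch of mini-batch stochastic gradient descent (sketch)."""
        idx = np.random.permutation(len(X))          # shuffle the training data
        for start in range(0, len(X), batch_size):
            batch = idx[start:start + batch_size]
            # Average the per-example gradients over the mini-batch...
            grad = np.mean([grad_single(params, X[i], Y[i]) for i in batch], axis=0)
            # ...then step in the direction of the negative gradient.
            params = params - lr * grad
        return params

Averaging over a mini-batch gives a noisy but much cheaper estimate of the true gradient over the whole training set.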

Facts

1. Backpropagation is the core algorithm behind how neural networks learn.
2. Backpropagation is used to compute the gradient of the cost function; stepping in the direction of the negative gradient decreases the cost most quickly.
3. The desired nudges for weights and biases are proportional to how sensitive the cost function is to each weight and bias.
4. Backpropagation recursively applies the same process, moving backwards through the network: the nudges desired for one layer's activations are translated into nudges to the weights, biases, and activations of the layer before it (see the sketch after this list).
5. Stochastic gradient descent is commonly paired with backpropagation: the training data is randomly shuffled into mini-batches, and a gradient descent step is taken using the gradient computed for each mini-batch.
6. Backpropagation requires a lot of training data.
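
As a rough illustration of fact 4, below is a sketch of the backward pass for a small fully-connected network, assuming sigmoid activations and a squared-error cost; the function name backprop_single and the layer layout are assumptions for the example, not details from the video. Each layer's sensitivity (delta) is computed from the layer after it, which is the recursive, backwards-moving application of the chain rule.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def backprop_single(x, y, weights, biases):
        """Gradient of the squared-error cost for ONE training example.

        weights[l], biases[l] hold the parameters of layer l; the same
        chain-rule step is applied layer by layer, moving backwards.
        """
        # Forward pass: store every weighted input z and activation a.
        activations = [x]
        zs = []
        a = x
        for W, b in zip(weights, biases):
            z = W @ a + b
            zs.append(z)
            a = sigmoid(z)
            activations.append(a)

        grad_W = [np.zeros_like(W) for W in weights]
        grad_b = [np.zeros_like(b) for b in biases]

        # Output layer: dC/dz = (a - y) * sigma'(z) for cost C = ||a - y||^2 / 2.
        delta = (activations[-1] - y) * sigmoid(zs[-1]) * (1 - sigmoid(zs[-1]))
        grad_W[-1] = np.outer(delta, activations[-2])
        grad_b[-1] = delta

        # Earlier layers: propagate the sensitivity backwards with the chain rule.
        for l in range(2, len(weights) + 1):
            sp = sigmoid(zs[-l]) * (1 - sigmoid(zs[-l]))
            delta = (weights[-l + 1].T @ delta) * sp
            grad_W[-l] = np.outer(delta, activations[-l - 1])
            grad_b[-l] = delta

        return grad_W, grad_b

Averaging the gradients this returns over a mini-batch, and stepping against that average, is the stochastic gradient descent procedure sketched above.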