Just to be clear
Don't worry if backpropagation and gradient descent felt like a lot. The words aren't important. You don't need to remember them.
Here's all that actually matters: when a network gets something wrong, there's a method for working out how much each weight contributed to the mistake, and for nudging each one slightly in the direction that shrinks it. Do that enough times, across enough examples, and the network improves.
The machine isn't being told what to do. It's figuring it out, one mistake at a time.
That's the whole idea. Everything else is detail.
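If you're curious what a little of that detail looks like, here is a minimal sketch in Python. It is a toy, not how real networks are built: there is only one weight, the hidden rule (multiply by 3) and the names `EXAMPLES`, `train`, and `learning_rate` are made up for illustration, and with a single weight the "which weight was responsible" step shrinks to one line of arithmetic. Real networks repeat the same idea across many weights at once.

```python
# Toy data: each pair is (input, correct answer). The hidden rule is "multiply by 3",
# but the code below is never told that. (Illustrative values, not from any real dataset.)
EXAMPLES = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]


def train(examples, learning_rate=0.01, epochs=200):
    """Learn a single weight by making a guess, measuring the mistake, and nudging."""
    weight = 0.0   # start with a deliberately bad guess
    losses = []    # total squared error for each pass over the examples

    for _ in range(epochs):
        total_error = 0.0
        for x, target in examples:
            guess = weight * x
            mistake = guess - target
            total_error += mistake ** 2

            # How much did the weight contribute to this mistake?
            # This is the slope of the squared error with respect to the weight.
            blame = 2 * mistake * x

            # Nudge the weight a small step in the direction that shrinks the error.
            weight -= learning_rate * blame
        losses.append(total_error)

    return weight, losses


weight, losses = train(EXAMPLES)
print(f"learned weight: {weight:.4f}")
print(f"error on first pass: {losses[0]:.4f}, on last pass: {losses[-1]:.10f}")
```

The learned weight should end up very close to 3, even though nothing in the code says so. The loop only ever sees its own mistakes and shrinks them a little at a time.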