Just to be clear

Don't worry if backpropagation and gradient descent felt like a lot. The words aren't important. You don't need to remember them.

Here's all that actually matters: when a network gets something wrong, there's a method for figuring out which weights were responsible — and nudging them in a better direction. Do that enough times, across enough examples, and the network improves.

The machine isn't being told what to do. It's figuring it out, one mistake at a time.

That's the whole idea. Everything else is detail.