What a GPU actually does
GPUs (graphics processing units, the chips at the heart of graphics cards) were built to solve a completely different problem: rendering images on a screen.
When your computer draws a 3D scene, it has to calculate the color and brightness of millions of pixels for every frame, dozens of times per second. For the most part, the color of one pixel doesn't depend on the color of another. They can all be computed simultaneously.
A GPU is designed for exactly that. Instead of one fast processor doing things one at a time, a GPU has thousands of smaller processors, each slower on its own, but all running at the same time. Not one expert doing everything. Thousands of workers doing simpler things in parallel.
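Neural network training turned out to need the same thing. Most of the multiplications and additions that happen during training are also independent of each other: each number a layer produces is computed from the layer's inputs, not from the other numbers being produced alongside it. They don't need to happen in sequence. They can all run at once.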
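To make that independence concrete, here is a minimal sketch in plain Python. It runs on an ordinary processor, not a GPU. The small matrices and the helper names (matmul_cell, matmul_in_order) are made up for illustration. It multiplies two matrices, the core operation in neural network training, once in the usual order and once with the output cells visited in a shuffled order. If the cells really are independent, the order shouldn't matter.

```python
import random

def matmul_cell(a, b, i, j):
    """Compute one output cell: the dot product of row i of a and column j of b."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))

def matmul_in_order(a, b, cells):
    """Fill the output matrix by visiting cells in whatever order `cells` gives."""
    out = [[0] * len(b[0]) for _ in a]
    for i, j in cells:
        # Each cell reads only from a and b, never from another output cell.
        out[i][j] = matmul_cell(a, b, i, j)
    return out

# Hypothetical example data: a 3x2 matrix times a 2x3 matrix.
a = [[1, 2], [3, 4], [5, 6]]
b = [[7, 8, 9], [10, 11, 12]]

cells = [(i, j) for i in range(len(a)) for j in range(len(b[0]))]
shuffled = cells[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

sequential = matmul_in_order(a, b, cells)
scrambled = matmul_in_order(a, b, shuffled)

print(sequential == scrambled)  # True: the visiting order makes no difference
```

Because no cell waits on any other, nothing forces them to be computed one after another. A GPU takes advantage of this by handing different cells to different workers at the same time.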
The hardware that had been sitting inside gaming computers (built to make video games look good) was the exact hardware neural network researchers needed.