Generalization
A student who memorizes every practice exam question will fail the real exam if the questions change slightly. What you want is a student who actually understood the material — who can handle questions they've never seen.
Networks face the same problem. A network that just memorizes its training examples is useless in the real world. What you want is a network that has learned the underlying pattern — one that works on examples it has never encountered.
This is called generalization. The test is simple: hold back some of your data and don't show it to the network during training. Once training is done, test the network on the data it never saw. Good performance there means the network actually learned something. Good performance only on training data means it memorized.
There's a catch: a bigger network has more capacity, which lets it learn subtler patterns, but also lets it cheat by memorizing. Too small and it can't capture the real structure. Too large and it fits the noise.
Finding the right balance is one of the practical arts of building networks.