Temperature: how random is random?

When the model scores every possible next word, it does not always pick the top one. There is a dial that decides how adventurous the pick is, and it is called temperature. We have met a dial before: back in the learning module, a single weight was a dial you could turn to make one input matter more or less. This is a dial too, but it controls how much the model gambles.

Turn the temperature low and the model almost always takes the highest-scoring word. The writing comes out focused, safe, a little repetitive. That is what you want for a factual question with one right answer. At the lowest setting of all, zero, the model always takes the single top word, which makes it predictable: the same prompt should give the same answer every time.

Turn it high and the model starts reaching for the lower-scoring words too. The writing gets more varied and surprising, sometimes genuinely creative, and sometimes it wanders off a cliff.
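
For readers who want to see the dial at work, here is a minimal sketch in Python. It assumes the common approach of dividing each score by the temperature before turning the scores into probabilities (a softmax); the text above does not say which method a given model uses. The three words and their scores are made up for illustration, and zero is handled as a special case because nothing can be divided by zero.

```python
import math

def temperature_probs(scores, temperature):
    """Turn raw next-word scores into probabilities at a given temperature."""
    if temperature == 0:
        # Special case: all of the probability goes to the top-scoring word.
        top = max(range(len(scores)), key=lambda i: scores[i])
        return [1.0 if i == top else 0.0 for i in range(len(scores))]
    # Dividing by a small temperature stretches the gaps between scores;
    # dividing by a large one shrinks them.
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtracted only to keep exp() numerically stable
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
words = ["blue", "grey", "falling"]
scores = [2.0, 1.0, 0.1]

for t in (0, 0.2, 1.0, 2.0):
    probs = temperature_probs(scores, t)
    print(t, dict(zip(words, (round(p, 2) for p in probs))))
```

Running it shows the top word's share of the probability shrinking as the temperature rises, while the lower-scoring words gain ground.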

This one dial explains a common confusion. Ask a model the same question twice and you can get two different answers. The model did not change its mind. It is sampling from a spread of possibilities, and a warm temperature let it land somewhere different the second time. It also explains why a creative writing prompt can feel alive while a factual lookup should be kept cold for reliability. The next slide lets you hold the dial yourself.
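
The two-different-answers effect can also be reproduced in a few lines. This is a sketch under the same assumption as before, that scores are divided by the temperature before sampling; `sample_word`, the word list, and the scores are hypothetical names and values chosen for illustration.

```python
import math
import random

def sample_word(words, scores, temperature, rng):
    """Pick one next word, gambling more as the temperature rises."""
    if temperature == 0:
        # Cold as it gets: always the single top-scoring word.
        return words[max(range(len(scores)), key=lambda i: scores[i])]
    top = max(scores)
    # Unnormalised weights; random.choices only needs relative sizes.
    weights = [math.exp((s - top) / temperature) for s in scores]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical scores for three candidate next words.
words = ["blue", "grey", "falling"]
scores = [2.0, 1.0, 0.1]

rng = random.Random()  # unseeded, so each run can differ
print("cold:", [sample_word(words, scores, 0, rng) for _ in range(5)])
print("warm:", [sample_word(words, scores, 1.5, rng) for _ in range(5)])
```

The cold line repeats the same word five times. The warm line usually mixes in the other words, and it can change from one run to the next, which is the same thing that happens when a question is asked twice.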
