One word at a time

Here is the entire mechanical description of how a language model writes. It is short enough to hold in your head.

It takes everything written so far as input. It produces a score for every word in its vocabulary, a measure of how likely each one is to come next. It picks one of them, usually favoring the higher-scoring words. It adds that word to the text. Then it starts over from the top, with the text now one word longer.
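The loop just described can be sketched in a few lines of Python. This is a toy illustration, not how any real model is built: the score table `TOY_SCORES`, the `<start>` and `<end>` markers, and the function names are all made up for this example, and the stand-in scorer looks only at the last word, whereas a real model reads everything written so far.

```python
import random

# Hypothetical toy "model": for each previous word, a score for each candidate next word.
# A real model computes these scores from the whole text with a neural network.
TOY_SCORES = {
    "<start>": {"the": 0.9, "a": 0.1},
    "the": {"cat": 0.6, "dog": 0.4},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"sat": 0.2, "ran": 0.8},
    "sat": {"<end>": 1.0},
    "ran": {"<end>": 1.0},
}


def score_next_words(text):
    """Stand-in for the model: return a score for each possible next word."""
    last_word = text[-1] if text else "<start>"
    return TOY_SCORES[last_word]


def pick(scores, rng):
    """Choose one word at random, weighted by its score."""
    words = list(scores)
    weights = [scores[word] for word in words]
    return rng.choices(words, weights=weights, k=1)[0]


def generate(rng, max_words=10):
    """Write one word at a time: score, pick, append, repeat."""
    text = []
    for _ in range(max_words):
        scores = score_next_words(text)  # 1. read the text so far, score every candidate
        word = pick(scores, rng)         # 2. pick one of them
        if word == "<end>":              # made-up stop marker for this toy
            break
        text.append(word)                # 3. commit: appended, never revised
    return text


print(" ".join(generate(random.Random(0))))
```

The only operation ever applied to `text` is `append`. Nothing in the loop inspects a future word or changes an earlier one.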

That is the whole engine. There is no planning step. No private draft. No glimpse of the finished sentence before the first word is chosen. The model commits to one word, and that word immediately becomes part of the context that shapes the next. The "word" here is really a token, the bite-sized chunk we met in the meaning-of-words module, but one word at a time is the right picture to carry.

Notice what this rules out. The model cannot look ahead and check that a sentence it is about to start will end well. It cannot reach back and revise a word once written. It only ever moves forward, one committed step at a time, like someone laying a path plank by plank with no way to pick a plank back up.

That single constraint, always forward, never back, explains a surprising amount of how these models behave, including a particular way they get themselves stuck.
