Token by token. No plan. No revision. The words are the thought.
Understanding this changes how you read a model's output. Confident prose isn't evidence of certainty. Long responses aren't evidence of careful deliberation. The model is doing one thing: picking what comes next, over and over.
What determines those picks — what the model actually "knows" — is the subject of the next chapter.