What the model sees

Picture a whiteboard of fixed size. Before the model writes a single word, everything it is allowed to know has to be written on that board. There is nothing in its head to fall back on, no notebook in a drawer, no memory of yesterday. Just the board in front of it.

So what gets written there? Three things, stacked top to bottom.

At the top, the standing instructions the app supplies on its own behalf. You usually never see these. They are notes the company that built the tool pinned to the board before you arrived: "be helpful, be concise, you are a coding assistant, refuse these kinds of requests." The field calls this the system prompt. It is just text on the board, sitting above everything you say.

In the middle, the conversation so far: every message you have sent and every reply the model has given, copied out in full. This is the bulk of the board in a long chat.

At the bottom, whatever you just added: your newest message, and any document or image you handed over with it.

The model reads the whole board, top to bottom, in one pass, and writes what plausibly comes next. That is the complete picture of what it knows in the moment. This board has a name: the context window.
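
To make the stacking concrete, here is a minimal sketch of how an app might assemble the board before handing it to the model. Everything in it is an assumption made for illustration: the function name build_board, the bracketed labels, and the sample messages are hypothetical and do not match any particular product's format.

```python
# Hypothetical sketch: the names and the bracketed labels are illustrative,
# not the format of any real chat application.

def build_board(system_prompt, history, new_message):
    """Stack the three layers, top to bottom, into one block of text."""
    lines = [f"[system] {system_prompt}"]      # top: standing instructions
    for speaker, text in history:              # middle: the conversation so far
        lines.append(f"[{speaker}] {text}")
    lines.append(f"[user] {new_message}")      # bottom: whatever was just added
    return "\n".join(lines)


board = build_board(
    system_prompt="You are a concise coding assistant.",
    history=[
        ("user", "What is a list?"),
        ("assistant", "An ordered collection."),
    ],
    new_message="And a tuple?",
)
print(board)
```

The point of the sketch is that the result is one piece of text. The model does not receive the layers separately; it reads the assembled board from the first line to the last.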

One thing to carry forward from the previous module: the board is not measured in words but in tokens, those chunks of text from the tokenizer. A big window holds many pages, sometimes a short book's worth. But big or small, it has an edge. And the edge is where things get interesting.
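
As a rough illustration of measuring the board in tokens rather than words, the sketch below counts tokens and checks them against a fixed limit. It rests on loud assumptions: rough_tokens is a crude stand-in that splits text into words and punctuation marks, which is not how a real tokenizer works, and the window size of 12 is a made-up number chosen only to keep the example small.

```python
import re

# Hypothetical, deliberately tiny window so the edge is easy to hit.
WINDOW_SIZE = 12


def rough_tokens(text):
    """Crude stand-in for a tokenizer: one token per word or punctuation mark.

    Real tokenizers split text differently; this only illustrates counting.
    """
    return re.findall(r"\w+|[^\w\s]", text)


def fits(text, window_size=WINDOW_SIZE):
    """Return True if the text's token count is within the window."""
    return len(rough_tokens(text)) <= window_size


short_text = "Be concise."
long_text = "Be concise. " * 10

print(len(rough_tokens(short_text)), fits(short_text))
print(len(rough_tokens(long_text)), fits(long_text))
```

Under this stand-in, the short text counts as three tokens and fits, while the repeated text counts as thirty and does not. A real window is far larger, but the check has the same shape: count the tokens, compare against the limit.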
