Module X

Agents

Chapter II

What Agents Can and Can't Do

Two words that sound like they mean the same thing but do not: capable and reliable. An agent can be wildly capable and, at the same time, not safe to run unattended. Holding both of those in your head at once is the whole skill of this chapter.

On the capable side, an agent can research a question across dozens of sources, write code and run it, sort and rename a pile of files, and stitch several systems together, collapsing hours of careful human clicking into minutes.

On the unreliable side, it inherits every weakness of the model underneath (the confident wrong answers, the missing sense of its own uncertainty) and then adds a brand new one we just met: in a loop, mistakes do not stay small. A slip at step three, built on in good faith, is a serious mess by step seven. And the model has no instinct to pause and say "wait, let me check that."
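To put a rough number on how mistakes grow in a loop, here is a small sketch in Python. It rests on a deliberately simple assumption that is ours, not a measured fact: every step succeeds with the same probability, independently of the others, and a single bad step spoils the whole run. The 95% figure and the name chain_success are made up for illustration.

```python
def chain_success(per_step: float, steps: int) -> float:
    """Chance that every step in a run succeeds.

    Assumes each step succeeds independently with the same probability,
    and that nothing catches or repairs a mistake along the way.
    """
    if not 0.0 <= per_step <= 1.0:
        raise ValueError("per_step must be a probability between 0 and 1")
    if steps < 0:
        raise ValueError("steps must not be negative")
    return per_step ** steps


if __name__ == "__main__":
    # Hypothetical agent that gets each individual step right 95% of the time.
    for steps in (1, 3, 7, 20):
        print(f"{steps:>2} steps: {chain_success(0.95, steps):.0%} chance of a clean run")
```

Under these assumptions, an agent that is right 95% of the time per step finishes a clean seven-step run only about 70% of the time, and a twenty-step run a little over a third of the time. Real agents are messier than this, since errors are not independent and some get caught, but the direction of the effect is the point.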

The interesting questions all live in the gap between those two sides, between what an agent can do and what you can safely let it do alone.