The "long-running agent" problem nobody is solving
Agents that run for ten minutes mostly work. Agents that run for ten hours mostly don't. The range in between is where the actual production work lives, and the patterns we have for it are weaker than the conversation around agents pretends.