Hallucinated files: a debugging chronicle
Two hours of chasing a bug that wasn't where the agent said it was, in a file that the agent confidently described and that didn't exist. A close reading of one of the more useful failures of the year.
OPA and Rego had a quiet decade as the policy-as-code layer for Kubernetes and IaC pipelines. The AI agent wave is making them load-bearing in a way they weren't before. The renaissance is real, and the reasons are structural.
The IDE-agent products that have stuck for me through a year of daily use share one thing: a strong plan-then-execute workflow. The tools without it produce flashier demos and worse outcomes. Worth being explicit about why this is the actual product.
Conversation history with your AI tools is a piece of infrastructure now. Treating it like ephemeral chat is fine for the casual case; treating it that way for daily work means you lose context, repeat yourself, and slowly poison your own retrieval.
The infrastructure-as-code review pipeline is one of the highest-leverage places to deploy AI in an engineering org, and almost nobody is doing it well. The reasons it's underused are mostly accidental rather than principled.
Every platform team buries the business in an eighty-nine-line YAML file and calls it self-service. That isn't self-service; it's a configuration handoff. Decisions as Code is the discipline of extracting the five decisions that actually matter and letting the platform absorb the rest.
Twelve months of running agentic coding tools daily, and the failure mode that keeps repeating isn't the dramatic one. It's the quiet one: the agent doing exactly the thing it was asked, to a result nobody wanted.
Eleven seconds, very thorough, immaculate audit trail. A walk through what one Tuesday afternoon in my homelab teaches you about working with agentic coding tools, and the workflow changes that follow.
The job of crafting clever prompts to coax better answers from a frontier model is mostly over. The job of designing how prompts compose into systems is just beginning.
Six months ago MCP was Anthropic's protocol nobody had implemented. Now it's a category every major vendor ships against. The question nobody is asking is what that does to the protocol itself.
The framing has become a cliché. The actual mechanics (onboarding, scope of work, performance review, termination) change in interesting ways once you take the metaphor literally.
The workflow-orchestration tools of the early 2010s tried to solve a problem the LLM agents of the mid-2020s are now taking another swing at. The problem is the same. The substrate is different. Worth being honest about which parts of the previous attempt to bring forward and which to leave behind.