Sixteen years in cloud automation: what AI is changing now
Sixteen years in: vCAC, vRA, AWS, OpenStack, Kubernetes, Terraform, now AI agents. The honest read on what persists across the eras (idempotency, declarative state, policy-as-code, observability) and what's genuinely different about agents.
Sixteen years. That's the number I had to actually sit down and count, because the eras blur together when you're inside them. vCAC. vRA. The AWS infrastructure-as-code build-outs. The OpenStack years that I'd rather not revisit but learned more from than from anything that worked smoothly. Kubernetes. Terraform. And now AI agents, which is the era I'm in and which is the reason I'm writing this piece.
The question I keep coming back to, mostly in conversations with people who knew me through one of the earlier eras, is some version of: how much of this is new? Are AI agents actually a different thing, or is this the same automation problem with a fancier interface? The answer I've been working toward, in pieces, across a lot of essays over the last two years, is the one I want to put in one place now.
A lot of it is not new. Some of it genuinely is. The trick is knowing which is which.
What persists
Start with the things that survived every era. Not because the technology was the same (the technology was completely different across vCAC and OpenStack and Kubernetes) but because the problems they were solving were the same problems and the principles for solving them well kept turning out to be the same principles.
Idempotency. The single most load-bearing concept in automation, from the first vCAC blueprint I ever wrote to the agent workflows I'm building today. The operation has to be safe to run twice. It has to converge to the same end state regardless of where it started. You can violate this principle and ship things that work, but you cannot violate this principle and ship things that survive contact with the failure modes that automation exists to handle. Every era taught the same lesson here. Every era had people who learned it the hard way.
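The converge-to-the-same-end-state shape is worth seeing in code. This is a minimal sketch, not from any product; the in-memory `cloud` dict and the `ensure_volume` name are stand-ins for a real provider API.

```python
# A minimal sketch of an idempotent operation: it converges to the desired
# end state no matter how many times it runs or where it started.
# The in-memory "cloud" dict stands in for a real provider API.

cloud = {}  # resource name -> current configuration

def ensure_volume(name, size_gb):
    """Create-or-update a volume so the end state is always the same."""
    current = cloud.get(name)
    if current == {"size_gb": size_gb}:
        return "unchanged"          # already converged: safe no-op
    cloud[name] = {"size_gb": size_gb}
    return "created" if current is None else "updated"

# Running twice is safe: the second call is a no-op, not a duplicate.
print(ensure_volume("data-01", 100))  # created
print(ensure_volume("data-01", 100))  # unchanged
```

The point is the `ensure` verb instead of `create`: the function describes an end state, so retrying after a partial failure is safe instead of catastrophic.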
Declarative state. Tell the system what you want, not how to get there. vCAC blueprints, CloudFormation templates, Helm charts, Terraform modules, different syntaxes, same idea. The decade-long shift from imperative scripts to declarative specifications is one of the most important things that happened in infrastructure, and it happened because declarative state made everything else tractable: drift detection, rollback, diffing, review. You can't run a real engineering org on imperative automation. You can pretend to, for a while.
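The "made everything else tractable" claim deserves a concrete shape. Here is a toy diff over a desired-state spec; resource names and the spec format are invented for illustration, but the mechanism is the one every declarative tool shares.

```python
# A toy sketch of why declarative state makes everything else tractable:
# with a desired-state spec you get diffing (and therefore drift detection,
# plan review, and rollback) almost for free. Names are illustrative.

desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
actual  = {"web": {"replicas": 2}, "worker": {"replicas": 5}}

def diff(desired, actual):
    """Compute the plan that converges actual onto desired."""
    plan = []
    for name, spec in desired.items():
        if name not in actual:
            plan.append(("create", name, spec))
        elif actual[name] != spec:
            plan.append(("update", name, spec))
    for name in actual:
        if name not in desired:
            plan.append(("delete", name, None))
    return plan

for action in diff(desired, actual):
    print(action)
```

An imperative script hides this plan inside its control flow; the declarative version produces it as data you can review before applying.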
Policy-as-code. This one took longer to crystallize but it's been there the whole time. From the vCAC approval workflows that nobody called policy-as-code but functioned as policy-as-code, through OPA and Sentinel and Kyverno, the principle is: the rules that govern what automation is allowed to do should themselves be code, version-controlled, reviewed, auditable. The alternative is rules-in-someone's-head, which doesn't survive turnover and doesn't survive scale.
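The principle fits in a dozen lines. This sketch is loosely in the spirit of OPA-style admission checks; the request shape and the `POLICY` fields are invented for illustration.

```python
# A minimal policy-as-code sketch: the rule lives in version-controlled
# code, not in someone's head. Field names are illustrative.

POLICY = {
    "allowed_regions": {"us-east-1", "eu-west-1"},
    "max_size_gb": 500,
}

def evaluate(request):
    """Return a list of violations; an empty list means the request is allowed."""
    violations = []
    if request["region"] not in POLICY["allowed_regions"]:
        violations.append(f"region {request['region']} not allowed")
    if request["size_gb"] > POLICY["max_size_gb"]:
        violations.append(f"size {request['size_gb']}GB exceeds limit")
    return violations

print(evaluate({"region": "us-east-1", "size_gb": 100}))   # []
print(evaluate({"region": "ap-south-1", "size_gb": 900}))  # two violations
```

Because the rule is code, it gets everything code gets: a diff when it changes, a reviewer who approves the change, and a history of who loosened what and when.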
Observability. The least glamorous of the four, and the one most consistently underinvested in across every era. You cannot operate what you cannot see. You cannot improve what you cannot measure. Every era I worked in had teams that nailed observability and teams that didn't, and the teams that didn't always paid for it in incidents and in the slow, grinding kind of debt that nobody books on the balance sheet but everyone feels.
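The cheapest version of "you cannot operate what you cannot see" is one structured event per automation step. A sketch, with invented field names:

```python
import json
import time

# A sketch of minimum-viable observability: every automation step emits
# one structured event that a log pipeline can index and alert on.
# Field names are illustrative.

events = []

def emit(step, status, **fields):
    """Record a structured event for one automation step."""
    event = {"ts": time.time(), "step": step, "status": status, **fields}
    events.append(event)
    print(json.dumps(event, sort_keys=True))

emit("provision_volume", "ok", resource="data-01", duration_ms=420)
emit("attach_volume", "error", resource="data-01", reason="timeout")
```

Unstructured log lines answer "what happened?" only to a human reading them; structured events answer it to a query, which is what incident response actually needs at 3 a.m.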
Those four. Idempotency, declarative state, policy-as-code, observability. If you understand those, you can drop into any of the eras I've worked in and be useful inside a week. They're the foundation.
What changed at the surface
Each era had a surface story that felt revolutionary at the time and looks, in retrospect, like a delivery mechanism for the same underlying ideas.
vCAC and vRA were the "self-service IT" story: give developers a portal, let them request infrastructure, automate the back-end fulfillment. It was a real change for the orgs that adopted it. The principles underneath were idempotency and declarative blueprints and approval-policy-as-code. The portal was the interface; the discipline was the substance. I wrote about this in vCAC was actually preparing me for this and I'll keep coming back to it because the parallel to agents is exact.
AWS infrastructure automation took the same principles and dropped them into a much larger primitive set. The API surface was enormous compared to a private cloud. The cost model was new. The economics of "infrastructure as a line item that scales with usage" was new. The discipline was the same.
OpenStack was the era I learned the most painful lessons in. The promise was open-source private cloud parity with AWS. The reality was a multi-component system where the integration surface ate most of the value. The lesson I took from OpenStack (and I think a lot of people my age took the same lesson) is that the foundation matters more than the surface, and a beautiful API on top of an unreliable foundation is a worse system than a clunky API on top of a reliable one. That lesson is going to matter for agents, and I'll come back to it.
Kubernetes and Terraform define the era I've spent the most time in. Both of them are, at their core, declarative-state engines with strong opinions about reconciliation. The Kubernetes control loop is the cleanest expression of "declarative state plus continuous reconciliation" that the industry has produced. Terraform is the cleanest expression of "declarative state plus plan-then-apply." They're different tools for different problem shapes, and the principles underneath are the same principles that vCAC was reaching for in 2010.
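The control-loop idea fits in miniature. This is a toy in-memory model of observe-compare-act, not the real controller machinery, and the single-step scaling is a simplification.

```python
# The Kubernetes control-loop idea in miniature: observe actual state,
# compare it to desired state, act, repeat until converged.
# Toy in-memory model; real controllers watch an API server.

def reconcile(desired, actual):
    """One reconciliation pass; returns the actions taken."""
    actions = []
    for name, replicas in desired.items():
        have = actual.get(name, 0)
        if have < replicas:
            actual[name] = have + 1          # scale up one step per pass
            actions.append(f"scale-up {name}")
        elif have > replicas:
            actual[name] = have - 1
            actions.append(f"scale-down {name}")
    return actions

desired, actual = {"web": 3}, {"web": 0}
while reconcile(desired, actual):
    pass                                     # loop until a pass is a no-op
print(actual)  # {'web': 3}
```

Terraform's plan-then-apply is the same comparison run once, with a human review between the diff and the action; the control loop is the same comparison run forever.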
What's genuinely new with AI
Now the actual question. What does agent-based automation change?
A few things. Let me try to be honest about which ones are real and which ones are surface noise.
The first real change is that the automation can now operate over unstructured inputs. Every prior era of automation required the input to be structured. A vCAC blueprint took a structured request. A Terraform module took typed variables. The system could be wired up to a service catalog that mapped a user's intent into structured fields, but the automation itself ran on structure. Agents change that. An agent can take a free-text incident description and turn it into a structured remediation. It can take a customer email and turn it into a workflow. This is genuinely new, and it widens the surface of what automation can absorb by an order of magnitude. I wrote about the implications in what cloud automation taught me about agents.
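The shape of that transformation, free text in, structured request out, can be sketched. A real agent would use a language model for this step; the regex-based parser below is only a placeholder showing the shape, and every field name is invented.

```python
import re

# An illustrative stand-in for the "unstructured in, structured out" step.
# A real agent would use a language model here; this regex parser is a
# placeholder that only shows the shape of the transformation.

def parse_incident(text):
    """Turn a free-text incident description into a structured request."""
    service = re.search(r"\b(\w[\w-]*) (?:service|pod|node)\b", text)
    action = "restart" if "restart" in text.lower() else "investigate"
    return {
        "service": service.group(1) if service else None,
        "action": action,
        "raw": text,
    }

ticket = "The checkout service keeps OOMing, please restart it"
print(parse_incident(ticket))
```

The structured output is the part the old discipline already knows how to handle: it can be validated, checked against policy, and fed to idempotent operations. The new part is only the front half.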
The second real change is that the automation can carry context across steps in a way that prior automation could not. A Terraform run is stateless within the run. A Kubernetes controller is stateless within the reconciliation. Agents maintain working context (they can reason about what they just did, why it didn't work, what to try next) in a way that previous automation could only simulate by being explicitly programmed to. The reasoning is real, when it works. It's also fragile and non-deterministic and harder to test, which is the cost.
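The context-carrying loop looks something like this. The `flaky_deploy` step and its retry logic are contrived stand-ins for model reasoning; the structural point is only that the loop's memory of past attempts feeds the next decision.

```python
# A sketch of context-carrying execution: unlike a stateless run, the loop
# remembers what it tried and why it failed, and uses that to pick the
# next move. The step logic is a contrived stand-in for model reasoning.

def run_with_context(steps, max_tries=5):
    context = []                       # working memory across attempts
    for step in steps:
        for attempt in range(max_tries):
            ok, note = step(context)
            context.append({"step": step.__name__, "ok": ok, "note": note})
            if ok:
                break
    return context

def flaky_deploy(context):
    # Succeeds only after it has "seen" its own earlier failure in context.
    failed_before = any(c["step"] == "flaky_deploy" and not c["ok"]
                        for c in context)
    if failed_before:
        return True, "retried with larger timeout"
    return False, "timed out"

history = run_with_context([flaky_deploy])
print([c["note"] for c in history])
```

The cost shows up right here too: the behavior depends on accumulated context, so the same step can do different things on different runs, which is exactly what makes agent workflows harder to test than a Terraform plan.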
The third real change is the labor model. This is the one that gets the most attention and that I've written about most carefully, because the framing matters. AI-driven automation is doing work that previously required humans. Not all of it. Not as fast as the breathless coverage suggests, in some categories. Faster than the dismissers want to admit, in others. I'm fine with the automation of IT systems work; that's been my career, and the displacement of work that should have been automated decades ago is overdue, not threatening. The lines I'd draw, and that I've written about at length, are creatives and the cognitive process as IP. The point for this essay is just: the labor model is the part that's actually different, and it's different because the surface of what automation can do is wider, not because the underlying principles changed.
That's what's real. Now the things that aren't:
The principles haven't changed. Idempotency, declarative state, policy-as-code, observability, every one of those applies to agents, and the teams that try to skip them because "the agent will figure it out" are walking into the same failure modes that vCAC teams walked into in 2010 and OpenStack teams walked into in 2014. The agent that retries a non-idempotent operation is the same problem as the vRA workflow that retries a non-idempotent operation. The agent that has no observability is the same problem as the Kubernetes cluster that has no observability. The fact that the foundation is a language model instead of a state machine does not change the principles. It changes some of the details, which I'll write about in another piece.
The need for human judgment in the loop hasn't gone away. I covered the framework for this in an AI workforce and the operational implications in treating an AI like an employee. The honest version: agents are powerful, agents are useful, agents need supervision, and the supervision model that actually works looks more like managing a junior engineer than like deploying a service. That's not a regression. It's the reality of where the technology is.
Audit doesn't get easier. If anything it gets harder, which I wrote about in an auditor walks into an AI shop. The trail of "what did the system do and why" is messier when part of the system is a model. The discipline of producing a clean audit trail has to be designed in, exactly the way it had to be designed in for every prior era.
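"Designed in" can be concrete. One pattern is an append-only trail where every action records what, why, and a hash linking it to the previous entry; the fields here are invented for illustration, and a real system would persist this outside the agent's reach.

```python
import hashlib
import json

# A sketch of designing the audit trail in: every agent action is recorded
# with what it did, why, and a hash chaining it to the previous entry, so
# gaps and tampering are detectable. Field names are illustrative.

trail = []

def record(actor, action, reason):
    """Append one hash-chained audit entry."""
    prev = trail[-1]["hash"] if trail else "genesis"
    entry = {"actor": actor, "action": action, "reason": reason, "prev": prev}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    trail.append(entry)
    return entry

record("agent-7", "restart checkout", "OOM loop detected in logs")
record("human:sam", "approve", "matches runbook RB-12")
print(len(trail), trail[1]["prev"] == trail[0]["hash"])
```

The hard part with agents is the `reason` field: getting the model to state why it acted, in a form an auditor can check against the action, is the new discipline on top of the old one.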
The synthesis
Here's the honest read after sixteen years.
The thing that's most consistent across the eras is that people who internalized the principles outperformed people who learned the tools. vCAC engineers who understood declarative state and idempotency moved cleanly into AWS and then into Kubernetes. People who learned vCAC as a product, without the principles, struggled in every transition. The same pattern is replaying now with agents. The engineers who treat agent orchestration as an extension of the automation discipline they already have (same principles, new foundation) are shipping things that work. The engineers who treat it as a new field with new rules are rediscovering, slowly and painfully, the principles that the field already knew.
What's new is real and worth taking seriously. The unstructured input surface, the context-carrying execution, the labor-model implications, those are genuinely different. They deserve careful thought, and I'll keep writing about them.
What's not new is also real, and the part of this essay I most want to land. The principles persist. They persist because they're principles. The foundation changes; the foundation keeps changing; the discipline that keeps the foundation honest is the same discipline it's been the whole time.
Sixteen years in. That's the read. The tools change. The job doesn't.