AI governance frameworks that don't make engineers quit

Most AI governance frameworks I've seen are written by people who don't ship code, for people who don't ship code, and applied to people who do. The result is friction, attrition, and shadow AI. The frameworks that actually work share a few specific properties.

Over the past couple of years, a recurring shape has shown up in the AI governance frameworks visible in public reporting and in conversations with people who actually use this stuff. They're long. They're written in management-consulting voice. They have RACI matrices and tier-1 / tier-2 / tier-3 review processes. They require approval workflows for things that should be self-service. They get socialized in all-hands meetings, included in onboarding, and then quietly ignored because they make daily work impossibly slow.

The shadow AI use that follows is predictable. Engineers who can't get a model approved through the official process use ChatGPT with their personal account against company data anyway. The framework's existence makes the org less safe rather than more, because the official surface is unusable and the unofficial surface has no oversight at all.

Governance in three tiers, not thirty checkpoints:

Tier 0, zero friction: logged, allowed by default. Low risk; most daily AI use.
Tier 1, lightweight review: policy gate, async approval. Moderate risk; agents touching prod data.
Tier 2, full review: human-in-the-loop, audit trail. High risk; irreversible or regulated actions.

Most AI use is Tier 0. The framework only earns its weight on the rare actions that need Tier 2.
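
The tiering is mechanical enough to live in code. A minimal sketch, assuming a hypothetical AIAction descriptor with three illustrative risk attributes; a real classifier would key off whatever risk signals the org actually tracks:

```python
from dataclasses import dataclass

@dataclass
class AIAction:
    """Hypothetical descriptor for one AI call or agent step."""
    touches_prod_data: bool
    irreversible: bool
    regulated: bool

def classify_tier(action: AIAction) -> int:
    """Escalate on the riskiest attribute present."""
    if action.irreversible or action.regulated:
        return 2  # full review: human-in-the-loop, audit trail
    if action.touches_prod_data:
        return 1  # lightweight review: policy gate, async approval
    return 0      # zero friction: logged, allowed by default
```

The point of writing it this way: the default branch is Tier 0, and escalation has to be earned by a named risk attribute.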

Here's the part worth being plain about: why most frameworks fail this way, and what properties the ones that work share. The governance gap I keep writing about doesn't get closed by writing more elaborate frameworks; it gets closed by writing better-designed ones.

Why most frameworks fail

Three patterns I see repeatedly.

They're written by people who don't use AI tools daily. The framework gets drafted by legal, compliance, and risk teams who are smart and well-intentioned and don't know what daily AI work actually looks like. They write rules for a use case they've imagined; the rules don't match the use case engineers actually have. The mismatch produces friction the framework didn't anticipate.

They optimize for "no incidents" rather than "good outcomes." The framework is designed so that nothing bad ever happens. That goal is unachievable; chasing it produces a framework where nothing happens. Engineers route around the framework to ship work; the framework's existence becomes a tax on the work-arounds rather than a guide for the work.

They're heavy on review, light on automation. The official process requires a human-in-the-loop for too many things. The reviewers become bottlenecks. The queue grows. The work-around grows in proportion. Eventually the official process is for the few things that can't be done unofficially, and the official process can't enforce its rules on the unofficial work.

The result is the worst of both worlds: a framework that doesn't reduce risk and does reduce velocity. The engineers who care about doing the right thing get worn down by the friction; the engineers who don't care work around it. Net safety: lower than no framework.

What the ones that work share

A few properties that hold across the AI governance approaches that work in mature engineering shops.

They're written with engineering input. Not "we showed it to engineering and got their feedback." Written together. The constraints engineering knows about (latency budgets, cost budgets, what models actually fail at, which workflows need certain capabilities) shape the framework rather than getting handled as exceptions to a framework written without that knowledge.

They're scoped to actual risk, not theoretical risk. The framework distinguishes high-risk uses (patient data going to an AI surface, code deployed to production based on AI output, customer-facing AI making consequential decisions) from routine ones (an engineer using AI to draft an internal email, AI summarizing a meeting transcript, AI helping debug a non-prod issue). The high-risk cases get real review. The routine cases get a lightweight policy check and otherwise get out of the way.

They lean heavily on automation. Where a policy can be enforced by code rather than by review, it is. The OPA-and-Rego pattern shows up here as the basis for "policy that runs at the gateway" rather than "policy that runs at the human reviewer." Automation handles the consistent cases; humans handle the edge cases.
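
A minimal sketch of what "policy that runs at the gateway" can mean, written in Python for readability rather than Rego; the data classes and model names are illustrative, and in a real deployment the same rule would live in Rego and be evaluated by OPA at the gateway:

```python
# Deny-by-default: a (data class, model) pair is allowed only if the
# policy table explicitly permits it. Names are illustrative.
ALLOWED_MODELS = {
    "public":   {"gpt-4o", "claude-sonnet", "local-llm"},
    "internal": {"claude-sonnet", "local-llm"},
    "phi":      {"local-llm"},  # regulated data stays on the audited surface
}

def allow_request(data_class: str, model: str) -> bool:
    return model in ALLOWED_MODELS.get(data_class, set())

assert allow_request("internal", "local-llm")
assert not allow_request("phi", "gpt-4o")  # denied in code, no reviewer queue
```

Neither decision touches a human; the consistent cases resolve instantly, and only requests the table can't answer escalate to a reviewer.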

They make the safe path the easy path. When the official AI surface is at least as fast and convenient as the unofficial one, engineers use it. When it's slower or harder, they don't. The framework that wins designs the official surface to be the obvious default rather than expecting engineers to choose friction over velocity.

They define escalation paths clearly. When an engineer has a use case the framework doesn't cover, there's an obvious next step: file a request, get a quick triage, get a path forward in days rather than months. The framework doesn't try to anticipate every case; it provides a process for handling the unanticipated ones quickly.

They get revised. The first version of any AI governance framework is wrong. The framework that survives is the one that gets revised as the workloads evolve. Frameworks that are written once and frozen become legacy artifacts that nobody follows.

The minimum viable version

The shape of an AI governance framework that's lightweight enough to actually function.

A short policy document, five to ten pages, written in plain language. Covers what AI use is approved by default, what requires lightweight review, what requires real review, and what's never allowed. Examples for each category.

A list of approved tools: which AI surfaces, models, and integrations engineers can use without further approval. Updated quarterly. The list is generous enough that most use cases fit on it.

A request process for the long tail: when an engineer has a use case that doesn't fit the approved list, there's a request form, a triage SLA (target five business days), and a clear path to either getting it approved or getting an explanation why not.

An automated policy gate at the AI surface that handles the consistent enforcement: what data classes can go to which models, what tools are scoped per agent, what audit trails get captured. The gate enforces the policy without requiring per-call human review.
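
A sketch of what that gate can look like as a single enforcement point, under the same kind of illustrative names as the earlier policy sketch; the audit record here just prints where a real gate would append to durable storage:

```python
import json, time

POLICY = {
    # data class -> models that may receive it (illustrative names)
    "models_for": {"public": {"gpt-4o"}, "internal": {"local-llm"}},
    # agent id -> tools it may invoke (illustrative names)
    "tools_for": {"deploy-bot": {"read_logs", "open_pr"}},
}

def gate(agent: str, tool: str, data_class: str, model: str) -> bool:
    """Single enforcement point: policy check plus audit record,
    with no per-call human review."""
    allowed = (tool in POLICY["tools_for"].get(agent, set())
               and model in POLICY["models_for"].get(data_class, set()))
    record = {"ts": time.time(), "agent": agent, "tool": tool,
              "data_class": data_class, "model": model, "allowed": allowed}
    print(json.dumps(record))  # in practice: append to a durable audit log
    return allowed

gate("deploy-bot", "open_pr", "internal", "local-llm")     # allowed, logged
gate("deploy-bot", "drop_table", "internal", "local-llm")  # denied, logged
```

Every decision, allowed or denied, produces an audit record; that's what makes the quarterly review against actual usage data possible.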

A quarterly framework review: the framework gets revisited against actual usage data. Things that aren't being used get retired; new categories get added; the policy stays current with what engineering is actually doing.

That's the substance. Most frameworks have far more than this and accomplish less.

The specific things that don't belong

A few categories of policy that don't deserve to be in a working AI governance framework.

Mandatory-tool restrictions for personal-productivity AI. "Engineers may only use AI tool X for productivity purposes" produces shadow use of tool Y. If the use is genuinely productivity-only and the data isn't sensitive, let people pick their own. Save the central control for the cases where it matters.

Approval requirements for individual prompts. "Get every prompt reviewed before sending it to a model" is unworkable at scale. The right control is at the prompt-architecture layer (which prompts are deployed in production systems and what they do), not at the per-query layer.

Blanket prohibitions on shipping AI features. Frameworks that say "all AI features need executive approval" produce slow product cycles and shadow integration. The right control is on what data flows where, not on whether the feature exists.

Detailed prescriptions for which specific model to use. The model space changes every quarter. A framework that hardcodes "use GPT-4 for everything" is out of date in three months. The right control is on capability and risk class; specific model selection is an engineering decision.
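
A sketch of the alternative control, with illustrative pool names: the framework pins a model pool per risk class, and which model to use within the pool stays an engineering decision that can change as fast as the model space does:

```python
# Illustrative: the framework constrains the pool, not the pick.
MODEL_POOLS = {
    "low":      None,  # unrestricted: model choice is entirely engineering's
    "moderate": {"vendor-a-large", "vendor-b-large"},
    "high":     {"self-hosted-audited"},
}

def model_allowed(risk_class: str, model: str) -> bool:
    pool = MODEL_POOLS.get(risk_class, set())  # unknown class: deny by default
    return pool is None or model in pool
```

Swapping a model in or out is a one-line pool change, not a framework revision.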

The treat-the-AI-like-an-employee parallel

The treat-the-AI-like-an-employee discipline provides a useful framing for governance: the same kinds of management practices that work for human employees usually work for AI agents. You don't require pre-approval for every email a junior engineer sends. You do require structured onboarding, clear scope, audit trails, and a process for handling exceptions. Apply the same shape to AI use and most of the framework writes itself.

What that gets you: engineers who feel trusted to use the tools well within a defined scope, with consequences for going outside the scope, and a process for expanding the scope when warranted. The friction is calibrated to the actual risk.

What this is in service of

The goal isn't to produce zero-incident AI use. The goal is to produce AI use that's good enough that the org gets the productivity benefit and bounded enough that the risks are manageable. That's a different goal from "no incidents ever," which is the goal most frameworks are accidentally written to achieve.

The frameworks that don't make engineers quit are the ones that recognize this. They optimize for sustainable velocity within bounded risk rather than for theoretical zero-risk at any velocity cost. They get used because they're useful. They reduce risk because they're applied rather than evaded.

Worth being plain about the design properties because the next two years will see a lot more AI governance frameworks written, and most of them will be the kind that produce shadow AI rather than the kind that channel AI use safely. The shape of the ones that work is gettable; it just requires someone to write them with the people who actually use this stuff rather than at them.