OPA Gatekeeper for AI Governance: The Decisions as Code Enforcement Layer

Decisions as Code specifies the standard. OPA Gatekeeper enforces it. Constraint templates as the policy definition layer. Required tags, model version pinning, namespace quotas, network policies for AI workloads. The 'specify, then verify' loop, finally complete.

Decisions as Code (DaC) is the methodology behind nearly every self-service and automation system I’ve designed: extract business decisions out of platform configuration into a small, curated layer, often five real decisions where the raw config exposed eighty-nine. The remaining choices get absorbed into templates and defaults the platform owns. (I called this Property Toolkit during my OneFuse days; the shape of the idea hasn’t changed, only the foundation.)

DaC has always been about specifying the decisions, defining them once, projecting them onto every consuming platform. The other half of the loop is verifying that the specifications are honored at runtime. The two halves are different activities, different tools, different points in the lifecycle.

This piece is about the second half. OPA Gatekeeper is the enforcement layer for DaC in the Kubernetes era. The K8s ecosystem has matured into a clean two-layer pattern: Helm values (or whatever centralized-decisions mechanism you prefer) as the specification layer, OPA Gatekeeper as the admission-time enforcement layer. The loop DaC has always asked for is finally closed in a single foundation.

This is the imprint version of that argument. DaC is the specification half. OPA Gatekeeper is the verification half. Both are necessary. Neither is sufficient alone. The 2025 version of the methodology lives at the intersection.

The two halves of the loop

The pattern is older than software. You define a decision surface; you enforce that the deployed reality matches it. Two distinct activities, two different tools, two different points in the lifecycle.

DaC has always covered the specification half cleanly. T-shirt sizing defined once, projected onto every consuming platform. Naming standards defined once, applied everywhere. Tagging conventions defined once, propagated through every workload. Nested composition so that changing the standard OS or sizing definition propagates to every dependent application.

What the specification side doesn’t do is enforce that the deployed object actually matches the decision at the runtime layer. If a developer hand-edits a deployment to skip the standard tags, the specification layer has no mechanism to catch it. The enforcement complement has to live somewhere else. In the older substrates, that somewhere else was an approval policy or a post-provisioning workflow, useful but heavy.

In the K8s era, the enforcement complement is OPA Gatekeeper. And the integration with the specification layer is much cleaner.

What OPA Gatekeeper actually is

OPA Gatekeeper is the K8s-native packaging of OPA. It runs as an admission controller: every API request that creates or modifies a K8s object is intercepted before the object is persisted, evaluated against a set of policies, and allowed or denied. Allow means the object lands. Deny means the API client gets an error explaining which constraint was violated.

The two CRDs that matter:

  • ConstraintTemplate. The policy definition: a Rego rule that takes inputs and emits violations. The template is parameterized; it doesn’t hardcode the values it checks against, it takes them as parameters from the Constraint that instantiates it.
  • Constraint. An instance of a ConstraintTemplate, with the parameters filled in. “Apply the required-labels template, with cost-center and owner as the required labels.” Constraints can be scoped: apply only to a namespace, only to certain Kinds, only to certain labels.

The pattern is the same shape DaC has always used: declare the policy class once, instantiate it per consumer with platform-correct parameters. ConstraintTemplate is the class; Constraint is the instance.
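
In concrete terms, the pair looks like this. This is close to the canonical required-labels example from the Gatekeeper docs, trimmed for space; the Rego block is the entire policy definition:

    apiVersion: templates.gatekeeper.sh/v1
    kind: ConstraintTemplate
    metadata:
      name: k8srequiredlabels
    spec:
      crd:
        spec:
          names:
            kind: K8sRequiredLabels
          validation:
            openAPIV3Schema:
              type: object
              properties:
                labels:
                  type: array
                  items:
                    type: string
      targets:
        - target: admission.k8s.gatekeeper.sh
          rego: |
            package k8srequiredlabels

            violation[{"msg": msg}] {
              required := input.parameters.labels[_]
              not input.review.object.metadata.labels[required]
              msg := sprintf("missing required label: %v", [required])
            }

And the instance, with the parameters filled in and the scope narrowed:

    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sRequiredLabels
    metadata:
      name: ai-required-labels
    spec:
      match:
        kinds:
          - apiGroups: ["apps"]
            kinds: ["Deployment"]
      parameters:
        labels: ["cost-center", "owner"]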

The four AI-specific Gatekeeper patterns

The patterns I keep recommending for AI workloads in 2025 (and the ones that show up in the homelab cluster on engine-01) are these four.

Required tags

Every AI workload needs a stable set of tags for cost attribution, ownership, classification, and the eval scorecard linkage. The ConstraintTemplate enforces that the labels exist; the Constraint declares which labels are required for which workload class.

The integration with the specification layer: the standard label list lives in the standards library chart (the Helm values article covered the chart shape). The Gatekeeper Constraint’s parameters.labels value is generated from the same standard list at deploy time. Add a required label in the standards chart; the Constraint picks it up; the next admission rejects deploys that skip it. One standard source. Two enforcement points (Helm schema validation at install, Gatekeeper at admission). Same decision surface.
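A sketch of that wiring, assuming the standards chart exposes the list under a standards.requiredLabels key (the key name is illustrative; the shape is what matters):

    # values.yaml in the standards library chart
    standards:
      requiredLabels:
        - cost-center
        - owner
        - data-classification

    # templates/constraint-required-labels.yaml, rendered at deploy time
    apiVersion: constraints.gatekeeper.sh/v1beta1
    kind: K8sRequiredLabels
    metadata:
      name: ai-required-labels
    spec:
      match:
        kinds:
          - apiGroups: ["apps"]
            kinds: ["Deployment"]
      parameters:
        labels: {{ toYaml .Values.standards.requiredLabels | nindent 6 }}

Change the list in one place and both gates move together.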

Model version pinning

The supply-chain discipline I made the case for in the Forgejo and Harbor article extends to model versions. A workload that pulls model.uri: harbor.lab/models/llama:latest is a workload that will silently change behavior the next time latest moves.

The Gatekeeper pattern: a ConstraintTemplate that checks the model.uri annotation on InferenceService objects (or whatever your serving CRD is) for a digest-pinned reference rather than a tag-pinned one. The Constraint applies cluster-wide. Deploys with :latest get rejected; deploys with @sha256:... get through. The integration with the specification layer is the same: the standards chart owns the convention; the Constraint enforces it.
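
A sketch of the Rego inside that ConstraintTemplate, assuming the serving CRD carries the reference in a model.uri annotation as described above (substitute whatever field your serving stack actually uses):

    package modelversionpinning

    # reject tag-pinned references; require a digest
    violation[{"msg": msg}] {
      uri := input.review.object.metadata.annotations["model.uri"]
      not contains(uri, "@sha256:")
      msg := sprintf("model reference must be digest-pinned, got %v", [uri])
    }

    # reject workloads that omit the reference entirely
    violation[{"msg": msg}] {
      not input.review.object.metadata.annotations["model.uri"]
      msg := "model.uri annotation is required"
    }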

Namespace quotas

GPU is expensive. The right-sized version of “GPU is expensive” is a namespace quota that says how many GPU-pods this team is allowed to consume at once. K8s ResourceQuota handles the basic case; Gatekeeper handles the cases ResourceQuota can’t, like “this team can run at most three concurrent fine-tune workflows” or “this namespace’s combined GPU memory can’t exceed N gigabytes.”

The pattern: a ConstraintTemplate that aggregates over the existing objects in the namespace (via Gatekeeper’s referential data feature, which syncs selected cluster objects into OPA), computes the would-be total if the new request landed, and rejects when the total exceeds the cap. Constraints per namespace declare the per-team caps. The caps themselves come from the standards chart. Same shape.
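
A sketch of the counting rule. Assumptions: Pods are replicated into Gatekeeper’s cache via its sync Config, GPU workloads request nvidia.com/gpu, and the cap arrives as a maxGpuPods parameter (the parameter name is illustrative):

    violation[{"msg": msg}] {
      ns := input.review.object.metadata.namespace
      # Pods already in this namespace that request a GPU,
      # read from Gatekeeper's replicated inventory
      existing := [name |
        pod := data.inventory.namespace[ns]["v1"]["Pod"][name]
        pod.spec.containers[_].resources.limits["nvidia.com/gpu"]
      ]
      # would-be total if this request landed
      count(existing) + 1 > input.parameters.maxGpuPods
      msg := sprintf("namespace %v would exceed its GPU pod cap (%v)", [ns, input.parameters.maxGpuPods])
    }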

Network policies for AI workloads

The observability piece and the supply-chain piece both touched on egress control. The Gatekeeper enforcement is the policy that says: every Pod in this AI namespace must have a NetworkPolicy attached that restricts egress to the approved set (the local model registry, the local vector DB, the metrics endpoint, the logging endpoint). Pods without such a policy get rejected at admission.

The integration: the standard egress rule list lives in the standards chart. The standards chart renders both the NetworkPolicy that the workload chart includes and the Gatekeeper Constraint parameters. Both come from the same source. The workload deploys with the policy attached; if someone tries to deploy without it, Gatekeeper catches it.
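A sketch of the rendered side, assuming the standards chart exposes the approved set under a standards.approvedEgress key (key names and addresses are illustrative):

    # values.yaml in the standards chart
    standards:
      approvedEgress:
        - name: model-registry
          cidr: 10.0.10.5/32
        - name: vector-db
          cidr: 10.0.10.6/32

    # templates/networkpolicy.yaml, included by every workload chart
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: ai-approved-egress
    spec:
      podSelector: {}          # every Pod in the namespace
      policyTypes: ["Egress"]
      egress:
        {{- range .Values.standards.approvedEgress }}
        - to:
            - ipBlock:
                cidr: {{ .cidr }}  # {{ .name }}
        {{- end }}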

The “specify, then verify” loop

Pull the four patterns together and what you have is the loop DaC has always been building toward but couldn’t quite close until the K8s ecosystem matured into a clean integration.

Specify. The standards chart defines the standard business decisions: labels, version-pinning conventions, namespace caps, network egress rules. Projected into every workload chart via the Helm values methodology.

Verify at install. The standards chart’s values.schema.json rejects workload values that violate the standard structure. Helm refuses to render. The first verification gate.
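A sketch of that first gate, assuming the standards chart requires a labels block (the required list here is illustrative):

    {
      "$schema": "https://json-schema.org/draft-07/schema#",
      "type": "object",
      "properties": {
        "labels": {
          "type": "object",
          "required": ["cost-center", "owner"],
          "additionalProperties": { "type": "string" }
        }
      },
      "required": ["labels"]
    }

helm install, helm upgrade, helm lint, and helm template all validate the merged values against this schema before anything renders.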

Verify at admission. Gatekeeper Constraints, parameterized from the same standard sources, reject API requests that violate the standards. The second verification gate.

Verify at runtime. Gatekeeper’s audit controller, plus continuous compliance scans (gator in CI, kube-bench, custom Rego), sweeps the cluster looking for objects that drifted post-admission. The third verification gate.
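
Two ways to run that sweep, sketched (paths are illustrative). gator test replays exported objects against the same templates and constraints offline; the audit controller records post-admission violations in each Constraint’s status:

    # replay exported cluster objects against the standards offline
    gator test -f=standards/constraints/ -f=cluster-export/

    # audit results land in the Constraint's status block
    kubectl get k8srequiredlabels ai-required-labels -o yaml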

Three gates. One standard source. DaC’s specify-then-verify loop closed in a single foundation.

Why this is the imprint piece

When I look back at the technical contributions I’m most proud of, DaC is at the top of the list. Not because of any single tool (substrates change, vendors change hands, products go into maintenance) but because the methodology aged in a way most of my work has not.

The methodology says: business decisions belong centralized, projected onto every consumer through platform-aware adapters, validated at the boundary, enforced at runtime by a complementary policy engine. In a previous era that was vRA-and-Terraform plus OPA. In 2025 that’s Helm values plus Gatekeeper. In 2027 it’ll be Crossplane compositions plus whatever the next admission controller is. The substrates rotate. The methodology persists.

The thing I want to be explicit about in this piece is that the enforcement complement was never a footnote. It was always the other half of the loop. DaC specifies; OPA verifies. The earlier-era articles couldn’t quite show that integration cleanly because the toolchains were further apart. The 2025 articles can, because the K8s ecosystem matured into exactly the shape the methodology predicted.

This is what “imprint” means in the context of these batches. The pattern doesn’t need a fresh framing. It needs the through-line traced from where it started to its K8s-era expressions: the Helm values piece, the OPA-Rego renaissance article, and the agent-policy-as-code piece.

What I keep coming back to

Specify, then verify. That’s the whole methodology compressed to two words. The earlier-era tools did the specifying well. OPA-era tools do the verifying well. The K8s ecosystem in 2025 lets you do both halves cleanly in a single foundation, with the same standard source feeding both gates.

For AI workloads specifically, the loop matters more than for plain old services because the cost of drift is higher and the surface area is larger. A drifted model version, a missing cost-center label, an unbounded GPU quota, an unrestricted egress rule, any of those is a story you tell in a postmortem.

The fix is DaC. Centralize the decisions. Project them onto every consumer. Validate at install. Enforce at admission. Sweep at runtime. The tools rotate; the discipline doesn’t. Five primitives, two halves of the loop, one through-line. That’s the contribution. That’s why the methodology has held up better than I had any right to expect when I first started building it.