Plan output as data: what terraform plan -json actually enables
Most Terraform pipelines treat plan output as text: paste it in a PR, hope the reviewer reads it. The JSON form is structured data, and once you treat it that way, cost preview, policy gates, drift attribution, and change-risk scoring become engineering problems.
There is a pattern in the Terraform pipelines I’ve been building on customer engagements that I want to write up, because almost nobody implements it on the first attempt and almost everybody asks for it eventually. The pattern is treating Terraform’s plan output as a structured artifact rather than as a string of text to paste into a PR comment.
The pattern is small. It’s three lines in a GitHub Actions workflow:
- run: terraform plan -json -out=tf.plan > plan_log.jsonl
- run: terraform show -json tf.plan > plan_output.json
- run: terraform show tf.plan > plan_output_raw.log
Three artifacts. Each one is the same plan, formatted differently. Each one enables a different downstream capability. Once they exist as files attached to the PR, the things you can build on top of them are dramatically more useful than the “human reads a text diff” baseline that most teams stop at.
This piece is about what each artifact actually contains, what the ecosystem of tools that consume them looks like in early 2024, and why the JSON form of the plan should be a first-class artifact on every Terraform PR you ship.
The three artifacts
plan_log.jsonl, the streaming event log. This is the output of terraform plan -json, which is fundamentally different from terraform show -json. The plan -json flag tells Terraform to emit a newline-delimited stream of JSON events as the plan runs. Each event has a type field (planned_change, resource_drift, diagnostic, apply_start, apply_complete, outputs, and so on) and a timestamp.
The streaming format is what you want for real-time pipelines. If you’re building a CI bot that posts incremental status updates, or a dashboard that shows plan progress, the JSONL stream is the right input. You can parse it line-by-line as it’s produced, react to specific event types, and build progress UI on top of it.
{"@level":"info","@message":"acme-vpc: Refreshing state...","@timestamp":"2024-03-10T14:23:01Z","type":"refresh_start","hook":{"resource":{"addr":"aws_vpc.main"}}}
{"@level":"info","@message":"Plan: 3 to add, 1 to change, 0 to destroy.","@timestamp":"2024-03-10T14:23:14Z","type":"change_summary","changes":{"add":3,"change":1,"remove":0,"operation":"plan"}}
The JSONL is also where diagnostics live. Warnings, errors, deprecation notices, all of them are events in the stream, structured, parseable. If you’ve ever tried to grep terraform plan text output for warnings, the JSONL version is the thing you wanted.
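A consumer of that stream can be a few dozen lines. This is a minimal sketch, not any particular library, using the event fields shown in the examples above; it collects diagnostics and the final change summary from a plan_log.jsonl:

```python
import json

def summarize_stream(lines):
    """Parse terraform plan -json output line by line, collecting
    diagnostic messages and the final change_summary event."""
    diagnostics, summary = [], None
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        etype = event.get("type")
        if etype == "diagnostic":
            diagnostics.append(event.get("@message", ""))
        elif etype == "change_summary":
            summary = event.get("changes", {})
    return diagnostics, summary

# Events shaped like the examples above (in CI: open("plan_log.jsonl"))
stream = [
    '{"@level":"warn","@message":"Warning: deprecated attribute","type":"diagnostic"}',
    '{"@level":"info","@message":"Plan: 3 to add","type":"change_summary",'
    '"changes":{"add":3,"change":1,"remove":0,"operation":"plan"}}',
]
diags, summary = summarize_stream(stream)
```

Because the function takes any iterable of lines, the same code works on a live subprocess pipe or on the stored artifact after the fact.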
plan_output.json, the final plan state. This is terraform show -json tf.plan, run on the saved plan file. It’s a single, complete JSON document describing the result of the plan, the resource changes, the configuration, the prior state, the planned state, the output values.
This is the artifact policy tools consume. It’s deterministic, complete, and structured. The schema is documented (it’s the Terraform plan representation), and the schema is stable enough across Terraform versions that tools can rely on it.
The top-level fields:
- format_version, schema version of the JSON itself.
- terraform_version, which Terraform version produced the plan.
- prior_state, what Terraform thought the world looked like before the plan.
- configuration, the parsed configuration, normalized.
- planned_values, what the world will look like after the apply.
- resource_changes, the diff, per resource, with before and after values and an actions array (["create"], ["update"], ["delete", "create"], etc).
- output_changes, the diff for output values.
The resource_changes array is where 90% of useful tooling lives. Each entry has the resource address, the action being taken, and the full before/after attribute set. Cost-preview tools read it. Policy tools read it. Custom drift-attribution scripts read it. It’s the structured form of “what is this PR going to do.”
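As a sketch of how small that consumer can be, here is the skeleton most of those tools share, using the documented resource_changes[].address and .change.actions fields:

```python
import json

def changed_resources(plan):
    """Given the parsed plan JSON (a dict), return (address, actions)
    for every resource the plan will touch, skipping pure no-ops."""
    out = []
    for rc in plan.get("resource_changes", []):
        actions = rc["change"]["actions"]
        if actions != ["no-op"]:
            out.append((rc["address"], actions))
    return out

# In CI: plan = json.load(open("plan_output.json"))
sample = {"resource_changes": [
    {"address": "aws_vpc.main", "change": {"actions": ["no-op"]}},
    {"address": "aws_db_instance.prod", "change": {"actions": ["delete", "create"]}},
]}
```

Everything else in this piece, cost, policy, drift, risk, is a different function applied to that same list.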
plan_output_raw.log, the human text. This is the plain terraform show tf.plan output. It’s what humans actually read. It’s also what gets pasted into PR comments by every “post-plan-to-PR” bot.
The reason you produce all three is that they serve different audiences:
- The streaming JSONL serves the pipeline: error parsing, incremental status, build artifacts for downstream stages.
- The full JSON serves the tools: cost preview, policy, security scans, drift attribution, change-risk scoring.
- The raw text serves the humans: PR reviewers reading the diff.
The cost of producing all three is essentially zero. The plan file is the source for both JSON forms; the streaming JSONL comes from a flag on the plan command. Two seconds of pipeline time. Three artifacts.
What you can build on the JSON plan
This is the part of the conversation I keep having with customers, because once they have plan_output.json as an artifact on every PR, the question shifts from “how do we improve our Terraform reviews?” to “what should we feed this into?”
The categories that are working in production today:
Cost preview. Infracost is the standard example. It takes plan_output.json, looks up the pricing of every resource in the planned_values set, computes monthly cost impact, and posts a delta to the PR. “This change adds $1,847/month.” That number alone catches a whole category of mistakes before they hit production: the engineer who left instance_count = 50 instead of 5, the developer whose Lambda config will eat the Free Tier in a day.
The trick that makes cost preview work well is the JSON plan’s planned_values field. Without it, you’d have to apply the change to know what was created. With it, you know the exact resource shape and count before anything is created.
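The instance_count = 50 class of mistake is catchable by just counting. A sketch (recursing into child_modules is omitted for brevity; the field paths are the documented planned_values layout):

```python
def planned_resource_counts(plan):
    """Count planned resources by type, read from
    planned_values.root_module in the plan JSON."""
    counts = {}
    root = plan.get("planned_values", {}).get("root_module", {})
    for res in root.get("resources", []):
        counts[res["type"]] = counts.get(res["type"], 0) + 1
    return counts

sample = {"planned_values": {"root_module": {"resources": [
    {"type": "aws_instance"}, {"type": "aws_instance"}, {"type": "aws_s3_bucket"},
]}}}
```

Pipe the counts into a threshold check and you have a crude cost tripwire before you ever adopt a pricing tool.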
Policy gates. Conftest (using Open Policy Agent), terraform-compliance, Regula, Sentinel: these all consume plan_output.json and evaluate rules against the planned resources. The rules can be:
- “All S3 buckets must have versioning enabled.”
- “No security group can have 0.0.0.0/0 ingress on port 22.”
- “RDS instances must use customer-managed KMS keys.”
- “EC2 instance types in the c6i family cannot be used in the eu-west region until our reservation is in place.”
The policy is code. The plan is data. The CI step is “run the policy against the plan.” If the policy fails, the PR is blocked.
The deeper move is that the policy gate runs before apply, on the planned state, not on the deployed state. You catch the violation when it’s a five-line diff in a PR, not when it’s a deployed resource that’s already been running for a week.
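In a real gate the first rule above would be a few lines of Rego; the same check as a custom script looks like this. One loud caveat: this sketch assumes the older AWS-provider style where versioning is an inline block on aws_s3_bucket; newer provider versions model it as a separate aws_s3_bucket_versioning resource, so the attribute walk is illustrative, not canonical:

```python
def s3_versioning_violations(plan):
    """Flag aws_s3_bucket resources being created or updated without
    versioning enabled, by walking resource_changes[].change.after.
    NOTE: assumes the provider-v3-style inline 'versioning' block."""
    bad = []
    for rc in plan.get("resource_changes", []):
        if rc.get("type") != "aws_s3_bucket":
            continue
        if not {"create", "update"} & set(rc["change"]["actions"]):
            continue
        after = rc["change"].get("after") or {}
        versioning = after.get("versioning") or []
        if not any(block.get("enabled") for block in versioning):
            bad.append(rc["address"])
    return bad

sample = {"resource_changes": [{
    "address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
    "change": {"actions": ["create"], "after": {"versioning": []}},
}]}
```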
Drift attribution. When terraform plan shows drift (a resource whose live state doesn’t match what’s in the state file) the JSON plan tells you exactly which attribute drifted, what the previous value was, what the live value is. With that data structured, you can build a report that combines drift across all your workspaces and tells you which attributes drift most often, which teams own them, and what the underlying cause is.
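The raw material for that report is the resource_drift array in the plan JSON, which has the same before/after shape as resource_changes. A sketch of the attribute-level extraction:

```python
def drifted_attributes(plan):
    """List (address, attribute) pairs where live state diverged from
    the recorded state, using the resource_drift array."""
    pairs = []
    for rd in plan.get("resource_drift", []):
        before = rd["change"].get("before") or {}
        after = rd["change"].get("after") or {}
        for key in sorted(set(before) | set(after)):
            if before.get(key) != after.get(key):
                pairs.append((rd["address"], key))
    return pairs

sample = {"resource_drift": [{
    "address": "aws_security_group.web",
    "change": {"before": {"description": "web", "tags": {"env": "prod"}},
               "after": {"description": "web", "tags": {}}},
}]}
```

Run this across every workspace's stored plan artifact and the aggregation (“tags drift most, team X owns them”) is a groupby away.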
I wrote about drift last November as a category problem. The plan-as-data pattern is how you get a real measurement on it.
Change risk scoring. This one is less mature but very useful when it works. The idea: assign every resource type a risk weight, score every plan by the sum of weights of resources being changed, flag any PR over a threshold for extra review.
A destroy on an RDS instance scores high. An update to an EC2 instance scores medium. A tag change on an S3 bucket scores low. The score doesn’t replace human judgment, but it routes PRs to the right reviewer. The CI bot can post “this PR scores 47 on the change-risk scale, please tag the on-call engineer for review” without anybody having to read the diff first.
The risk weights are organization-specific (a destroy on a dev database is fine, a destroy on the production database is not) and the scoring rules live in the same policy-as-code system that runs your gates. The data is the resource_changes array. The scoring is a function over it.
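The whole scorer fits in a dozen lines. The weights below are hypothetical, purely to show the shape; every org tunes its own:

```python
# Hypothetical, org-specific weights; tune these to your own estate.
RISK_WEIGHTS = {
    ("aws_db_instance", "delete"): 40,
    ("aws_instance", "update"): 10,
    ("aws_s3_bucket", "update"): 2,
}
DEFAULT_WEIGHT = 5  # anything unlisted counts as mid-risk

def risk_score(plan):
    """Sum per-(type, action) weights over resource_changes."""
    score = 0
    for rc in plan.get("resource_changes", []):
        for action in rc["change"]["actions"]:
            if action == "no-op":
                continue
            score += RISK_WEIGHTS.get((rc["type"], action), DEFAULT_WEIGHT)
    return score

sample = {"resource_changes": [
    {"type": "aws_db_instance", "change": {"actions": ["delete", "create"]}},
    {"type": "aws_s3_bucket", "change": {"actions": ["update"]}},
]}
```

A replace on the database (delete 40, create at the default 5) plus a bucket update (2) scores this sample plan at 47, over whatever threshold routes it to the on-call reviewer.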
Compliance evidence. For organizations doing SOC 2, ISO, FedRAMP, the JSON plan is structured evidence. “On 2024-02-14, PR #1247 proposed these changes, the policy gates passed, the change was approved by reviewer X, and the apply produced this CloudTrail event.” That whole chain is auditable. The auditor doesn’t read text diffs; they read structured records.
The tools that consume it
A non-exhaustive list of what’s been working in production, as of early 2024:
Conftest (Open Policy Agent’s CLI for unit-testing structured config). General-purpose policy engine. Rules are written in Rego. Works on the plan_output.json. Good fit if your org already uses OPA elsewhere.
terraform-compliance. BDD-style policy framework. Rules are written in a Gherkin-like syntax (“Given a resource of type X, Then it must have a tag Y”). More approachable for teams that don’t want to learn Rego.
Regula. Fugue’s open-source policy engine, also OPA-based but with a Terraform-specific rule library out of the box. Useful if you want a starting set of common policies (CIS, NIST, etc.) without writing them yourself.
Infracost. Cost preview specifically. Has a free SaaS tier for the price lookups; the binary itself is open-source.
Checkov and tfsec. Static analyzers for Terraform configuration; they can also read the plan JSON for some checks. Less plan-aware than the policy-engine tools, but worth mentioning because they’re widely deployed.
Sentinel. HashiCorp’s policy framework, integrated with Terraform Cloud / Enterprise. Tied to the commercial product. The rules are similar in shape to OPA but in a different language.
Custom scripts. Plenty of teams write their own consumers: a Python script that reads plan_output.json, runs the org-specific checks, and posts a summary to the PR. The plan JSON schema is stable enough that a few-hundred-line script can be a real policy engine for a small org’s needs.
The point of listing all of these is that none of them require you to instrument Terraform. They all consume the same artifact. Produce plan_output.json once, route it to any subset of these tools, get the relevant outputs.
What stops teams from doing this
The blocker on most engagements is not technical. It’s that the team’s current workflow already “works”: humans read text diffs, mostly catch the obvious mistakes, occasionally miss something expensive, and the cost of changing the workflow feels higher than the cost of the next mistake.
The argument that lands in practice is the audit-trail one. Showing a leader the chain of “PR → plan JSON → policy result → approver → apply → CloudTrail event” is more convincing than showing them a list of tools to evaluate. The chain is the value. The tools are the means.
The second blocker is the perception that the plan JSON is “for big enterprises.” It isn’t. A two-person team running Terraform against one AWS account benefits from the cost-preview gate on day one. Infracost in a small team catches the wrong instance type before it becomes a surprise bill. The JSON artifact doesn’t need to fund a security organization to be worth producing.
The third blocker is the wrapper-output problem I touched on in the OIDC piece. If you’re using setup-terraform@v3 with the default terraform_wrapper: true, your plan output gets eaten by the wrapper’s stdout capture, and the -json flag’s output doesn’t end up where you expect. Setting terraform_wrapper: false fixes it. This is the single most common reason the three-artifact pattern doesn’t work on the first attempt.
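For reference, the fix is one input on the setup step (terraform_wrapper is an input on hashicorp/setup-terraform and defaults to true):

```yaml
- uses: hashicorp/setup-terraform@v3
  with:
    terraform_wrapper: false  # without this, the wrapper captures stdout and the redirects below break
```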
The shape I now ship
The CI step I now drop into every customer pipeline looks like this:
- name: Terraform plan with structured artifacts
  run: |
    terraform plan -json -out=tf.plan > plan_log.jsonl 2>&1
    terraform show -json tf.plan > plan_output.json
    terraform show tf.plan > plan_output_raw.log
- name: Upload plan artifacts
  uses: actions/upload-artifact@v4
  with:
    name: terraform-plan-${{ github.run_id }}
    path: |
      tf.plan
      plan_log.jsonl
      plan_output.json
      plan_output_raw.log
    retention-days: 30
- name: Policy gate
  run: conftest test plan_output.json --policy ./policies/
- name: Cost preview
  run: infracost diff --path plan_output.json --format github-comment > infracost.md
- name: Post to PR
  uses: actions/github-script@v7
  with:
    script: |
      const fs = require('fs');
      const raw = fs.readFileSync('plan_output_raw.log', 'utf8');
      const cost = fs.readFileSync('infracost.md', 'utf8');
      const body = `## Plan\n\`\`\`\n${raw}\n\`\`\`\n\n## Cost\n${cost}`;
      github.rest.issues.createComment({
        issue_number: context.issue.number,
        owner: context.repo.owner,
        repo: context.repo.repo,
        body: body
      });
Five steps. Plan with all three artifacts. Upload them. Run the policy gate. Run the cost preview. Post the human-readable text and the cost preview to the PR.
The thing that makes this stick is that the JSON artifacts are stored on the run, retrievable for thirty days. If something goes wrong after the apply, the plan that produced it is still there, structured, queryable, attached to the exact PR and commit.
What to do, concretely
If you’re running Terraform in CI without the plan-as-data pattern, the moves I’d make this sprint:
Add the three-artifact emit to every workflow. Two seconds of pipeline time. The artifacts are useful even before you wire anything up to consume them; having the structured plan on every PR is its own audit improvement.
Pick one consumer and wire it up. Cost preview (Infracost) is the easiest to argue for. Policy gates (conftest with a small rule set) is the second easiest. Don’t try to deploy three tools at once.
Disable terraform_wrapper. If you’re using setup-terraform, this is the one-line config that makes everything else work.
Retain the artifacts. Thirty days minimum. The forensic value of being able to pull up the exact plan that produced a deployed resource, weeks later, is high.
Treat the JSON plan as the standard artifact, not the text. The text is for humans reading PRs. The JSON is for everything else.
The longer thread
The shift from “plan output is text” to “plan output is data” is the same shift IaC made when it stopped being shell scripts and became HCL. Once the artifact is structured, the things you can build on top of it compound. Cost gates. Policy gates. Drift attribution. Change-risk scoring. Compliance evidence. None of these existed as ergonomic features when the plan was a text blob; all of them are tractable engineering problems with the JSON.
The deeper point: every IaC pipeline I’ve seen treats the plan as ephemeral, produced, read, discarded. It shouldn’t be. The plan is the most concentrated source of truth about what your infrastructure is about to become. Storing it, structuring it, and feeding it to downstream tools is the cheap move that pays off on every change going forward.
The three-line YAML diff is the start. The category of capabilities it unlocks is the reason.
– Sid