The PII problem nobody wants to own

Every AI team I've talked to has the same problem: private information moves through their tools, and nobody on the team owns it. The data leaks in by accident way more often than on purpose, and the rules they'd apply if it were flowing through any other system somehow don't apply here.

It's a real gap, it's getting wider, and something is going to force teams to deal with it: a breach, a regulator stepping in, an audit finding. Whoever waits until that moment is going to spend a lot of money in a hurry. Let me walk through why nobody wants to own this, what fixing it actually looks like, and who should be on the hook for it in your shop.

Figure: four ways private data enters AI tools (pasted prompts, documents handed over for help, AI wired into CRM, email, and files, and search indexes built over your documents), all flowing into the same gap nobody owns.
Four leak paths, one ownership gap.

How the data leaks in

The private stuff flowing through AI tools shows up in a few predictable ways. None of these are exotic. All of them happen every day in any team that's actually using AI.

People paste sensitive stuff into chat. Customer names into "summarize this." Bank statements into "explain this transaction." Medical questions into a chat box. The person typing thinks of it like a search engine, but the chat tool remembers everything they typed, and most of the time the vendor on the other end keeps a copy on their side too.

Documents people hand to AI for help. Contracts with names in them. Medical records. HR files. The AI reads them and the data goes wherever the AI sends it. The lock-and-key rules that exist for the same document sitting in a database somehow don't apply here.

AI tools wired into your other systems. Connect the AI to your CRM, your email, your file storage, and the AI sees every name, address, phone number, and document that lives there. Nobody is tracking what it saw or where the data went.

Search indexes built on top of your documents. When a team builds an AI that can search across the company's files (this is called RAG, if you want to look it up later), the search index includes everything those files contain, including the private bits. When the AI pulls a snippet to answer a question, the same access rules that protect the original document don't protect the snippet.
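
To make the snippet problem concrete, here's a minimal sketch of naive ingestion. All the names are hypothetical stand-ins for whatever chunking and storage layer a real pipeline uses; the point is what's missing from each indexed chunk: any record of who was allowed to read the source document.

```python
# Hypothetical naive RAG ingestion: every document gets chunked and
# indexed the same way, and the source file's access rules don't
# travel with the chunks.

def chunk(text: str, size: int = 500) -> list[str]:
    """Fixed-size splits; real pipelines are fancier, same problem."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def ingest(doc_id: str, text: str, index: list[dict]) -> None:
    for piece in chunk(text):
        # No ACL, no sensitivity tag: the HR file and the public wiki
        # page are indistinguishable once they're in the index.
        index.append({"doc_id": doc_id, "text": piece})

index: list[dict] = []
ingest("wiki/onboarding", "Welcome to the team...", index)
ingest("hr/salary-review-2024", "Jane Doe, base salary...", index)
# Any query that matches the second document's text now surfaces it,
# regardless of who is asking.
```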

The volume is large. The visibility is low. And nobody is watching.

Want to go deeper on how this actually works? I wrote about the application-layer pattern in PII-aware prompting and the secrets-isolation version in Building an AI assistant that can't see your secrets.

Why nobody owns it

Four reasons the ownership gap sticks around.

It falls between two teams. Privacy and legal traditionally own PII. Engineering and product own AI. The overlap (private data flowing through AI) falls in the gap. Privacy doesn't have the technical chops to fix it; engineering doesn't know what the rules actually require.

No single leak looks bad. Somebody pasted a customer name into a prompt. A document with employee data went to a model. Each event is small. Add them up over a year and the picture is ugly, but nobody can point at a specific incident and say "that's the one."

Nothing is forcing the issue yet. Regulators are paying attention but enforcement is patchy. Audit frameworks like SOC 2, ISO 27001, and HIPAA are starting to ask AI questions but not pressing hard. There hasn't been the big public breach yet that makes every executive want a plan by Monday.

The work is boring. Building the privacy controls doesn't ship a feature. It doesn't help hit a quarter. The engineer who proposes the work has to argue for it against everything else competing for time. They usually lose.

Put those four together and the math is predictable: everyone sees the problem, nobody is positioned to fix it, the work doesn't happen.

What fixing it actually looks like

The work isn't research. It's mostly patterns I've covered before, applied with discipline.

  • Strip the private bits before the prompt gets sent. Local redaction at the moment the prompt is built: names get replaced with placeholders, so the AI sees the structure but not the identity. PII-aware prompting is the full version; the first sketch after this list shows the shape.
  • Only index what should be indexed. When you build a search index over company documents, exclude the documents that hold the most sensitive stuff. Same pattern I described for an AI assistant that can't see your secrets, extended to PII.
  • Isolate the capabilities that touch private data. The AI plans the action; a separate, smaller piece with the actual access carries it out. The AI never sees the raw private data, only the result of an action against it. The second sketch below shows the split.
  • A governance framework that engineers will actually use. I wrote up the lightweight version in AI governance frameworks that don't make engineers quit. Add explicit categories for any workflow that touches private data.
  • Audit trails that capture the flow. Every AI interaction with sensitive data gets logged. You should be able to answer "what did the AI see, when, on whose behalf?" in a query; the third sketch below shows one way to capture it.
  • A clear owner. A name on the org chart, accountable for the program, with budget and authority. This is the missing piece that makes everything else possible.
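
First, redaction at the prompt boundary. This is a minimal sketch, not production code: the regex patterns are deliberately crude stand-ins for a real PII detector, and every name in it is hypothetical. The shape is what matters: detect, swap in placeholders, keep the mapping local, restore on the way back.

```python
import re

# Hypothetical placeholder-based redaction: swap identifiable values
# for stable tokens before the prompt leaves your process, keep the
# mapping local, and restore the real values on the way back.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(prompt: str) -> tuple[str, dict[str, str]]:
    """Replace matches with placeholders; return the mapping for later."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        for i, value in enumerate(pattern.findall(prompt)):
            token = f"[{label}_{i}]"
            mapping[token] = value
            prompt = prompt.replace(value, token)
    return prompt, mapping

def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the real values back into the model's response."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text

safe, mapping = redact("Email jane.doe@example.com about 555-867-5309")
# safe == "Email [EMAIL_0] about [PHONE_0]"; the vendor never sees the
# real values, and `mapping` never leaves your side of the wire.
```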
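
Second, capability isolation. Again a sketch under assumptions: `plan_action` stands in for a model call that returns structured output, and `FAKE_CRM` stands in for whatever system actually holds the records. The invariant is that only derived results cross back to the model.

```python
# Hypothetical planner/executor split: the model proposes a typed
# action, a small executor with the real access carries it out, and
# only the derived result crosses back into the model's context.

ALLOWED_ACTIONS = {"count_orders"}

FAKE_CRM = {"cust_123": {"name": "Jane Doe", "open_orders": 7}}  # stand-in

def plan_action(user_request: str) -> dict:
    # In production this is a model call that returns structured
    # output; stubbed here so the sketch runs end to end.
    return {"action": "count_orders", "customer_ref": "cust_123"}

def execute(action: dict) -> str:
    if action["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"refusing unplanned action: {action['action']}")
    record = FAKE_CRM[action["customer_ref"]]
    # The name and everything else in the record stays on this side;
    # only the count goes back to the model.
    return f"{record['open_orders']} open orders"

result = execute(plan_action("How many open orders does Jane have?"))
```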
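
Third, the audit trail. The schema here is made up for illustration; use whatever log pipeline you already have. The test is whether every field needed to answer "what did the AI see, when, on whose behalf?" is captured at the call site.

```python
import json
import time

AUDIT_LOG = "ai_audit.jsonl"  # hypothetical path; use your log pipeline

def log_ai_access(user: str, tool: str, data_refs: list[str], purpose: str) -> None:
    """One line per AI interaction that touched sensitive data."""
    entry = {
        "ts": time.time(),        # when
        "on_behalf_of": user,     # whose behalf
        "tool": tool,             # which AI tool or model
        "data_refs": data_refs,   # what it saw (references, not copies)
        "purpose": purpose,
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

# "What did the AI see on behalf of jane@corp.example?" becomes a
# one-liner over the log instead of a forensic exercise.
```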

None of this is exotic. The shop that does it is operating safely. The shop that doesn't is racking up exposure.

Who should own it

This is where most orgs get stuck. The honest answer: it has to be a role that bridges the privacy-and-AI gap. A few shapes that work, drawn from mature deployments I've seen and ones I've read about:

A privacy engineer who can write code. The privacy program hires an engineer with the technical skills to build the actual controls. Works when privacy has the budget and authority. Doesn't work when privacy is downstream of legal and treated as paperwork.

An AI platform owner with a privacy mandate. The team building the AI platform takes privacy work as part of their charter. Works when the platform team is mature. Doesn't work when the platform team is treated as plumbing.

A standalone "AI governance" role. A new hire specifically for the cross-cutting work. Works at orgs big enough to justify the headcount. Doesn't work at small shops, where the role is too narrow to fill.

The CISO's office picks it up. Security adopts AI-PII as part of broader data protection. Works when the CISO has authority over engineering. Doesn't work when security is advisory.

None of these is universal. The right answer depends on what your org already looks like. The wrong answer is "we'll figure it out organically": that translates to "nobody owns it," and the gap stays open.

What's going to force the issue

The teams that don't address this on their own terms will eventually be forced to. Here's what to expect.

A high-profile breach where the data flowed through an AI tool. My bet is this happens within the next year for some major vendor. The aftermath brings regulator attention and industry-wide audits. Teams without a story get caught.

A regulatory enforcement action. GDPR, CCPA, HIPAA, GLBA: all of these apply cleanly to AI-PII. The first big enforcement action sets the precedent, and every CIO with exposure starts writing checks.

Audit framework changes. SOC 2, ISO 27001, and the industry-specific frameworks will explicitly add AI-PII controls. Once the audit cycle includes them, every audited org either has the controls or has to explain why not.

Customer requirements in vendor agreements. Enterprise customers will start asking AI vendors and AI-using vendors for explicit PII-handling commitments. Vendors without an answer lose deals.

Any one of these is enough to make the work mandatory. Most of them are coming.

The takeaway

The work isn't hard; it just isn't anyone's job yet. Whoever owns this in your shop (or whoever steps up to own it) will save the org a multiple of their salary over the next two years. That's the pitch. Take it to whoever signs your offer letter.