Building an AI assistant that can't see your secrets

The personal AI assistant pattern wants to read everything you have. The honest engineering pattern is the opposite: design the assistant to be useful while structurally unable to see the data that shouldn't go to it. Worth being concrete about how.


The default pattern for personal AI assistants in 2025 is "give it access to everything and trust it to use the access well." Read your email, your messages, your files, your calendar, your password manager, your notes. The pitch is "the assistant is most useful when it sees the most." The cost is that the assistant becomes a single point of failure for the most sensitive data you have.

There's a better pattern that almost nobody is shipping: design the assistant so it can't see the secrets in the first place. Not "we promise not to look" but actually-can't-look-because-the-design-doesn't-let-it. Worth being clear about how that works, because the patterns are gettable today and most people who'd benefit from them don't know they're an option.

[Diagram: "The AI plans. A separate piece acts." The assistant produces a structured request such as send_email(to, subject, body), create_ticket(title, body), or post_message(channel, text). A capability that holds the secret (API key, OAuth token, DB password) executes the request and returns a result, not the secret. The AI never holds the key; the capability never makes a decision.]

The default pattern and what's wrong with it

The vendor-default version of personal AI assistants:

  • Indexes everything in your data sources, including secrets, into a vector store.
  • The model sees retrieved chunks at query time, including any secrets that match the retrieval.
  • The model's context carries whatever the system prompt and conversation history contain, and that history accumulates indefinitely.
  • The platform vendor controls all of this and can change the rules at any time.

The failure mode: any prompt-injection vector against the assistant has access to the secrets. Any retrieval miss that surfaces a secret embeds it in the conversation. Any vendor change to data-handling can retroactively expose data you indexed years ago. The assistant's value is correlated with how much it sees; the security posture is correlated with the same thing, in the wrong direction.

The honest assessment: for most personal-AI use cases, the assistant doesn't need to see the secrets to be useful. The few cases where it genuinely needs them are narrow and can be handled with explicit per-task hand-off rather than blanket access.

The four patterns that work

Scoped indexing. Your indexed corpus is split by sensitivity. The default retrieval index covers your finished writing, your published code, your meeting notes, the stuff that's basically already public. A separate index covers semi-private material (drafts, personal notes). Sensitive material (credentials, private correspondence, financial records) is not indexed at all. Retrieval defaults to the public index; semi-private is opt-in by query; sensitive is unreachable from the assistant.

This is the base pattern. Most of the value of personal AI comes from retrieving against your accumulated working knowledge. Most of your accumulated working knowledge isn't sensitive. The split is straightforward to maintain in practice.
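A minimal sketch of the tiering, with in-memory lists standing in for real vector indexes; the Sensitivity enum, the index names, and the substring-match retrieval are all illustrative, not any particular platform's API.

```python
from enum import Enum

class Sensitivity(Enum):
    PUBLIC = "public"          # finished writing, published code, meeting notes
    SEMI_PRIVATE = "semi"      # drafts, personal notes -- opt-in at query time
    SENSITIVE = "sensitive"    # credentials, private correspondence -- never indexed

# In-memory stand-ins for two vector indexes. SENSITIVE gets no index at all.
INDEXES: dict[str, list[str]] = {"default": [], "semi_private": []}

def index_document(text: str, level: Sensitivity) -> None:
    if level is Sensitivity.SENSITIVE:
        return  # unreachable from the assistant by construction
    name = "semi_private" if level is Sensitivity.SEMI_PRIVATE else "default"
    INDEXES[name].append(text)

def retrieve(query: str, include_semi_private: bool = False) -> list[str]:
    names = ["default"] + (["semi_private"] if include_semi_private else [])
    # Naive substring match stands in for real vector search.
    return [d for n in names for d in INDEXES[n] if query.lower() in d.lower()]
```

The structural point is the early return: sensitive material never enters any index, so no retrieval setting can surface it.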

Redaction at ingest. When you do need to ingest something with embedded secrets (a config file with credentials, a contract draft with party names, a medical record), run it through a redaction pass before the model sees it. The redaction can be regex-based for well-formed secrets (API keys, credit card numbers), pattern-based for structured data (PII patterns), or LLM-based for harder cases (named-entity redaction). The original stays on disk; the model sees only the redacted version.
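The regex tier is a few lines. A minimal sketch with illustrative patterns; real deployments tune these and layer the pattern-based and LLM-based passes on top.

```python
import re

# Illustrative patterns for well-formed secrets -- deliberately not exhaustive.
PATTERNS = {
    "API_KEY": re.compile(r"\b(?:sk|pk|api)[-_][A-Za-z0-9]{16,}\b"),
    "CARD":    re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "EMAIL":   re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def redact(text: str) -> str:
    """Replace matched secrets with typed placeholders. The original stays
    on disk; only this output is ever indexed or shown to the model."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text

print(redact("key=sk_AbCdEf1234567890XYZt card 4111 1111 1111 1111"))
# -> key=[REDACTED_API_KEY] card [REDACTED_CARD]
```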

The local pattern works well here: a small local model running on the Mac mini in my setup can do the redaction pass without any network round-trip, and the redacted output is what gets indexed.

Capability isolation. The assistant doesn't have credentials to anything sensitive. When you need to do something sensitive (send the email, make the bank transfer, deploy the change), the assistant prepares the action and hands it off to a separate capability that has the credentials. The capability runs the action; the assistant never sees the credentials, never holds the session, never has the ability to do the action on its own.

This is the human-in-the-loop checkpoint pattern extended to credential-bearing capabilities. The assistant becomes the planner; the capability is the executor; the credentials live with the executor and never leak to the planner.
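A minimal sketch of that boundary. The EmailCapability class and the SMTP_TOKEN variable are hypothetical; the structural point is that the request type the planner produces has no credential field, so the model-facing side can't carry the secret even by accident.

```python
import os
from dataclasses import dataclass

@dataclass
class ActionRequest:
    """The only thing the planner (the model) may produce: a structured
    request. There is no credential field on this type."""
    action: str   # e.g. "send_email"
    params: dict  # e.g. {"to": ..., "subject": ..., "body": ...}

class EmailCapability:
    """Holds the credential and executes requests; never exposes the secret."""
    def __init__(self) -> None:
        # The token lives here -- never in a prompt, a retrieval chunk,
        # or the model's output. SMTP_TOKEN is a hypothetical variable.
        self._token = os.environ.get("SMTP_TOKEN", "")

    def execute(self, req: ActionRequest) -> str:
        if req.action != "send_email":
            raise ValueError("this capability only sends email")
        # ... authenticate with self._token and send via your provider ...
        return "sent"  # a result goes back to the planner; the token does not

# planner output -> (human checkpoint) -> executor
req = ActionRequest("send_email", {"to": "a@example.com", "subject": "hi", "body": "..."})
print(EmailCapability().execute(req))
```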

Per-task ephemeral context. When the assistant does need to handle sensitive context for a specific task, the context is loaded in full, used for that task, and cleared before the next task starts. The conversation history doesn't accumulate the sensitive context. The retrieval index doesn't index it. The lifecycle is task-scoped.

The memory hygiene piece covers the mechanics of explicit context clearing. The pattern extends naturally to "load this sensitive doc, do the task, forget it" workflows.
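A minimal sketch of the task-scoped lifecycle as a context manager; run_task is a hypothetical stand-in for whatever invokes your model.

```python
from contextlib import contextmanager

@contextmanager
def task_context(paths: list[str]):
    """Load sensitive documents for exactly one task, then drop them.
    Nothing yielded here is appended to history or sent to any index."""
    docs = [open(p, encoding="utf-8").read() for p in paths]
    try:
        yield docs    # the task works against this list...
    finally:
        docs.clear()  # ...and it's gone before the next task starts
        # (this enforces the lifecycle; it isn't byte-level memory scrubbing)

# load, work, forget -- run_task() is a hypothetical task runner
with task_context(["medical_record.txt"]) as docs:
    summary = run_task("summarize for the referral letter", docs)
```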

What this looks like in practice

A concrete example from my own setup. I want the assistant to help me draft a response to a vendor contract negotiation. The contract has confidential terms; the existing draft has my markup; my prior correspondence with the vendor has things I don't want indexed permanently.

The workflow (a code sketch follows the list):

  1. The contract draft and the prior correspondence get redacted (party names, monetary terms, dates) by a local redaction pass.
  2. The redacted versions are loaded into a per-task context.
  3. The assistant works against the redacted versions to draft proposed language.
  4. The proposed language comes back to me; I un-redact and send.
  5. The per-task context is cleared. Nothing sensitive persists in the conversation.
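As a sketch, the same five steps in code, reusing the hypothetical redact() and run_task() helpers from the earlier sketches; the file names are placeholders.

```python
sources = ["contract_draft.txt", "vendor_thread.txt"]

# 1-2. local redaction pass, loaded into a per-task context
ctx = [redact(open(p, encoding="utf-8").read()) for p in sources]
try:
    # 3. the assistant drafts against the redacted versions only
    proposal = run_task("draft responsive contract language", ctx)
finally:
    # 5. cleared before anything else runs; nothing sensitive persists
    ctx.clear()
# 4. the proposal comes back to me; un-redaction happens outside the model layer
```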

The assistant did the work; the credentials and sensitive specifics never reached the model layer; the working artifacts cleaned up after themselves. The pattern feels heavier than "just send the contract to ChatGPT" because it is. It's also the pattern that actually works for sensitive workflows.

The smaller-pattern version for casual users

Not everyone needs the full enterprise-style version. For most casual users, the meaningful uplift comes from three things.

Scope your indexed corpus deliberately. Don't index everything. Index your notes folder, your published writing, your code. Don't index your password manager, your email export, your private journal. The default retrieval gets you most of the value; the cost of skipping the sensitive sources is essentially zero.
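In practice this can be as blunt as an allowlist. A sketch, with placeholder paths standing in for wherever your non-sensitive material lives:

```python
from pathlib import Path

# Deliberate allowlist: only these roots are ever walked by the indexer.
# The password vault, mail export, and journal simply aren't listed,
# so they're never indexed. Paths are placeholders for your own layout.
INDEX_ROOTS = [Path.home() / "notes", Path.home() / "writing", Path.home() / "code"]

def files_to_index():
    for root in INDEX_ROOTS:
        yield from (p for p in root.rglob("*.md") if p.is_file())
```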

Don't paste secrets into chat. Sounds obvious; constantly violated. If you wouldn't paste it into a Slack channel, don't paste it into your AI assistant. The assistant keeps it in conversation context, often longer than you expect.

Use local-only models for the sensitive work. When you genuinely need the model to see a sensitive document, route that query to a local model running on hardware you own rather than to a hosted API. The data never leaves the network; the audit story is your own; the risk surface is bounded.
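The routing decision can be a one-liner. A sketch that assumes an OpenAI-compatible local server (Ollama exposes one on this port by default); the hosted URL is a placeholder.

```python
LOCAL_URL = "http://localhost:11434/v1"    # local server on hardware you own
HOSTED_URL = "https://api.example.com/v1"  # placeholder hosted endpoint

def endpoint_for(sensitive: bool) -> str:
    # Sensitive queries never leave the network; everything else can.
    return LOCAL_URL if sensitive else HOSTED_URL
```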

These three habits get most of the security value without the full design investment. They're the floor that should be common practice; they aren't yet.

The platform shape that's emerging

The platforms that take this seriously are surfacing the pieces.

  • Capability isolation is what proper MCP-server design enables. Tools that hold their own credentials and never expose them to the model are doable today and increasingly the default pattern in well-designed MCP servers.
  • Scoped indexing is what good RAG-platform design supports. The platforms that make per-source scope enforceable are the ones that scale to production use.
  • Redaction at ingest is increasingly a first-class step in the indexing pipeline. The local-model redaction pattern is gettable today.
  • Ephemeral context is what proper per-task conversation primitives support. Several platforms have this; most don't expose it cleanly.

The platform conversation is still catching up to the engineering pattern. People who apply the treat-the-AI-like-an-employee discipline end up at this same design by another path. The marketing layer is mostly elsewhere (busy promising "your AI sees everything and uses it well") and the engineering-on-the-ground layer is converging on the opposite-shaped answer.

What I'd recommend

For someone building or configuring a personal AI assistant in mid-2025:

  • Default to scoped indexing. The retrieval surface should index your working knowledge, not your sensitive material.
  • Use redaction passes for any ingest that has embedded secrets. A local model is fine for this.
  • Keep credentials out of the model layer. Capabilities that need credentials should hold them; the assistant should plan, not execute.
  • Use per-task ephemeral context for sensitive workflows. Load, work, clear, repeat.
  • Pick platforms that surface these primitives rather than working around platforms that don't.

The pattern doesn't make the assistant less useful. It makes the assistant useful within a security envelope that holds up under adversarial conditions. The default pattern is faster to set up; the secured pattern is the one that doesn't fail you when something goes wrong.

The assistant that can't see your secrets is the assistant that can't leak them. That's not a contradiction; it's the right design constraint.