Memory hygiene for daily AI work

The conversation history with your AI tools is a piece of infrastructure now. Treating it like ephemeral chat is fine for the casual case; treating it that way for daily work means you lose context, repeat yourself, and slowly poison your own retrieval.


The conversation history with your AI tools used to be an afterthought. You typed at it, it answered, you closed the tab, none of it mattered. That was true when "AI tool" meant "occasional ChatGPT query." It stopped being true when AI tools became a daily-work surface you keep coming back to, and especially when those tools started persisting context across sessions in serious ways.

After a year-plus of using these tools every day, here's the pattern I've landed on: the conversation history needs the same kind of hygiene any working surface needs. Not the careful-archival kind, the working-set kind. The platforms aren't doing it for you, and the failure modes pile up quietly.

The four memory layers:

  • Per-task: this prompt only
  • Per-session: this chat
  • Per-project: this codebase / domain
  • Per-person: you, across everything

The closer to the center, the longer it lives. Most leaks happen when something jumps a ring.
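To make the rings concrete, here's a minimal sketch of the same taxonomy as a data structure. The scope names and ordering are my own labels, not anything a platform exposes.

```python
from dataclasses import dataclass
from enum import Enum

class Scope(Enum):
    """The four rings, longest-lived first."""
    PER_PERSON = "per-person"    # you, across everything
    PER_PROJECT = "per-project"  # this codebase / domain
    PER_SESSION = "per-session"  # this chat
    PER_TASK = "per-task"        # this prompt only

@dataclass
class MemoryItem:
    text: str
    scope: Scope

    def outlives(self, other: "MemoryItem") -> bool:
        """Closer to the center lives longer; the Enum order above encodes that."""
        order = list(Scope)
        return order.index(self.scope) < order.index(other.scope)
```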

What goes wrong without hygiene

Four failure modes I've watched compound over months of use.

Context drift. Your assistant remembers a decision you made three weeks ago that you've since reversed. It applies the old decision in a new context, you don't notice right away, and downstream work is subtly wrong. The pattern is most visible in coding-assistant work, where stale design decisions get applied to new files, but it shows up everywhere.

Retrieval poisoning. Your local-AI assistant indexes your notes for retrieval. The notes include early drafts, abandoned ideas, half-formed thoughts. The retrieval surface treats those with the same weight as your finished work, and the summaries it produces start to reflect the poisoned mix rather than the actual signal.

Identity drift. The way your assistant talks to you changes over weeks because the conversation history picks up patterns the system prompt didn't anticipate. Sometimes the drift is fine. Sometimes you notice that your assistant now agrees with everything you say, or hedges everything, or has picked up your worst stylistic tics. The persona drifts; you lose the value of having a consistent collaborator.

Cost compounding. Long-context conversations cost meaningfully more per turn than short ones. A daily-work conversation that runs for weeks accumulates token cost faster than the per-call price suggests. This is a meaningful slice of the long-running agent problem at the personal-AI scale.
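The compounding is easy to underestimate. A back-of-envelope sketch, assuming each turn resends the full history as input (which is how most chat APIs bill) and using made-up prices and message sizes:

```python
# Rough cost model: every turn resends the whole history as input tokens,
# so cumulative input cost grows roughly quadratically with turn count.
# Prices and message sizes are illustrative assumptions, not real rates.

PRICE_PER_INPUT_TOKEN = 3e-6   # assumed: $3 per million input tokens
TOKENS_PER_TURN = 800          # assumed: one exchange adds ~800 tokens of history

def conversation_cost(turns: int) -> float:
    """Total input-token spend over one conversation of `turns` exchanges."""
    total_input_tokens = sum(t * TOKENS_PER_TURN for t in range(1, turns + 1))
    return total_input_tokens * PRICE_PER_INPUT_TOKEN

# The same 200 turns, as one long thread vs. four threads with resets in between:
print(f"one 200-turn thread:   ${conversation_cost(200):.2f}")
print(f"four 50-turn threads:  ${4 * conversation_cost(50):.2f}")
```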

The hygiene patterns that work

Five habits that have survived months of practice.

Periodic conversation reset. Every couple of weeks, start a fresh conversation. Tell the new conversation what you're working on, hand it the relevant context, and let the old conversation expire. The discipline is hard because the long-running conversation feels valuable; the value is mostly nostalgic and the cost is real.
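The hand-off itself can be a template. A sketch of the kind of opener I mean, with invented section labels; the point is that the new thread starts from an explicit brief rather than inherited sludge.

```python
from datetime import date

def handoff_brief(active_items: list[str], decisions: list[str]) -> str:
    """Build the opening message for a fresh conversation after a reset."""
    lines = [f"Fresh thread, {date.today().isoformat()}. Carrying over only what's active:"]
    lines += [f"- Working on: {item}" for item in active_items]
    lines += [f"- Standing decision: {d}" for d in decisions]
    lines.append("Anything not listed here is stale; ask before assuming it still holds.")
    return "\n".join(lines)

print(handoff_brief(
    active_items=["memory-hygiene post", "retrieval index rebuild"],
    decisions=["drafts are excluded from default retrieval"],
))
```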

Explicit memory writes. When the assistant produces a useful piece of context (a decision, a fact, a preference) tell it plainly to commit that to long-term memory. Don't rely on it inferring what to keep. The platforms with explicit memory APIs (Claude's MCP-memory, ChatGPT's saved memories) make this easier; the ones without it require manual notes.
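On platforms without a memory API, the manual notes can still be structured. A minimal sketch of an append-only local memory log; the file location and field names are mine, not any platform's format.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

MEMORY_FILE = Path.home() / ".assistant" / "memories.jsonl"  # assumed location

def commit_memory(kind: str, text: str, source: str = "manual") -> None:
    """Append one explicit memory (a decision, fact, or preference) to a local log.

    The point is the explicitness: nothing lands here unless you decided it should.
    """
    MEMORY_FILE.parent.mkdir(parents=True, exist_ok=True)
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "kind": kind,          # e.g. "decision", "preference", "fact"
        "text": text,
        "source": source,
    }
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

commit_memory("decision", "Dropped the Redis cache; all lookups go to Postgres now.")
```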

Scoped retrieval. When the assistant retrieves from your indexed corpus, scope the retrieval to the relevant document set. Indexing your finished work and your draft folder separately, and retrieving only from finished by default, prevents the retrieval-poisoning failure mode. The cost is that the indexing pipeline gets slightly more complex; the benefit is that the assistant stops hallucinating from your old drafts.
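A sketch of what the split looks like, here using Chroma as the vector store (any store with named collections works the same way); the collection names and documents are placeholders.

```python
import chromadb  # any vector store with named collections will do; Chroma used for brevity

client = chromadb.Client()
finished = client.get_or_create_collection("finished-work")
drafts = client.get_or_create_collection("working-drafts")

# Index finished pieces and drafts into *separate* collections at ingest time.
finished.add(ids=["post-042"], documents=["Final text of a published post..."])
drafts.add(ids=["draft-017"], documents=["Half-formed notes, abandoned angle..."])

def retrieve(query: str, include_drafts: bool = False, k: int = 1) -> list[str]:
    """Default retrieval hits only finished work; drafts are opt-in per query."""
    hits = finished.query(query_texts=[query], n_results=k)["documents"][0]
    if include_drafts:
        hits += drafts.query(query_texts=[query], n_results=k)["documents"][0]
    return hits
```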

Deliberate forgetting. Some context should not persist past the immediate task. Hand the assistant a sensitive document, get the analysis you need, then clear that document from the conversation context. The platforms with proper context-clearing make this clean; the ones without it leave you to manual workflows.
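On platforms without built-in clearing, the manual workflow amounts to never letting the sensitive content touch the persistent history at all. A sketch of that pattern; the calls in the usage comment are stand-ins for whatever API you actually use.

```python
from contextlib import contextmanager

@contextmanager
def ephemeral_context(history: list[dict], sensitive_text: str):
    """Yield a message list that includes the sensitive document, then discard it.

    The persistent `history` is never mutated; the sensitive content lives only
    inside the `with` block and is never written anywhere afterwards.
    """
    scratch = history + [{"role": "user", "content": f"Analyze this document:\n\n{sensitive_text}"}]
    try:
        yield scratch
    finally:
        scratch.clear()  # belt and braces; the list goes out of scope here anyway

# Usage sketch (`client.chat` and `summarize` stand in for your own calls):
# with ephemeral_context(history, contract_text) as messages:
#     answer = client.chat(messages)
# history.append({"role": "assistant", "content": summarize(answer)})  # only the distilled result persists
```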

Persona reaffirmation. Now and then, restate the persona and constraints the assistant is operating under. The fastest version is "remember, you are X, you do Y, you don't do Z." The persona drift over weeks is real; the reaffirmation costs a few tokens and prevents it.
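If you drive the model through an API rather than a chat UI, the reaffirmation can be automatic. A sketch, with an assumed persona string and an arbitrary refresh interval:

```python
PERSONA = (
    "You are a terse technical reviewer. You push back on weak claims, "
    "you do not flatter, and you flag anything you are not sure about."
)  # assumed persona text; substitute your own

def with_persona_refresh(history: list[dict], turn_count: int, every: int = 20) -> list[dict]:
    """Re-send the persona as the latest system message every `every` turns.

    Costs a few tokens per refresh; counters the slow drift that long histories cause.
    """
    if turn_count % every == 0:
        return history + [{"role": "system", "content": f"Reminder: {PERSONA}"}]
    return history
```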

The platform shape that's emerging

The pattern across the platforms doing this well: explicit, named memory primitives. Claude's MCP-memory servers let you read, write, and forget memories as first-class operations. OpenAI's saved memories surface lets you list and delete what's there. The platforms without explicit primitives leave you to roll your own, which most people don't, which is why most people's assistants slowly degrade in quality without them noticing.

The longer-term direction: memory becomes a separate concern from inference. The model is one piece, the memory layer is another, the retrieval layer is a third. Each can be operated and maintained on its own. We're partway there; the wiring isn't all standardized yet, and the people who treat memory as its own infrastructure outperform the ones who treat it as a side effect of conversation.
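In code, the separation is just three narrow interfaces that an assistant composes; a sketch of the shape, not of any particular platform's API:

```python
from typing import Optional, Protocol

class Model(Protocol):
    def complete(self, messages: list[dict]) -> str: ...

class MemoryStore(Protocol):
    def write(self, key: str, text: str) -> None: ...
    def read(self, key: str) -> Optional[str]: ...
    def forget(self, key: str) -> None: ...

class Retriever(Protocol):
    def search(self, query: str, k: int = 5) -> list[str]: ...

class Assistant:
    """Composes the three concerns; any one can be swapped or maintained alone."""
    def __init__(self, model: Model, memory: MemoryStore, retriever: Retriever):
        self.model, self.memory, self.retriever = model, memory, retriever
```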

What I do in my own setup

The actual hygiene routine for my home setup:

  • Weekly reset on the daily-work conversation. Monday morning I start a new conversation, hand it a quick brief on what's active, let the prior week's context fall off.
  • Explicit memory commits at end-of-day. A 5-minute end-of-day note where I tell the assistant what to remember about the day's work, decisions made, things to follow up on, blockers parked.
  • Separate retrieval indexes for finished work vs. working drafts. Default retrieval is from finished. Working-draft retrieval is opt-in by query.
  • Sensitive-context wipes. When I hand it something I don't want it to remember (a PDF I'm reviewing, a contract draft, a credential rotation), I clear the context after the task and tell it to forget.
  • Monthly persona check. Once a month, I read a few exchanges with the assistant looking for drift. If the answers feel off-tone, I reaffirm the persona.

None of this is exotic. The reason it's worth writing down is that the platforms don't do it for you and the failure modes pile up quietly. The people who've been using these tools daily for a year-plus have all converged on some version of these habits; the ones who haven't are slowly noticing that their assistants got worse without understanding why.

The bigger frame

Memory is the part of the AI stack that's still mostly amateur in production. The frontier-tier models are well-engineered; the inference platforms are mature; the agent surfaces are improving. Memory is where the most preventable degradation happens, and where the biggest gains from a few good habits are available.

It's not glamorous work. The same systems-engineering discipline I pointed at in the prompt-architecting piece extends to memory too: think about it carefully, give it the same care you'd give any working part of your stack, treat hygiene as a practice rather than a one-time setup. The tools that make this easy are improving; the habits have to stay ahead of the tools.

Daily AI work without memory hygiene is daily AI work that gets worse over time. The investment to keep it clean is small. The return is staying useful.