The MVP question for an AI product

Most AI MVPs over-build the model and under-build the product loop. Here's the question that fixes that, and the eighteen pieces that follow it.

A friend who runs a small consulting practice asked me last month, "If I wanted to turn what I do into an AI product, what would I actually build first?" She's a marketing strategist. She's been doing brand-positioning work for fifteen years. She has a stack of client engagements, a few decks she reuses, and a way of asking three questions in a discovery call that gets people to admit what their brand actually is. That last part (the three questions, the way she listens, the gentle reframes) is the thing she sells. Everyone she talks to has the same problem. None of them know how to package it.

She wanted to know what "MVP" looks like for that.

[Diagram: the MVP loop for an AI product. Customer input → AI loop (triage → diagnose → resolve) → outcome + feedback, which feeds back in. The model is rented. The loop is the product.]

I've been answering some version of this question once a week for a year now. The sales consultant with a discovery framework. The IT ops person with a triage tree they've been refining since 2014. The legal pro with a contract-review checklist that catches the three things outside counsel always misses. Everyone has secret sauce. Almost nobody has a working AI product yet. The gap between those two states is not technical. It's a product gap, and it's almost always the same gap.

So I'm writing eighteen pieces on it. This is the first.

What "minimum viable" actually means here

When people say MVP for a normal software product, they mean "the smallest thing a customer would pay for, even if it's ugly and slow." That definition still works when the product happens to use AI. The trap is that the AI part is shiny and complicated, and the team gets pulled into making the model better instead of making the product real.

I have watched this happen so many times.

A team will spend three months polishing a fine-tune (this is when you take a general-purpose AI and train it on your own examples, if you want to look it up later). They'll get the model from 73% accurate on their internal test set to 81% accurate. Meanwhile they don't have an auth system, the customer can't see why the AI gave a particular answer, there's no way to push a fix when the AI is wrong, and the consultant whose secret sauce this is supposed to be has no way to teach the system anything new. The model is great. The product doesn't exist.

The question I'd actually start with is: what does one full loop of customer use look like?

Customer arrives. Customer asks. AI gives a draft answer. Consultant reviews it. Consultant approves, edits, or rejects. Customer sees the final answer. The system remembers what happened so it can do better next time. That's the loop. Everything else is decoration until that loop runs end to end.

Most AI MVPs I see have a fancy model in the middle and the rest of the loop drawn in pencil. The MVP question is which parts of that loop you can build cheaply and which parts you have to build well from day one.
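
If it helps to see the shape concretely, here's the whole loop as a few lines of Python. Every name in it is invented for illustration (there is no prescribed API here); the point is how small the loop is, and how little of it the model occupies.

    # One full loop, as code. All names are illustrative, not a real API.
    from dataclasses import dataclass
    from enum import Enum

    class Decision(Enum):
        APPROVE = "approve"
        EDIT = "edit"
        REJECT = "reject"

    @dataclass
    class LoopRecord:
        """What the system remembers so it can do better next time."""
        question: str
        draft: str
        decision: Decision
        final: str | None

    def run_one_loop(question, model, consultant, memory):
        # `model` and `consultant` are stand-ins for whatever you wire in:
        # a rented LLM behind `draft_answer`, a human behind `review`.
        draft = model.draft_answer(question)         # AI gives a draft answer
        decision, edited = consultant.review(draft)  # approve, edit, or reject
        if decision is Decision.APPROVE:
            final = draft
        elif decision is Decision.EDIT:
            final = edited
        else:
            final = None                             # rejected: nothing ships
        memory.append(LoopRecord(question, draft, decision, final))
        return final                                 # what the customer sees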

The bit that surprises people

Here's the part that surprises people: the model is not where the value is. The model is a commodity. You can rent a very good general-purpose model from AWS or Anthropic or OpenAI by the token. The reason your product is worth money is the consultant's judgment, not the underlying AI.

So the MVP question becomes: how do I capture the consultant's judgment and put it in the loop?

That's a different question than "how do I make the model better." It's a much more useful question. It changes what you build first. You stop polishing the model and start building the parts of the system that capture, version, and apply the consultant's expertise.

I'll write the next piece on what "captured judgment" actually looks like in practice: annotated examples, decision rules, escalation paths, the failure modes the consultant has personally lived through. For now, just hold the idea: the secret sauce is what the consultant knows. The product is the loop that uses it.
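
Until then, here's one plausible shape for that material as plain data, sketched in Python. The field names are mine, made up for this post; what matters is that captured judgment is content you can store, version, and apply.

    # One plausible shape for captured judgment. Field names are placeholders;
    # the next piece covers the actual substance.
    from dataclasses import dataclass, field

    @dataclass
    class DecisionRule:
        condition: str   # e.g. "more than 40% of the portfolio in one sector"
        action: str      # what the consultant does when the condition holds
        rationale: str   # why they do it, in their own words

    @dataclass
    class CapturedJudgment:
        # (customer input, the consultant's annotated answer)
        annotated_examples: list[tuple[str, str]] = field(default_factory=list)
        decision_rules: list[DecisionRule] = field(default_factory=list)
        # e.g. "anything touching tax law goes straight to a human"
        escalation_paths: list[str] = field(default_factory=list)
        # mistakes the consultant has personally lived through
        failure_modes: list[str] = field(default_factory=list)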

Want the version that's just about the capture path? I wrote a companion piece on what actually trains the AI: same idea, more depth on the captured-material side.

The architecture is the same. The sauce is yours.

Here's the framing I keep coming back to. Take three different consultants:

  • A financial advisor who does portfolio diagnosis. Client uploads their holdings. The system flags the things this advisor would flag: concentration risk, the specific funds they think are overpriced, the tax-lot mistakes they've seen ruin people.
  • A product manager offering decision-coaching as a service. Client describes a roadmap dilemma. The system applies this PM's framework: the questions she'd ask, the second-order effects she'd surface, the way she pushes back when someone is solving the wrong problem.
  • A medical specialist doing second-opinion review. Patient uploads their workup. The system flags what this doctor would flag: the diagnostic possibilities the first physician didn't consider, the specific tests that would discriminate between them.

Three completely different markets. Three completely different secret sauces. The architecture underneath them is the same. The same auth. The same data store. The same retrieval layer. The same review queue. The same audit log. The same pattern for capturing judgment, fine-tuning a small model on it, and pushing the result back into the live loop.

If you have built one of these well, you have built all of them. What changes is the captured material: the examples, the rules, the voice, the failure modes.
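
If you want that claim as code, here's a deliberately toy sketch (every name below is invented): one builder function, three products, and the only thing that varies is the sauce you pass in.

    # Same architecture, different sauce: one builder, three products.
    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class SaucePack:
        """The per-vertical captured judgment, however you end up storing it."""
        vertical: str
        system_prompt: str  # the consultant's rules and voice, distilled

    def draft_with_model(question: str, sauce: SaucePack) -> str:
        # Stand-in for the rented model call, with the sauce applied.
        return f"[{sauce.vertical} draft for: {question}]"

    def build_product(sauce: SaucePack) -> Callable[[str], str]:
        """Same auth, data store, retrieval, review queue, and audit log
        underneath every vertical; only the sauce changes."""
        def handle(customer_input: str) -> str:
            draft = draft_with_model(customer_input, sauce)
            # ... same review queue, same audit log, same everything else ...
            return draft
        return handle

    portfolio_diagnosis = build_product(SaucePack("portfolio diagnosis", "flag what this advisor flags"))
    decision_coaching = build_product(SaucePack("decision coaching", "apply this PM's framework"))
    second_opinion = build_product(SaucePack("second-opinion review", "flag what this doctor flags"))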

This is the framing for the whole series. The architecture is general. The sauce is yours.

I'm going to use this as a structural device. Each piece in the series picks 2-3 different consultant verticals as illustrations. Not because consultants are the only audience (anyone shipping an AI product will recognize the pattern) but because consultants make the secret-sauce question concrete in a way that "enterprise customer" doesn't.

What this series covers

Eighteen pieces, daily for two weeks. Two pieces a day for the first stretch because the first few topics are tightly coupled. The shape:

  • Pieces 1-2 (this one and the next): the framing. What an MVP means for an AI product, and what the captured judgment that drives it looks like.
  • Pieces 3-5: the split between cloud and local compute, what each is for, and the actual shapes you'd build on each side.
  • Pieces 6-8: the foundation layers. Auth, multi-tenancy, the data store, and retrieval (this is called RAG, retrieval-augmented generation, if you want to look it up later).
  • Pieces 9-12: the product loop itself. Triage, diagnose, resolve; prompts as code; evals; and the human-in-the-loop approval gate that goes away once the system earns trust.
  • Pieces 13-15: the operational realities. Observability, audit, graceful degradation when something fails, and the cost model.
  • Pieces 16-17: deployment shape and the hybrid sync pattern. How the cloud side and the local side actually talk to each other.
  • Piece 18: the closer. What I'd cut from day one. What I wouldn't. What the first thirty days of customer use will teach you.

Then four more pieces on operating the product in year one: model selection, onboarding new consultants, the two surfaces (customer and consultant) that share one backend, and the pricing-model decision tree.

What I'd want you to do with this

Don't try to absorb the whole architecture in a sitting. Each piece stands on its own. If you're a consultant thinking about productizing your sauce, the early pieces (1, 2, 8) are the ones that matter most. If you're a small-team founder building an AI product right now, the cloud and ops pieces (4, 6, 13) are probably where you'll find the holes in what you've already started.

If you're a curious person who wants to know what "AI product" actually means under the marketing, you can read the whole thing as a tour.

The thing I want anyone to take from this first piece is the reframing. Stop optimizing the model. Start building the loop. The model is a rented engine. The loop is your product. The captured judgment is what makes it worth paying for.

A small note on what this series isn't

A few things this series will deliberately not be.

It will not be a tutorial. There's no "step 1, run this command" pattern in any of the eighteen pieces. The reason is that the actual commands change every six months as cloud services and tooling shift, and a tutorial written in May 2026 will be wrong by the time someone reads it in September. The architecture, the trade-offs, and the questions you have to answer for yourself, those don't go stale nearly as fast.

It will not be a sales pitch for AWS, or for Apple, or for any specific model vendor. I'm using AWS for the cloud examples because I know it well and the managed-service shape is mature there. Most of the architecture maps cleanly to GCP or Azure if that's where you live. The Mac Studio recommendation is real: Apple Silicon's unified memory makes it the best per-dollar machine for local AI work right now. But if the right answer for your team is a Linux box with a discrete GPU, the same principles apply.

It will not pretend any of this is easy. Building an AI product that customers actually pay for is hard. Most of what's hard is not the AI. Most of what's hard is the same stuff that's hard about any product: getting the loop right, getting the trust right, getting the boring infrastructure right so that the interesting parts have somewhere to stand.

What this series will be: opinionated, specific about the shapes I've found that work, honest about where I've gotten things wrong, and structured so you can pick the pieces you need without reading the whole thing top to bottom.

Tomorrow's piece is on capturing that judgment. What it looks like, what it isn't, and the small mistake almost everyone makes when they try to write it down.

Where's this all going? The closer in the series (what I'd cut, what I'd keep) is the piece that ties this back to a concrete day-30 cutline. It's worth jumping ahead to if you want the punchline before the build.