Onboarding new tenants: the five-minute path from signup to working AI

When a consultant signs up, how do they get from 'I have secret sauce' to a live AI surface in five minutes? Onboarding as a first-class feature.

Last week's piece was about picking the right Bedrock model on evidence. This week is about the moment before any of that matters: a new consultant has just signed up and is staring at a blank tenant.

This is the moment every AI product gets wrong. The marketing site promised "your AI assistant, trained on your expertise." The signup flow took 90 seconds. Then the new tenant lands on a dashboard that says "Upload your knowledge base to begin" and they have no idea what that means, no idea what shape the upload should take, no idea whether the thing they have on Google Drive is the right thing, and no idea what'll come out the other end.

[Figure: Five-minute onboarding path. 1. Sign up (30s); 2. Pick starter pack (45s); 3. Upload examples (2m); 4. Configure persona (1m); 5. Launch live. Working AI surface in five minutes: onboarding is a feature, not a checklist.]

If their first session takes more than about five minutes to produce something they can show another human being, they're gone. Not "churned in week two" gone. Gone today, before they ever come back.

So the onboarding flow is not a thing you bolt on after the product works. It is the product, for the first session. Everything I said in piece #2 about captured judgment being the value is still true, but the customer can't see it on day one. What they can see is whether the thing they typed produced a useful-looking output. That's the deliverable for minute five.

What "working AI in five minutes" actually means

Let me pin this down because it's tempting to weasel out of.

Five minutes from "I clicked sign up" to "I can paste a question into my surface and get an answer that sounds like me, on a topic I care about, using examples I gave it." Not a demo with somebody else's data. Not a generic chat that could have come from raw Claude. Their voice. Their topic. Their examples. Working.

This is hard. The architecture from the MVP series helps: Cognito auth, tenant-scoped data, RDS+pgvector for retrieval, and Bedrock for inference are all already wired (see pieces #6 and #7). But the spine doesn't bootstrap the consultant's content. That's the problem.

The trick I've landed on: starter packs plus a guided first pass.

How the five-minute path actually works

Four screens. That's the budget.

Screen one: pick your vertical. Sales discovery, marketing positioning, product decision-coaching, IT-ops triage, contract review against a playbook, second-opinion medical review, portfolio diagnosis, interview rubric, resume coaching. Pick one. This is not "what's your job title." This is "which of the prebuilt starter shapes is closest to what you do." It seeds everything downstream.

Screen two: import what you've already got. Three buttons: upload files (PDF, DOCX, MD), paste text, or connect a source (Google Drive, Notion, Dropbox, whatever I've wired up). The customer drops in anywhere from one document to 50. The system doesn't care which yet; it just needs something to embed.

Screen three: shape your voice. Three sliders or three short prompts: how formal, how long, how much hedging vs. how directive. Plus a free-form "Anything I should know about how you talk to clients?" field. This is the persona-shaping step. To the customer it's three settings; behind the scenes it's a more structured object that shapes prompts and retrieval downstream. (I'm being deliberately vague about that structure: there's a patent boundary I'm staying behind.)

Screen four: try it. A sample question (auto-generated from their starter-pack vertical) sitting in a prompt box, with a "Run" button. They click. Five to fifteen seconds, and an answer comes back. It's not perfect, but it's clearly theirs: it cited one of the documents they uploaded, it used the voice settings they picked, and it sounded like the work they actually do.

That's the five-minute moment. Everything from here is iteration.

The starter-pack trick

The thing that makes screens one through four cost about five minutes instead of five days is the starter pack.

For each vertical, I ship a curated bundle. Think of it like a default kit. It's got:

  • A reference corpus of generic-but-realistic examples (anonymized, public-domain, or synthetic) for the vertical. Eight to twelve documents. Enough to seed the embeddings before the consultant's own material lands.
  • A baseline persona shape (formal-but-warm, mid-length, low-hedge) that's a sensible default for that vertical.
  • A starter prompt template wired into the right retrieval pattern. (If you want to read up, the pattern is RAG, retrieval-augmented generation: the system pulls relevant context from your stored material before asking the model.)
  • A first sample question pre-filled so the customer doesn't have to think of one.
  • Three "next things to try" prompts so once the first question works, there's a path forward.
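One way the bundle above could be modeled as data. This is a sketch, not the production schema: the class name, the field names, the `{context}` placeholder convention, and the `validate` checks are all assumptions made for illustration. The persona field is left as a plain dictionary on purpose, since the real structure isn't described here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class StarterPack:
    """Hypothetical shape for one vertical's default kit."""

    vertical: str
    reference_docs: tuple[str, ...]   # eight to twelve generic example documents
    baseline_persona: dict[str, str]  # default voice settings (structure assumed)
    prompt_template: str              # must leave a slot for retrieved context
    sample_question: str              # pre-filled on screen four
    next_prompts: tuple[str, ...]     # three "next things to try"

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the pack looks shippable."""
        problems: list[str] = []
        if not 8 <= len(self.reference_docs) <= 12:
            problems.append("expected 8-12 reference documents")
        if len(self.next_prompts) != 3:
            problems.append("expected exactly three follow-up prompts")
        if "{context}" not in self.prompt_template:
            problems.append("prompt template has no {context} slot for retrieval")
        if not self.sample_question.strip():
            problems.append("sample question is empty")
        return problems
```

Running `validate()` when a pack is authored, rather than at signup, keeps a broken pack from ever reaching a new tenant's screen four.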

The starter pack is what gets shown on screen four if the consultant uploaded nothing. It's also what fills the gaps if they uploaded a little. As they add more of their own material, the starter content gets demoted in retrieval weighting and then eventually pulled. The pack is scaffolding.
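The demote-then-pull behavior can be sketched as a re-ranking step over vector-search hits. Everything specific here is an assumption: the `Hit` type, the linear decay, the 0.8 starting weight, and the 200-chunk retirement threshold are illustrative numbers, not the real weighting.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    chunk_id: str
    similarity: float  # e.g. cosine similarity returned by the vector search
    is_starter: bool   # True if the chunk came from the starter pack


def starter_weight(own_chunk_count: int, retire_at: int = 200) -> float:
    """Weight for starter chunks: 0.8 with no tenant content, falling linearly to 0."""
    if own_chunk_count >= retire_at:
        return 0.0
    return 0.8 * (1 - own_chunk_count / retire_at)


def rerank(hits: list[Hit], own_chunk_count: int, top_k: int = 5) -> list[Hit]:
    """Demote starter-pack hits as the tenant's own corpus grows, then drop them."""
    weight = starter_weight(own_chunk_count)
    scored = []
    for hit in hits:
        if hit.is_starter:
            if weight == 0.0:
                continue  # starter content has been pulled for this tenant
            scored.append((hit.similarity * weight, hit))
        else:
            scored.append((hit.similarity, hit))
    # Sort on the score only, so Hit objects never need to be comparable.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hit for _, hit in scored[:top_k]]
```

With an empty tenant corpus, starter chunks still surface because nothing else competes with them. Once the tenant has enough of their own chunks, the same call returns only customer content.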

The starter pack is a stand-in for the captured judgment from piece #2. The pack gets the surface working; the consultant's real material is what makes the surface theirs. The first hour is scaffolding; the first month is replacement.

Three verticals, three first sessions

Marketing strategist. Picks "marketing positioning." Drops in eight case studies they wrote for past clients, two of their own writeups on their positioning approach, and a slide deck. Sets the voice sliders: high formality (they work with B2B), medium length, low hedge ("just tell them what I think"). Screen four asks a sample question: "What positioning angle would you recommend for a 50-person dev tools company entering a crowded category?" The answer comes back grounded in two of their case studies plus a starter-pack one (clearly marked), using their voice settings. Total time: 4m 20s. The first thing they do next is paste in a real client situation and ask for real advice.

HR consultant packaging an interview rubric. Picks "interview rubric." Uploads their rubric document, a few sample interview notes, and a one-page philosophy doc. Voice settings: warm but direct, medium length, decisive. Screen four shows a sample candidate writeup with the rubric applied, partly using their actual rubric, partly using a starter-pack scaffold for sections they didn't upload. They immediately spot a category they didn't include in their upload and add it. The product just told them something about their own work.

Financial advisor doing portfolio diagnosis. Picks "portfolio diagnosis." Drops in their portfolio-review template, a few anonymized prior diagnoses, and a brief writeup of their philosophy. Voice settings: high formality, longer responses, conservative-hedged ("when uncertain, flag, don't bet"). Screen four runs a sample portfolio against the prior diagnoses and the template. The output reads like their own writeup. They immediately notice their template doesn't ask about liquidity needs explicitly enough and make a mental note to revise it.

Three verticals, three different starter packs, three different voice profiles, three different surfaces by minute five. Same architecture spine.

What happens on the back end during those five minutes

Curious-reader summary: the system is doing a lot, fast, and most of it is invisible.

The technical version, for anyone running this:

  • Tenant provisioning. Cognito creates the user. RDS gets a tenant row with row-level security scoped to that tenant from the first query. (This is the "do it on day one" point from piece #6.) Zero "we'll add this later."
  • Starter-pack seeding. Starter documents for the chosen vertical get embedded into the tenant's pgvector store and marked as starter-pack-origin. They retrieve but at a downweighted rank.
  • Upload + embed pipeline. Customer uploads hit S3 and kick an EventBridge event; an embed Lambda then chunks and embeds the content into pgvector under the tenant scope. Streaming progress is shown to the customer.
  • Persona shape. The three voice settings get stored in a structured way and bound into the prompt template for that tenant. (Specifics deliberately left vague, patent boundary.)
  • First inference. Sample question hits the same triage→diagnose path the production app will use. Haiku triages, Sonnet diagnoses with retrieval from the tenant's pgvector (mix of customer content and starter pack), output streams back. The router from last week's piece is doing its job from minute one.
  • Audit row. Every step gets logged to the audit table from piece #13. Yes, even during onboarding. Especially during onboarding.
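As a rough illustration of the chunking half of the seeding and upload steps above, here is a pure-Python sketch that splits text into overlapping chunks and tags each one with its tenant and origin. The record fields, the character-based chunk size, and the overlap are assumptions; the embedding call and the pgvector write are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkRecord:
    tenant_id: str    # every row carries its tenant from the start
    source_key: str   # hypothetical object key for the uploaded file
    chunk_index: int
    text: str
    origin: str       # "customer" or "starter_pack"


def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text into overlapping character windows after collapsing whitespace."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    text = " ".join(text.split())
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the window already reached the end of the text
        start += max_chars - overlap
    return chunks


def build_records(tenant_id: str, source_key: str, text: str,
                  origin: str = "customer") -> list[ChunkRecord]:
    """Produce tenant-scoped chunk records ready to be embedded and stored."""
    return [
        ChunkRecord(tenant_id, source_key, index, chunk, origin)
        for index, chunk in enumerate(chunk_text(text))
    ]
```

Carrying `origin` on every record is what lets a later retrieval step tell starter-pack chunks from customer chunks without a second lookup.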

Five minutes wall-clock. A lot of moving parts. The customer sees none of them, which is the point.

The trap I keep almost falling into

The temptation, every time, is to make screen four "better" by making it ask the consultant for more upfront. More documents. More voice calibration. A multi-step tone interview. "Just five more minutes to really tune this." It feels like quality investment.

It is not. It is churn manufacturing.

The consultant doesn't know what they don't know yet. They've never used a product like this. The only way they figure out which of their material matters is by using the surface with what they uploaded already, seeing what it gets wrong, and adding the missing piece. Iteration with their real material beats upfront perfection every single time.

So the rule I hold: screen four happens by minute five, even if the output isn't great yet. Get them to the surface. Let them see it working. Then the next 30 days is "you noticed it didn't handle X, here's where to drop in your X material." That second loop is where the secret sauce actually lands.

This is the same shape as the day-one approval gate from piece #12. You ship the loop early, knowing it's not yet good, and the loop itself produces the data that makes it good.

How I know onboarding is working

Three numbers I watch.

Time-to-first-output. Median from "clicked sign up" to "saw a generated answer." Target is five minutes; I get alerted if the 75th percentile climbs above eight.
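A minimal sketch of that check, assuming the per-tenant durations are already available as a list of seconds. The function name and the returned keys are made up for illustration; the five-minute target and eight-minute alert threshold come from the text above.

```python
from statistics import median, quantiles

ALERT_P75_SECONDS = 8 * 60  # alert when the 75th percentile climbs above eight minutes


def time_to_first_output_stats(seconds: list[float]) -> dict:
    """Summarize signup-to-first-answer durations and flag a slow 75th percentile."""
    if len(seconds) < 2:
        raise ValueError("need at least two samples to compute quartiles")
    # quantiles(n=4) returns the three quartile cut points; the last one is p75.
    _, _, p75 = quantiles(seconds, n=4, method="inclusive")
    return {
        "median_s": median(seconds),
        "p75_s": p75,
        "alert": p75 > ALERT_P75_SECONDS,
    }
```

Watching the 75th percentile rather than the median means the alert fires when a meaningful slice of new tenants is having a slow first session, even if the typical one is still fine.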

Day-one engagement after the sample. Did they paste in a second question after the auto-generated one? If yes, the surface earned trust. If no, the sample didn't land and I look at what went wrong for that vertical.

Week-one material adds. Did they come back and upload more of their actual content? This is the leading indicator of long-term retention. Tenants who add material in week one keep paying. Tenants who don't, don't.

I don't watch DAU in the first week. I watch material adds.

If you're shipping this

One thing this week: time your own onboarding flow with a stopwatch. From "click sign up" to "see a generated output that uses my actual content." If it's more than five minutes, find the screen that's eating the time. Almost always, it's a screen asking the customer to do work the product could have done for them with a starter pack.

The next piece in this series is about the other half of this product: once the consultant is onboarded, they're not just a customer, they're a supervisor of their own AI. Which means there are actually two distinct surfaces sitting on the same backend: the customer view (ask a question, get an answer) and the consultant view (approve, deny, mine patterns, tune). Next week.