Vendor lock-in in the AI era is worse than 2010 cloud lock-in

Cloud lock-in in 2010 was bad. AI lock-in in 2025 is worse for reasons most teams aren't thinking about. The data, the prompt patterns, the agentic surface, the fine-tunes, none of it ports cleanly. Worth being clear about why before you commit.


The cloud lock-in conversation circa 2010 was about data gravity, API differences, and proprietary services. Move from AWS to Azure and you'd rewrite your IAM, your storage layer, your queueing, your monitoring. The good architects built abstraction layers; the rest accepted lock-in as the cost of velocity. Either way, the cost of switching was bounded by the size of your codebase.

AI vendor lock-in in 2025 is worse. The data gravity is real. The API differences are real. The proprietary services are real. On top of those, AI lock-in adds dimensions that didn't exist in the cloud-only era: prompt-fit lock-in, agentic-surface lock-in, fine-tune lock-in, evaluation-suite lock-in. Add it all up and the cost of switching is meaningfully higher than the equivalent switch in the cloud era.

What locks you in, 2010 vs 2025

| 2010, cloud | 2025, AI |
| --- | --- |
| your data (egress fees) | prompt schemas |
| vendor-specific APIs | fine-tuned weights |
| managed service shapes | tool integrations + eval suites |
| IAM model | agent memories + workflows |
| | the model you trained against |

2010 lock-in was about your data. 2025 lock-in is about your model, your prompts, and how your agents think.

Worth being clear about why, because most teams committing to AI vendor relationships in 2025 are using a 2010-shaped mental model of what the commitment costs.


The new dimensions of lock-in

Five categories of lock-in that are specific to AI vendors:

Prompt-fit lock-in. Every model has a personality. The prompts that work well on Claude don't always translate to GPT-5 or Gemini. Production systems pile up hundreds of carefully-tuned prompts that depend on the specific model's response patterns. A vendor switch means re-tuning all of them, with no automated path. Estimate: a serious AI app has 100-500 prompts that all need re-validation on a new model.
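One way to keep that re-validation cost visible is to track, per prompt, which models it has actually been validated against. A minimal sketch (the registry, prompt names, and model IDs here are all hypothetical, not any vendor's API):

```python
from dataclasses import dataclass, field

@dataclass
class PromptRecord:
    name: str
    template: str
    # Model IDs this prompt's behavior has been validated against.
    validated_on: set = field(default_factory=set)

class PromptRegistry:
    """Tracks which prompts have been validated on which models."""
    def __init__(self):
        self._prompts = {}

    def register(self, name: str, template: str) -> None:
        self._prompts[name] = PromptRecord(name, template)

    def mark_validated(self, name: str, model_id: str) -> None:
        self._prompts[name].validated_on.add(model_id)

    def revalidation_backlog(self, model_id: str) -> list:
        """Prompts not yet validated on model_id: the work a switch creates."""
        return sorted(p.name for p in self._prompts.values()
                      if model_id not in p.validated_on)

registry = PromptRegistry()
for name in ("summarize", "classify_ticket", "extract_fields"):
    registry.register(name, f"<{name} template>")
    registry.mark_validated(name, "vendor-a/model-x")

# Switching vendors puts every prompt back in the backlog.
print(registry.revalidation_backlog("vendor-b/model-y"))
```

At 100-500 prompts, the backlog this surfaces on a candidate model is the honest size of the prompt-fit switching cost.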

Agentic-surface lock-in. The agent SDKs each vendor ships (OpenAI Agents, Anthropic Agents, Google ADK, AWS Strands) don't talk to each other. A team that built their agentic system on one vendor's SDK has to rebuild the agent surface on another vendor's SDK if they switch. The thin-abstraction-layer pattern helps but doesn't make the cost go away.

Fine-tune lock-in. Custom fine-tunes on OpenAI don't run on Anthropic. Custom fine-tunes on Anthropic don't run on OpenAI. The fine-tune investment doesn't port. For shops that use fine-tunes meaningfully, this is the largest single lock-in cost: re-fine-tuning is expensive, and the curated training data may not transfer cleanly.

Evaluation-suite lock-in. Production AI systems have evaluation suites that score outputs against criteria. The suite is tuned to the model's specific failure modes; switching the model changes the failure modes; the suite needs to be re-validated. The eval-suite investment is the part of AI infrastructure most easily underestimated and most expensive to redo.

Memory-and-state lock-in. Conversation history, the user-memory store, the per-user fine-tuning context: the vendor-specific shape of each varies enough that migration is non-trivial. OpenAI's saved memories, Anthropic's MCP memory primitives, the proprietary versions in various platforms: none of it ports cleanly.

The dimensions that haven't changed

Some lock-in axes that are still the same as the 2010 cloud era:

Data gravity. Where your training data lives, where your indexed corpus lives, where your application data lives, these all still pull you toward the vendor that hosts them.

API differences. Different SDKs, different request shapes, different response shapes. Standard cloud-era stuff. Manageable with abstraction layers.

Service proprietary-ness. Bedrock-specific knobs, Vertex-specific knobs, OpenAI-specific knobs. Same shape as the cloud era; manageable with the same abstractions.

These haven't gotten worse. They're the floor of the lock-in cost; the AI-specific dimensions sit on top.

Why MCP doesn't fully solve it

The MCP standardization push addresses one axis of lock-in: the tool-integration surface. MCP servers are vendor-neutral; tools you've built as MCP servers survive a vendor switch. That's real progress.

What MCP doesn't solve: the prompt-fit, the fine-tunes, the evaluation suite, the agent-specific framework choices. The tool layer is portable; the model-interaction layer above it isn't. MCP is the most important interoperability standard in the AI space and it solves maybe 20-30% of the actual lock-in cost.

What you can actually do about it

Five practices that meaningfully reduce AI vendor lock-in:

Build the thin abstraction layer. Wrap every model call in a small interface that hides vendor specifics. Same pattern as cloud abstractions; harder because the responses vary in shape and quality, not just format. Worth doing for any non-trivial production AI system.
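A minimal sketch of the pattern, assuming nothing about any real vendor SDK (the `StubAdapter` and model IDs below are placeholders for per-vendor adapters you'd write):

```python
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Completion:
    """Vendor-neutral response shape the rest of the app depends on."""
    text: str
    model: str

class ChatModel(Protocol):
    def complete(self, system: str, user: str) -> Completion: ...

class StubAdapter:
    """Stands in for a per-vendor adapter. A real one would call the
    vendor's SDK and normalize its response into Completion."""
    def __init__(self, model_id: str):
        self.model_id = model_id

    def complete(self, system: str, user: str) -> Completion:
        return Completion(text=f"stub reply from {self.model_id}",
                          model=self.model_id)

def summarize(model: ChatModel, document: str) -> str:
    # Application code sees only the ChatModel interface; swapping
    # vendors means swapping the adapter, not rewriting call sites.
    return model.complete("Summarize the document.", document).text

print(summarize(StubAdapter("vendor-a/model-x"), "quarterly report text"))
```

The interface can't hide quality differences between models, but it does confine the vendor-specific request and response shapes to one adapter per vendor.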

Use MCP for tools, religiously. Every tool the agent calls should be exposed via MCP rather than as a vendor-specific tool definition. The marginal effort is small; the portability gain is large.

Keep fine-tunes minimal. The cost of re-fine-tuning on a new vendor is high. Prefer prompt-engineering and retrieval over fine-tuning whenever the quality bar can be met without it. Save fine-tunes for the cases where they're genuinely required.

Maintain a vendor-neutral evaluation suite. Score model outputs against criteria you define, not against vendor-specific benchmarks. The suite that runs against any vendor's model is the suite that protects you when you need to switch.
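The key property is that the criteria are predicates you define over plain output text, so the same suite scores any vendor's model. A minimal sketch with made-up criteria and test cases:

```python
# Criteria are plain predicates over model output, defined by us,
# independent of any vendor's benchmark or SDK.
def max_length(limit):
    return lambda out: len(out) <= limit

def must_mention(term):
    return lambda out: term.lower() in out.lower()

def must_not_mention(term):
    return lambda out: term.lower() not in out.lower()

CRITERIA = [
    ("fits_summary_budget", max_length(200)),
    ("names_the_customer", must_mention("Acme")),
    ("no_apology_boilerplate", must_not_mention("as an AI")),
]

def run_suite(outputs, criteria=CRITERIA):
    """Map each test case to the list of criteria it fails."""
    failures = {}
    for case_id, out in outputs.items():
        failed = [name for name, check in criteria if not check(out)]
        if failed:
            failures[case_id] = failed
    return failures

outputs = {
    "case-1": "Acme's order shipped two days late.",
    "case-2": "As an AI, I cannot access order data for Acme.",
}
print(run_suite(outputs))
```

Because `outputs` is just a mapping from case to text, the dict can be filled from any model behind any adapter; the suite itself never touches a vendor API.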

Run periodic vendor-switch dry-runs. Once a quarter, rebuild a non-trivial workflow on a different vendor. Document what broke. The discipline keeps the lock-in cost visible and bounded; without it the cost grows silently.
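The "document what broke" step can be partly automated by running the same cases through the current and candidate models and diffing against shared checks. A sketch with stub models standing in for real adapters (all names here are illustrative):

```python
def dry_run(cases, primary, candidate, checks):
    """Run the same cases through two model callables and report
    regressions: cases where the candidate fails a check the
    primary passes."""
    regressions = {}
    for case in cases:
        p_out, c_out = primary(case), candidate(case)
        broke = [name for name, check in checks
                 if check(p_out) and not check(c_out)]
        if broke:
            regressions[case] = broke
    return regressions

# Stubs standing in for the current and candidate vendor adapters.
primary = lambda case: f"Resolved: {case} (ticket closed)"
candidate = lambda case: f"Resolved: {case}"

checks = [("mentions_ticket_state", lambda out: "ticket" in out)]
print(dry_run(["order-late", "refund-request"], primary, candidate, checks))
```

The report is the quarterly artifact: a concrete list of what the switch would break, instead of a vague sense that switching would be painful.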

These don't make lock-in go away; they keep it in the bounded-and-known category instead of the scary-and-uncalculated category.

What this means for procurement

For teams making vendor commitments in 2025, the practical implication is that the contract horizon you should be willing to commit to is shorter than the cloud-era equivalent. A 3-year cloud commitment used to be reasonable for the discount. A 3-year AI vendor commitment is questionable because the model and capability space moves fast enough that 3-year-old choices aren't optimal at year 3.

The right horizon for AI vendor commitments in 2025 is 12-18 months. Long enough to amortize the integration; short enough to be willing to revisit. The vendors that pressure you toward longer commitments are pressuring against your interest.

The other procurement implication: the cost of multi-vendor strategy is higher than it was in the cloud era, but the strategic value is also higher. The shop that's stuck on one AI vendor when a competitor ships a meaningfully better model has fewer options than the cloud-era equivalent. Multi-vendor hedging is more expensive operationally and worth more strategically.

The honest assessment

Lock-in is real. It's worse than 2010 cloud lock-in along several axes. The mitigation patterns exist and work; they take deliberate engineering investment that most shops are skipping. The shops that skip it are storing up a switching cost that comes due in 18-24 months when the vendor space inevitably shifts.

The cost of getting this wrong scales with how committed you are to the AI workload. A casual user is fine with full lock-in; the switching cost is small and the workloads are casual. A production AI system serving a business function is a different category: the lock-in cost there is the kind that gives the vendor leverage in renewals and limits design options when the model space moves.

Worth treating AI vendor selection as the design commitment it actually is, not as the SaaS-purchase decision the procurement process treats it as. The cost modeling discipline that makes spend visible should extend naturally to making lock-in visible. The same finance-and-engineering pattern that worked in the cloud era applies here, with more dimensions and higher stakes.

The 2010 cloud era taught us how this story goes. The shops that got cloud lock-in right early outperformed for a decade. The shops that got it wrong spent the decade unwinding the wrong choices. The same pattern is playing out in AI now, on a faster timeline, with more dimensions to get wrong. Worth being early on the right side.