What I said in March: what I think now

A nine-month checkpoint on the takes I had in March 2025 about how the year would shape up. Some held; some bent; some were plain wrong. Worth being explicit about which is which.


The format follows the August piece on the imprint thesis, at a shorter horizon: what I said earlier this year, what held, what bent, and what was plain wrong. Being explicit matters because the public-checkpoint discipline keeps me honest in ways that quietly revising without saying so doesn't.

What held

A few takes from March that look more right at year-end than they did when I made them:

The open-weights workhorse tier would close the gap with the closed frontier in 2025. This was the bet, and it's been substantially correct. R2 from DeepSeek was the punctuation; the trend through V3-0324, Llama 4, and the successive Qwen releases delivered. The closed-frontier shops still own the very top end; the workhorse tier is genuinely contested.

Apple Silicon's local-AI foundation would matter more than the cloud-first conversation suggested. The inflection point piece put numbers on this, and it's held: the M4-generation Mac Studio plus the open-weights tier delivered the personal-AI foundation the March piece anticipated.

MCP would standardize faster than the typical industry-standard cycle. The momentum from spring carried into real tooling by year-end. Every major vendor has shipped MCP support, the routing-layer space is forming, and tool portability is real.

The agentic-everything keynote pitch would land badly with practitioners. Microsoft Build and Google I/O both pitched agents-everywhere; the practitioner conversation about the governance gap stayed loud through the year. The gap I called out is still mostly unfilled. The market caught up to the framing.

The price floor on hosted inference would compress meaningfully. Predicted; happened. The DeepSeek-driven price moves were faster and bigger than I sketched in March. The compression continues.

That's the held list. Reasonable hit rate on the directional calls.

What bent

Calls that were directionally right but wrong on specifics:

Timing of the Claude 4 release. I expected it earlier in the year; it landed in May. Not a big miss; the directional take was right.

The Mariner / Operator / Computer Use category. I bet on this maturing into a daily-use category by year-end. The category exists and is real; the daily-use pattern is narrower than I expected. Useful for bounded tasks; not yet the general-purpose surface I sketched.

The on-prem narrative for AI workloads. I thought the on-prem case would be more widely adopted by year-end than it turned out to be. The case is correct; the inertia against moving from hosted is bigger than I weighted. The on-prem piece covers the current state.

The agent framework consolidation timeline. Predicted faster consolidation; the framework-wars continue with multiple credible options. The eventual consolidation will happen; the timeline is later than I expected.

The Apple Intelligence delivery. I was less negative on this in March than I am now. The follow-through has been weaker than I expected; the WWDC follow-through piece covers the current shape. The foundation is right; the consumer product hasn't materialized.

These are real. The directional takes were defensible; the specifics were off in ways that matter for planning.

What was plain wrong

A few calls that didn't survive contact with reality:

Multi-machine MLX training would still be experimental at year-end. I called it still research-grade; it's actually production-usable for serious workloads now. The MLX-is-real piece and the broader distributed-training-without-hyperscaler piece cover the current state. I was wrong about the maturity.

The agentic-IDE category would consolidate to two-three winners. The market is still fragmented. Cursor, Claude Code, GitHub Copilot agent mode, JetBrains AI, Cody, and several others all retain meaningful share. The consolidation is happening more slowly than I called.

Distillation would stay a research curiosity. Wrong. Distillation turned into one of the more important production patterns of the year. I underweighted it badly in March.

Vendor lock-in would be addressed by industry standards. MCP addressed part of it. The bigger lock-in dimensions (prompt-fit, fine-tunes, evaluation suites, memory primitives) are still mostly vendor-specific. The lock-in piece covers the current state. I expected more standards progress than materialized.

Price floor on workhorse-tier hosted models would stabilize by Q3. Wrong. The compression continued through Q4. The floor isn't yet stable.

These are calls I'd revise if I were writing them today. The pattern across them: I was systematically too conservative on the rate of open-weights / open-stack progress and too optimistic on the rate of standards-and-consolidation progress.

What the recalibration says about end-of-year

A few takes I'd commit to now, with the benefit of nine months of evidence:

The open-weights side is closer to closed-frontier on most workloads than the marketing suggests. Not on the very top end. On most workhorse-tier work, the gap is small enough to be a routing decision rather than a capability decision.
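What "a routing decision rather than a capability decision" means in practice can be sketched in a few lines. This is a hypothetical illustration, not a real router: the model names, the difficulty score, and the threshold are all assumptions for the sake of the example.

```python
def route_model(task_difficulty: float, needs_frontier_features: bool) -> str:
    """Pick a model tier for a task.

    task_difficulty: rough 0..1 score from a cheap upstream classifier
        (assumed to exist; not part of this sketch).
    needs_frontier_features: e.g. top-end reasoning or very long context.
    """
    # Reserve the expensive closed-frontier tier for the genuinely hard cases.
    if needs_frontier_features or task_difficulty > 0.8:
        return "closed-frontier"
    # Default to the contested open-weights workhorse tier for everything else.
    return "open-weights-workhorse"


if __name__ == "__main__":
    print(route_model(0.3, False))  # everyday task -> workhorse tier
    print(route_model(0.9, False))  # hard task -> frontier tier
```

The point of the sketch is that once the workhorse-tier gap is small, the interesting engineering question becomes where the threshold sits, not whether the open-weights side can do the work at all.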

The personal-AI foundation is materially better-built than the consumer products surfacing it. The hardware is there. The open-weights tier is there. The principled-user community is real. The consumer-friendly bridge is missing. Same point I made in the called-my-shot piece.

The governance gap is bigger and slower-closing than the keynote conversation acknowledges. Six months past my Build piece; the gap remains structurally similar. The slow-close is the actual story.

The pricing-pressure narrative compounds. Each open-weights release pulls hosted pricing down further. The trajectory is intact through year-end. I expect this continues into 2026 with no obvious circuit-breaker.

The MCP-versus-everything-else question is settling toward MCP. The MCP-only architecture piece lays out the case. Practitioner adoption is increasingly MCP-first.

These are the late-2025 takes. The 2026-prediction piece will have more committed forecasts.

What I learned from the misses

A pattern across the wrong calls: I systematically underweighted how fast open-source / open-weights / open-platform momentum compounds when the tooling is energized. The MLX maturity, the distillation pattern, the open-weights workhorse-tier closing, all of these moved faster than I called.

The lesson: when the open-source side has motivated practitioners, the rate of progress beats my intuition. The closed-source side moves at vendor cadence; the open side moves at community cadence, which, when motivated, is meaningfully faster.

I'll try to apply this to the 2026 forecasts. The open-source AI conversation has at least as much momentum heading into 2026 as it had heading into 2025. My prior is to weight that more heavily than I did.

The discipline summary

Nine-month checkpoints work. The "what I said vs what I think now" format keeps me honest in ways that re-reading my own posts without explicitly grading them doesn't. It's a practice I'd recommend to anyone making public AI predictions: the gap between the predictions and the recalibrated takes is informative.

The next checkpoint is the 2025 wrap-up and the 2026 predictions, both coming in the next few weeks. Worth doing them with the corrections from this piece in mind. The forecasts are only useful if they're grounded; the grounding requires admitting where the prior forecasts were off.

What I said in March was directionally good and specifically off in patterned ways. What I think now is better-calibrated; whether it's right is the next nine months' question.