Apple Intelligence at WWDC follow-through: what shipped vs what was promised

Three months past WWDC 2025. The Apple Intelligence pitch was real on the demo stage. The shipped version is more constrained and the timeline gap is widening rather than closing. Worth being concrete about what landed and what didn't.

WWDC 2025 was in early June. The Apple Intelligence keynote was the most-watched session at the event by some margin, and the pitch was credible: a more capable Siri, deeper third-party integration through App Intents, on-device personal context across more apps, and the long-promised agent surface that would make device-level AI actually useful for daily work. Three months later, it's worth pulling apart what actually shipped and what didn't, because the gap matters.

The honest version, three months past the keynote: the foundational pieces shipped. The headline features didn't. The pattern is the same one I wrote about in March and the same one Apple has kept repeating since Apple Intelligence was announced in 2024: the foundation is fine, the user-facing capability lags the promise.

What shipped

The list of features that are actually live in the iOS / macOS releases that have rolled out since WWDC:

Modest Siri improvements. Better single-step queries. Slightly better contextual handoffs between Siri and apps. The improvements are real but in the "noticeably less bad" rather than "major" register.

Writing tools refinements. The text-rewriting, summarization, and tone-adjustment features got revised. They're more reliable on long documents than they were a year ago. Still meaningfully behind what a hosted model produces, but useful in the cases where on-device privacy is the binding constraint.

App Intents expansion. More apps now expose more actions to the Apple Intelligence layer. The third-party developer story improved. The number of cross-app workflows that actually work has grown from "single digits" to "dozens." That's not the promised "anything in any app", but it's meaningfully more than at the start of the summer.

Smart-reply and scheduling improvements in Mail and Messages. Useful at the margins. The kind of feature you forget is AI until you notice it suggesting something good.

Image Playground refinements. Better quality, fewer weird artifacts. Still nowhere near hosted-image-generation quality. The use case it's actually good at is "quickly draw the cartoon thing for the message": narrow but real.

That's the shipped list. It's a reasonable incremental-improvement story; it's not the strategic-leap story the keynote framing promised.

What didn't ship

The list of things that were prominently demoed and haven't shipped in the time since:

The personal-context Siri overhaul. The demo where Siri seamlessly answered "when is my mom's flight landing?" by reaching into mail, messages, calendar, and notes is still not a thing. The pieces are there individually; the integration that would make it work daily is not.

The agentic cross-app surface. "Siri, do this complex multi-app workflow" is also not a thing. The plumbing is there in App Intents; the orchestration layer that would actually run the workflows isn't.

Real on-device deep reasoning. The on-device model is still the 3B-class one. The "Private Cloud Compute" tier exists but is still mostly used for the same writing-tools and summarization workloads, not for the harder reasoning the keynote suggested.

Third-party model integration. The promised ability to choose between Apple's model, ChatGPT, Claude, and Gemini at the system level has only partially shipped. The choice exists; the integration depth between the alternative model and the rest of the system is shallow. ChatGPT works in the writing-tools surface and Siri's escalation path; Claude and Gemini are still mostly waiting.

The promised story about third-party AI features. "Developers will build incredible things on top of this" was the line. The few developers who've shipped meaningful AI-augmented apps mostly haven't relied on Apple Intelligence; they've integrated their own model layer. The platform play hasn't materialized.

That's a long list of unshipped commitments three months in.

Why the gap keeps widening

The pattern across Apple Intelligence releases is that the demo bar runs ahead of what the engineering can deliver, and the gap widens because the demo bar keeps moving with each WWDC. A few reasons in the shape of the work itself:

The on-device-first commitment is real and constraining. Apple's privacy story requires that most of the AI runs on the device or on attested-private servers. That commitment is the right strategic position; it also limits what's deliverable, because the on-device model is simply much smaller than the closed frontier hosted models everyone else is competing with.

Siri's foundations are still 2014. The base Siri sits on is a decade-plus old. Modernizing it without breaking everything is hard. Apple has been doing this in steps; the rate of improvement is real but slow relative to the rate at which the AI conversation has moved.

Quality-bar enforcement. Apple ships when something is consumer-grade, not when it's demo-able. The features that shipped from WWDC 2025 are the ones that hit Apple's quality bar; the ones that didn't are still being polished or rebuilt. That discipline produces fewer regressions than the competition; it also produces the persistent gap between demo and ship.

The strategic-positioning bet on personal-data integration. Apple's bet is that the AI features that compound personal context (across mail, messages, calendar, photos, notes) are the durable differentiator. That bet might be right. The execution requires solving hard problems in privacy-preserving on-device inference and cross-app data integration that nobody else has fully solved either. The slow pace reflects real engineering difficulty.

Where this puts Apple, three months in

The competitive position is strange. Apple's hardware position for personal AI is the strongest in the industry: the Apple Silicon plus open-weights story I keep coming back to means the Mac platform supports principled personal AI better than any alternative. Apple's software position is meaningfully behind the hosted-AI competition on the user-visible features that matter to consumers.

That gap is interesting because it's the gap between "the foundation is best in class" and "the consumer experience hasn't caught up." For the principled-user population I described in the called-my-shot piece, the Apple platform is the right base: you build your own personal-AI workflows on top of Apple Silicon and capture the value Apple Intelligence isn't capturing for you. For the casual-user population, Apple Intelligence is still the visible AI feature on their device, and it's still meaningfully less impressive than what they get from ChatGPT or Claude or Gemini directly.

That fork (principled users happy with the foundation, casual users disappointed with the surface) is a real strategic problem for Apple. The disappointment compounds the longer the gap persists.

What I'd watch for the rest of the year

A few things that would tell me Apple is closing the gap rather than letting it widen:

A meaningful Siri ship by end-of-year. The personal-context Siri demo from WWDC needs to ship before WWDC 2026, or the credibility cost of two years of unshipped Siri demos becomes serious.

A bigger on-device model. The 3B-class model is the binding constraint. A 7B or 12B on-device model would meaningfully expand what Apple Intelligence can do without leaving the device. The hardware can support it; the question is whether Apple ships it.
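
To put rough numbers on "the hardware can support it," here is a back-of-the-envelope sketch of weight memory at different model sizes. The bit widths are illustrative assumptions on my part, not Apple's shipped quantization, and the function name is hypothetical. It counts weights only; KV cache, activations, and everything else the OS keeps resident come on top.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width.
    Ignores KV cache, activations, and runtime overhead.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Illustrative sizes from the text (3B today, 7B or 12B hypothetically)
# at a few assumed quantization levels.
for size in (3, 7, 12):
    for bits in (4, 8, 16):
        gb = weight_memory_gb(size, bits)
        print(f"{size}B params @ {bits}-bit: ~{gb:.1f} GB of weights")
```

At an assumed 4 bits per weight, that works out to about 1.5 GB for a 3B model, 3.5 GB for 7B, and 6 GB for 12B. Whether that fits comfortably depends on how much unified memory the device has and how much of it the system is willing to pin for a model, which is the part this sketch doesn't answer.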

Better third-party-model integration depth. ChatGPT in writing-tools is a starting point. Cross-system Claude or Gemini integration with personal context would be a meaningful capability story for users who don't want to commit to ChatGPT.

A serious consumer pitch for principled personal AI. The foundation is there; the consumer-friendly framing isn't. Apple is the most plausible vendor to build the bridge from "casual hosted user" to "principled personal user." Whether they do it in 2025 or wait another year is the strategic question of the category.

Three months in, the WWDC 2025 follow-through is doing what every WWDC Apple Intelligence follow-through has done: shipping the foundation, slipping the headline. The question is whether the next ship cycle starts to close that gap or whether the pattern locks in for another year. The window is closing; the consumer competition is moving faster than Apple is. Worth watching closely.