AI in the news: week of November 9, 2025
OpenAI-AWS $38B makes multi-cloud official. Moonshot's Kimi K2 Thinking is the first open-source model to beat GPT-5 on agentic benchmarks. Apple reportedly pays Google $1B/yr for Siri. Ironwood TPUs ship. The concentration story cracked.
What this week actually changed: the concentration story cracked at both ends of the stack. OpenAI signed a $38B deal with AWS, the second hyperscaler relationship after Microsoft, and the formal end of the cloud-exclusivity era. And Moonshot dropped Kimi K2 Thinking as an open-weights MIT-license release that takes the top of several agentic and reasoning benchmarks ahead of GPT-5 and Claude Sonnet 4.5. In between: Apple reportedly nearing $1B/year to put Gemini inside Siri, Google's Ironwood TPU going GA with Anthropic on board for a million units, OpenAI hitting one million business customers, the Sora backlash starting to organize, and Asia-Pacific equities selling off on AI-bubble fears.
OpenAI signs $38B with AWS
Monday, November 3. OpenAI announced a seven-year, $38B compute deal with Amazon Web Services. It is the first contract OpenAI has ever signed with AWS, and the formal completion of the unwinding from Microsoft-exclusive infrastructure that started in January, when Microsoft gave up its cloud-exclusivity clause. Per CNBC, OpenAI starts running workloads on AWS immediately, with hundreds of thousands of Nvidia GPUs in the mix and capacity expanding through the term. Amazon stock closed at a record high on the news.
The frame this story usually gets ("OpenAI diversifies, AWS wins") is correct as far as it goes, but it's not the structurally interesting part. The structurally interesting part is that OpenAI's compute commitments now total something like a trillion dollars across AWS, Microsoft, Oracle, CoreWeave, Google, and Nvidia. The company is not picking one cloud. The company is becoming a counterparty to all of them at a scale that makes the relationship symmetric. That is a different shape of business than "AI startup that runs on Microsoft" was three years ago.
I'm not sure how to feel about it. The optimistic read is that no single hyperscaler can hold OpenAI captive, which is structurally healthier than the 2023 arrangement. The pessimistic read is that the trillion-dollar capex commitment locks OpenAI into a monetization curve so steep that the company has no choice but to push aggressively on every consumer and enterprise surface, including the surfaces (Sora-style social, biometric capture, agent-mediated browsing) where the data-flow concerns get sharpest. You can read this deal as OpenAI escaping vendor lock-in or as OpenAI taking on so much infrastructure debt that the product roadmap stops being a choice. Both are true. What I'll watch: whether the AWS workloads include training, inference, or both (the press release is vague); whether Anthropic's existing AWS relationship gets quieter as a result; how this changes the hyperscaler-vs-frontier-lab balance.
The story I want everyone to read: Kimi K2 Thinking
Thursday, November 6. Moonshot AI released Kimi K2 Thinking, a one-trillion-parameter open-source reasoning model with native tool-use, capable of 200-300 sequential tool calls, available under a modified MIT license with weights on Hugging Face. Per VentureBeat's coverage, the model takes the top of several agentic and reasoning benchmarks ahead of GPT-5, Claude Sonnet 4.5 in thinking mode, and Grok 4. The benchmark caveats are real (every release week is benchmark-gaming season) but the qualitative pattern is consistent across multiple independent reviews.
This is the story I want every principled-AI practitioner to internalize. Open-weights frontier-class models are not a future hope anymore. They are a Thursday release. The gap between "what you can run on your own infrastructure" and "what the hosted frontier labs are selling" is closing faster than most enterprise architecture decks have caught up to. The mental model that Claude or GPT is the ceiling and self-hosted models are the cost-saving compromise is, as of November 6, 2025, factually wrong on agentic-reasoning benchmarks.
I'm going to keep banging this drum. The argument I made in the small-models tour was that the open ecosystem was eating the closed one from below, slowly. Kimi K2 Thinking is not eating from below; it's competing at the top. A trillion-parameter Chinese open-weights model that beats GPT-5 on agentic benchmarks under MIT license is the news of the year for anyone whose AI strategy has a "what if the API price doubles" risk in it. Run it locally. Run it on your own GPUs. Move it across jurisdictions at will. The model is yours.
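The "what if the API price doubles" risk can be made concrete with a back-of-envelope comparison. A minimal sketch, with illustrative placeholder numbers (the per-token prices and GPU rates below are assumptions for the example, not real quotes):

```python
# Back-of-envelope: hosted-API spend vs. self-hosted open-weights serving.
# All prices are illustrative placeholders, not quotes from any vendor.

def hosted_monthly_cost(tokens_per_month: float, price_per_mtok: float) -> float:
    """Hosted API cost: scales linearly with token volume."""
    return tokens_per_month / 1e6 * price_per_mtok

def self_hosted_monthly_cost(gpu_count: int, gpu_hourly: float,
                             hours_per_month: float = 730) -> float:
    """Self-hosted cost: GPU rental (or amortized capex) is roughly fixed
    per month, independent of token volume."""
    return gpu_count * gpu_hourly * hours_per_month

# Example: 2B tokens/month at a hypothetical $3 per 1M tokens hosted,
# vs. a hypothetical 8 GPUs at $2.50/hr self-hosted.
hosted = hosted_monthly_cost(2e9, 3.0)           # 6000.0
self_hosted = self_hosted_monthly_cost(8, 2.50)  # 14600.0

# The repricing risk the text is pointing at: if the API price doubles,
# the hosted bill doubles; the self-hosted bill does not move.
hosted_doubled = hosted_monthly_cost(2e9, 6.0)   # 12000.0
```

The point is not the specific numbers; it's the shape. Hosted cost scales with volume and reprices at the vendor's discretion, while open weights cap the downside at the cost of hardware you control.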
The geopolitics is the thing I want to flag carefully. K2 Thinking comes out of Moonshot, a Chinese lab. The instinctive American-tech-press read is to discount it on geopolitical grounds. The discount is a strategic mistake. The Chinese open-weights labs (Moonshot, DeepSeek, Alibaba's Qwen) are running an open-source-by-default playbook that the American frontier labs plainly will not match. If the answer to "should I depend on Chinese open weights" is "no, for governance reasons," the next question has to be "what's the plan for the American open-weights equivalent that doesn't exist." The hand-wringing without the plan is just hand-wringing.
Apple reportedly pays Google $1B a year for Siri's brain
Wednesday, November 5. Bloomberg broke the story that Apple is finalizing a deal to pay Google approximately $1 billion per year for a custom 1.2-trillion-parameter Gemini model to run the long-promised Siri overhaul. Per TechCrunch, the Gemini model would handle Siri's summarizer and planner functions while some on-device features stay with Apple's own foundation models. Neither company has confirmed the deal, and an official announcement isn't expected before early 2026.
This is the cleanest possible illustration of what the concentrated-frontier-model market does to companies that decided to do their own foundation model and then lost the race. Apple has been working on its own LLMs for three years. The Apple Intelligence rollout has been the most-criticized AI product release of 2024-25. And the company that owns the most consumer compute on the planet has, per the report, decided that paying Google a billion dollars a year is cheaper than catching up at the model layer.
Two reactions. The pragmatic one: this is correct. If you're Apple and you can't ship a competitive model in 2025, paying Google to ship one for you and keeping your differentiation at the device-and-privacy layer is the right call. The structural one: this is exactly the consolidation story I keep flagging. Frontier-model concentration is now eating Apple-scale players, not just startups. The number of independent foundation-model labs in the West is shrinking, not growing, and the labs left standing have pricing power over even the largest customers in tech.
The privacy story is its own thing. Apple's pitch has always been "your data doesn't leave the device." The new arrangement keeps "personal context" features running on Apple silicon and routes the heavier reasoning queries to the Gemini model, which per the reports would run on Apple's own Private Cloud Compute servers rather than on Google's infrastructure. The marketing will be careful. The structural fact is that Siri's brain will be, in part, a Google model. The privacy story will not survive scrutiny as cleanly as Apple's existing on-device pitch did, and the PII handoff question gets a new and very visible test case.
Google's Ironwood TPU goes GA
Thursday, November 6. Google announced Ironwood, its seventh-generation TPU, going to general availability "in the coming weeks." Per CNBC and The Register, Ironwood delivers 10x the peak performance of TPU v5p and a 4x per-chip improvement over Trillium, and scales to 9,216 chips per superpod with 9.6 Tb/s interconnects and 1.77 PB of shared HBM. Anthropic publicly committed to using up to a million Ironwood TPUs to run Claude.
Two things stand out. Google is the only hyperscaler with a credible Nvidia alternative in production at frontier-lab scale. The Ironwood-Anthropic commitment is the proof point that the alternative is real. And the marketing pitch ("the first TPU for the age of inference") is the strategically interesting frame. Google is betting that the next compute regime is dominated by inference workloads (agents calling models, models running tools, all of it scaling not by training but by serving) and that Ironwood's perf-per-watt at inference matters more than peak training FLOPS. The Anthropic-on-Ironwood commitment is also the read between the lines on the OpenAI-AWS deal. The pattern this week is "frontier labs are diversifying compute aggressively, and the alternative-silicon story is now part of every major architecture deck." Nvidia-monoculture at the silicon layer is the same shape of risk as hosted-frontier-model monoculture at the application layer, and the same distribution logic applies.
OpenAI hits 1M business customers, ships Teen Safety Blueprint
Thursday, November 6. OpenAI announced it crossed one million business customers, with seven million ChatGPT-for-Work seats (up 40% in two months) and Enterprise seats up 9x year-over-year. The same day OpenAI published its Teen Safety Blueprint, outlining age-prediction, parental controls, and design principles for under-18 users.
The business numbers are real and the rate of enterprise adoption is genuinely high. The Teen Safety Blueprint is the right document to publish and I'm not going to be cynical about it specifically. But the juxtaposition with the Sora situation (same company, same week) is hard to ignore. OpenAI is publishing teen-safety frameworks while shipping a video generator that, per Reality Defender's reporting, had its anti-impersonation safeguards bypassed within 24 hours of launch. Both things are true. The frameworks are real and the products are out ahead of the frameworks. The mature read is to take the artifacts seriously without confusing them for the binding constraints. The constraint is the governance work customers and regulators do, and right now that work is lagging the deployment.
The Sora backlash starts to organize
Tuesday, November 11 was when Public Citizen's letter formally landed, but the organizing happened through this week. The letter, which demands that OpenAI withdraw Sora 2, cites the deepfake bypass, the consent issues, and the political-disinformation risk going into the 2026 midterms. It was also CC'd to Congress.
I called this in week one. The Sora app and the Cameo feature were the most aggressive sensitive-data-into-public-AI move I'd seen, and the social-feed deployment was going to age badly. We're now at the part of the cycle where advocacy groups are organizing and Congressional letters are getting drafted, which is the slow-motion regulatory process that ends in either a real consent regime or a meaningful legal liability for the platforms. Worth tracking which one materializes first. The thing I want to add to my week-one read: the deepfake-bypass story is the one that should worry OpenAI most. The Cameo consent regime is, in principle, defensible: users opt in, they can revoke. The fact that adversarial deepfakes of people who never opted in were trivially produced within 24 hours is the failure mode that nukes the consent argument structurally. Once an attacker can produce a deepfake of anyone using the public model, the consent ceremony is theater.
The bubble noise: Asia-Pacific selloff
November 4-5. Asia-Pacific equity markets sold off hard on AI-valuation fears. South Korea's KOSPI dropped roughly 2.85% on November 5 (down more than 6% intraday at the lows). Japan's Nikkei fell 4-4.5%. Taipei chip stocks tumbled. Per the Japan Times and NBC News, the trigger was a combination of Palantir's selloff after a record-earnings report (forward P/E above 200), the cumulative weight of trillion-dollar capex commitments, and a widely circulated MIT study finding that 95% of firms adopting generative AI saw no significant revenue gains.
I'm not going to pretend to call the macro. What I'll note is the timing. The capex side of the AI economy is committing trillion-dollar numbers in the same six-month window where the productivity-and-revenue side is producing the MIT 95%-no-significant-revenue-gains study and the Amazon "we're cutting 14k corporate roles to invest in AI" headline. The market is starting to ask whether the capex will return on the timeline that justifies the spend. That question will shape 2026 in ways I don't think we've internalized yet. The version of this question that I care about (separately from the equities call) is what happens to the labor narrative when the bubble noise gets louder. Capex impatience and layoff impatience are the same impulse, and the selloff will make the cutting worse before it makes it more disciplined.
Smaller items
A few worth a line.
- Google DeepMind's mathematics collaboration with Terence Tao. Announced November 7: AlphaEvolve, AlphaProof, and Gemini Deep Think collaborating with one of the most influential living mathematicians on novel research problems. Long-tail consequence to watch.
- Atlas browser updates. OpenAI's ChatGPT Atlas shipped its first significant post-launch update: vertical tabs, iCloud passkeys, and an Insert button in the Ask sidebar. Worth noting because the agentic-browser surface is the new privacy battleground and the feature cadence is fast.
- OpenAI prepping GPT-5.1. The trade press spent the week setting expectations for a GPT-5.1 release synchronized with Google's anticipated Gemini 3 Pro launch. Both landed the following week; I'll cover the actuals next Sunday.
What this week tells me
Three things. The compute economy went multi-cloud and multi-silicon at the same time. OpenAI's $38B AWS deal, Google's Ironwood at GA with a million-unit Anthropic commitment, and the cumulative trillion-dollar OpenAI compute book are the same story told from three angles. The age of one frontier-lab on one cloud on one silicon vendor is over. The pattern is healthier than the alternative was. It also makes the stakes of any one infrastructure failure substantially higher.
Open-weights frontier models are real now. Kimi K2 Thinking is the news of the week and the news of the year for anyone whose AI architecture decisions assumed the closed labs were a permanent ceiling. The principled-practitioner answer to "should we build on hosted frontier models" should now include, by default, a benchmark of "what does this look like if we run K2 Thinking instead." The answer might still be "hosted" (that's fine) but it should be a comparison, not an assumption.
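That comparison can be mechanized. A minimal sketch of an A/B eval harness: the model backends are passed in as callables, so the same harness works whether a callable wraps a hosted frontier API or a self-hosted open-weights server (vLLM, for instance, exposes an OpenAI-compatible endpoint that open weights like K2 Thinking can sit behind). The stub "models" and the toy cases below are illustrative placeholders; real use would wrap API clients.

```python
# Minimal A/B eval harness: run the same prompt set against multiple model
# backends and compare pass rates, so "hosted vs. self-hosted" becomes a
# measured comparison rather than an assumption.
from typing import Callable

# A case is (prompt, checker): the checker decides if the output passes.
Case = tuple[str, Callable[[str], bool]]

def run_eval(call_model: Callable[[str], str], cases: list[Case]) -> float:
    """Return the pass rate of one model callable over the cases."""
    passed = sum(1 for prompt, check in cases if check(call_model(prompt)))
    return passed / len(cases)

def compare(candidates: dict[str, Callable[[str], str]],
            cases: list[Case]) -> dict[str, float]:
    """Score every candidate on the identical case set."""
    return {name: run_eval(fn, cases) for name, fn in candidates.items()}

# Toy demonstration with stub "models" (placeholders for real API clients).
cases: list[Case] = [
    ("2+2=", lambda out: "4" in out),
    ("capital of France?", lambda out: "Paris" in out),
]
stub_hosted = lambda p: {"2+2=": "4", "capital of France?": "Paris"}[p]
stub_local  = lambda p: {"2+2=": "4", "capital of France?": "Lyon"}[p]

print(compare({"hosted": stub_hosted, "k2-local": stub_local}, cases))
# → {'hosted': 1.0, 'k2-local': 0.5}
```

The design point: because both sides are scored on the same cases with the same checkers, the "hosted vs. K2 Thinking" decision comes out as a number per candidate instead of a default.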
And the bubble talk is going to shape governance. Asia-Pacific markets sold off this week and the MIT no-revenue-gains study is everywhere. The political appetite for AI deregulation that defined the first half of 2025 is going to get harder to sustain when the equity markets start pricing in the gap between capex and productivity. SB 53 lands January 1. The EU AI Act's Digital Omnibus is being drafted. The window where "let the labs ship and we'll regulate later" was the consensus is closing. Next Sunday: GPT-5.1 actuals, Gemini 3 reception, the K2 Thinking second-week reviews, and whatever lands midweek.
Sources
- AWS and OpenAI announce multi-year strategic partnership. OpenAI
- Amazon closes at record after $38B OpenAI deal with AWS. CNBC
- OpenAI and Amazon ink $38B cloud computing deal. TechCrunch
- Kimi K2: Open Agentic Intelligence. Moonshot AI
- Moonshot's Kimi K2 Thinking emerges as leading open source AI. VentureBeat
- Apple nears $1B-a-year deal to use Google AI for Siri. Bloomberg
- Apple nears deal to pay Google $1B annually. TechCrunch
- Ironwood: First Google TPU for the age of inference. Google
- Google unveils Ironwood, seventh-generation TPU. CNBC
- TPU v7, Google's answer to Nvidia's Blackwell. The Register
- 1 million business customers. OpenAI
- ChatGPT Atlas. Release Notes
- Public Citizen letter calls on OpenAI to withdraw Sora 2. Public Citizen
- Advocacy group calls on OpenAI to address Sora 2's deepfake risks. CyberScoop
- Asia tech stocks tumble on AI bubble fears. Japan Times
- International stocks slide on AI fears. NBC News
- Amazon laying off about 14,000 corporate workers. CNBC