AI in the news: week of November 16, 2025
OpenAI surprise-launches GPT-5.1 with a tone-picker. Anthropic reportedly commits $200B to Google Cloud. Moonshot's Kimi K2 Thinking, a trillion-parameter open-weight model, beats GPT-5 on key agentic benchmarks. Microsoft Ignite warms up. My take on the week.
What this week actually changed: three of the four big stories pull on the same axis at the same time, distributed-versus-concentrated. OpenAI surprise-shipped GPT-5.1 with a tone-picker, Anthropic reportedly committed $200B to Google for compute, and a trillion-parameter open-weights model out of China kept showing up at the top of agentic benchmarks. Microsoft Ignite teed up for the week after. A real news week, dressed up as a calm one.
GPT-5.1 ships, with a personality slider
Wednesday November 12. OpenAI released GPT-5.1 (Instant and Thinking variants) as a surprise drop with no DevDay-style stage event. The headline framing is "warmer, more conversational, smarter." The under-the-hood framing is adaptive reasoning: GPT-5.1 Instant decides on its own when to "think" before answering, and GPT-5.1 Thinking adapts its thinking time to the complexity of the request rather than running at a fixed budget. MacRumors covered the launch the same day; OpenAI also dropped GPT-5.1-Codex-Mini for the developer track.
The thing everyone is going to write about is the personality slider. GPT-5.1 ships with eight selectable tones: Default, Friendly, Efficient, Professional, Candid, and a few more. The pitch is that you pick the voice you want to talk to. The framing in the launch posts emphasizes warmth and playfulness as defaults.
I have mixed feelings. The narrow-technical part of the release (adaptive reasoning, the codex-mini tier, the instruction-following improvements) is the right shape for a half-step refinement on top of GPT-5. That's the substance. The personality slider is product, not science, and it's the part I'm least sure is good for users. Here's the concern. "Warmer and more conversational" is also "stickier and more parasocial." The product surface that gets people to spend more time talking to a model is not always the surface that's actually serving them. The push toward emotional rapport with AI models is, in 2025, mostly an engagement-metric play. The same companies that learned to A/B-test social-feed dopamine hits in 2014 are now A/B-testing LLM warmth, and the underlying incentive shape is identical. I'd rather have an "Efficient" tone be the default. That it isn't tells you which way the product team is being pulled.
The Codex-Mini side of the release is more straightforwardly good. Cheaper agentic-coding tier, lower cost-per-token for the high-volume workflows where GPT-5 was overkill. That's a genuine "more capability, lower cost" move and I'll use it.
Anthropic's $200B Google compute deal
Late this week, The Information reported that Anthropic has committed to spending roughly $200 billion on Google Cloud and Google's TPU chips over a multi-year horizon. Engadget's coverage is the cleanest summary. Anthropic confirmed an expanded Google and Broadcom partnership including up to 1 million TPUs and well over a gigawatt of capacity coming online in 2026.
Numbers this size become abstractions. Let me try to make it concrete. $200 billion is roughly 6× Anthropic's most recent valuation round. It's the same order of magnitude as OpenAI's reported AWS commitment from earlier in November ($38B over seven years). It is, plainly, a bet that the only path forward for a frontier lab is unbounded compute access bought in advance from a hyperscaler.
Two takes worth holding at the same time. The obvious one: Anthropic needs the TPUs. Training Sonnet 4.5 / 5.x and the Opus refresh rumored for late November requires a compute envelope that is not casually available on the market. Locking in TPU supply and Google Cloud capacity ahead of competitors is a defensible business move, and arguably a necessary one if Anthropic wants to keep shipping at its current cadence. The less obvious one: this tightens the hyperscaler-to-frontier-lab knot in a way that should worry anyone who cares about distributed AI. The frontier labs are now structurally locked to specific cloud providers' chip stacks for the next several years. Anthropic on Google TPUs. OpenAI on AWS GPUs (and Azure, and now CoreWeave). The "neutral compute marketplace" narrative (that you could in principle swap providers, that competition would discipline pricing) is dying in real time. We're heading into a multi-year period where the top three frontier labs are each fused to a different hyperscaler's silicon roadmap.
The downstream effect is on you, the user. When the agent SDK or platform API you've built against is locked to a model that's locked to a chip stack that's locked to a cloud, the vendor-lock-in shape is much worse than in the 2010 cloud era. Your migration path is not "rewrite for a different cloud's APIs." It's "find another frontier model with equivalent capabilities, port your prompts, your eval suite, your agent harness, your fine-tunes, and your retrieval-augmented generation (RAG) indexes, and accept whatever capability gap exists for the duration of the migration." That's a much harder unwind. The principled-user response is the same as it was a month ago: build the agent stack so the model is a swappable component. Use the local-model path where you can. Treat the frontier-lab API as a tool, not a foundation.
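For concreteness, here's a minimal sketch of the "model as swappable component" idea in Python. All the names here (`ChatModel`, `HostedModel`, `LocalModel`, `run_agent_step`, the injected `client`) are hypothetical, not any real vendor SDK; the point is only that the orchestration layer depends on an interface, so switching providers means writing one adapter rather than rewriting the harness.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class ChatModel(Protocol):
    """Minimal interface the rest of the agent stack codes against."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class HostedModel:
    """Adapter for a hosted frontier API (client and call shape hypothetical)."""

    client: object  # whatever vendor SDK you use, injected at the edge
    model_id: str

    def complete(self, prompt: str) -> str:
        # Swap in the real vendor call here; nothing else needs to change.
        return self.client.generate(self.model_id, prompt)


@dataclass
class LocalModel:
    """Adapter for a self-hosted open-weight model (stubbed for illustration)."""

    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


def run_agent_step(model: ChatModel, task: str) -> str:
    """The harness sees only the interface, never a vendor SDK."""
    return model.complete(f"Plan the next step for: {task}")
```

Swapping providers then means constructing a different adapter at the composition root, while your prompts, evals, and harness stay put.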
Kimi K2 Thinking, open-source, trillion parameters, competitive
The week's most under-covered story, in my view. On November 6-7, Moonshot AI released Kimi K2 Thinking, a one-trillion-parameter Mixture-of-Experts model with 32B active parameters, native INT4 quantization, 256K context, and explicit support for 200-300 sequential tool calls. Open weights on Hugging Face under a Modified MIT license. VentureBeat's coverage ran on Nov 7, and discussion in the developer community kept building through the following week, which is why I'm picking it up here.
The benchmarks are the part to slow down on. K2 Thinking scores 60.2% on BrowseComp. GPT-5: 54.9%. Claude Sonnet 4.5: 24.1%. On GPQA Diamond it edges GPT-5 (85.7% vs 84.5%). On HLE-with-tools the open model is competitive with the closed frontier. The training-cost number making the rounds is roughly $4.6M, which if remotely accurate is two-to-three orders of magnitude below what GPT-5 cost to train. SiliconAngle's coverage has the technical detail.
This is the story I want every CTO to be paying attention to. A Chinese lab released an open-weight model that beats the closed US frontier on the agentic-tool-use benchmark that actually matters for production agent workflows, at a fraction of the training cost, under a license that lets you self-host it. The distributed-AI thesis just got a major new piece of evidence in its favor.
The caveats are real and I'll name them. Native INT4 is impressive, but the inference economics still require serious hardware; this isn't a laptop model. The agentic-tool benchmarks are still early, and the 200-300 sequential-tool-call claim deserves independent verification. The geopolitical question of running a Chinese-trained model on sensitive workloads is one every enterprise has to answer for itself, and it doesn't have a clean answer. But the structural point stands: the open-weight frontier is no longer "interesting research models that lag the closed labs by 12-18 months." It's "models that match or beat the closed frontier on specific axes, today, that you can run on your own infrastructure." This changes the small-model conversation, but more importantly it changes the on-prem AI conversation. The assumption that you can't get frontier capability without sending data to a hosted API is breaking down faster than I expected even three months ago.
Microsoft Ignite warming up
Ignite kicks off Tuesday November 18 in San Francisco. The pre-event coverage this week (the week-of leaks and previews started Monday; Microsoft's own lifecycle-of-AI preview post is slated for the morning of the keynote) suggests the headline themes will be Microsoft Foundry as a unified agent platform, Agent 365 as a control plane for governing agents across the enterprise, and a notable lean into Anthropic. Microsoft is reportedly bringing Claude into more first-party surfaces, breaking the OpenAI-exclusive framing the company has held to since 2023.
I'll have the actuals in next week's roundup. The frame I'm watching going in: the "Agent 365" play is the most interesting one, and the one I want to get right. The pitch is that enterprises will deploy lots of agents, from many vendors, and need a single control plane to observe, govern, and secure them. That's structurally correct as a problem statement. Whether Microsoft's specific implementation is the right answer, versus, say, an open standards-based approach where governance lives in the orchestration layer rather than a vendor's control plane, is the question I'll evaluate. The pattern to watch for is whether Agent 365 ends up being the place where governance becomes possible, or the place where governance becomes Microsoft-only. Those are very different outcomes. The governance-without-engineer-attrition argument I keep making requires that the governance layer be vendor-neutral. If Agent 365 is good (and I expect it will be on the technical merits), the question is whether it can be adopted without buying the rest of the Microsoft stack.
The labor-data update
Bloomberg and the Challenger layoffs report put concrete numbers on the AI-cited layoff narrative this week. November alone saw over 71,000 announced job cuts, with AI explicitly cited for over 6,000 of them. The 2025 running total of AI-cited cuts is past 55,000, roughly 12× the 2023 figure. The firms anchoring the pattern in this week's data: IBM (2,000-3,000 cuts in November, AI agents replacing back-office roles), McKinsey (200 internal tech roles, AI automation cited), CrowdStrike (1% of global headcount), and UPS (an enormous 48,000 cuts under the "Network of the Future" initiative, framed around AI and automation).
I want to be honest about where I am on this. The displacement is real and it's accelerating faster than I expected. I've spent my career on automation (IT systems, infrastructure, ops) and I've never had a problem with the work that should have been automated long ago getting automated. The IBM back-office cuts are mostly that shape. McKinsey automating internal non-client work is mostly that shape. I'm not going to pretend those roles weren't candidates for this exact transition.
The thing I keep coming back to is the pace. Short-term incentives are driving the rush. Companies aren't cutting at this speed because the AI is ready to absorb the work cleanly; they're cutting because the AI narrative is convenient and the markets reward the cuts. The pace is what's wrong, not the underlying reality that the workforce is changing. The sustainable shape is human+AI collaboration, and the companies that figure that out will outperform the ones optimizing purely for headcount reduction. But I want to be clear: headcount will still shrink in the collaboration model. It just shrinks less, and shrinks well. I'd rather be wrong about how fast this is moving than caught off guard by it. Plan for the realistic view; hope for slower.
Smaller items
A few worth a line.
- xAI's Grok 4.1 silent rollout. xAI has been A/B-testing Grok 4.1 against 4.0 on live traffic since November 1, with the formal release coming Monday November 17. The model-card-without-an-announcement approach is a pattern more labs should adopt, honestly. Less spectacle, more measurement.
- EU Digital Omnibus on AI. The Commission's proposed amendments to the AI Act were trailed this week ahead of the Nov 19 formal proposal. Headline change: the high-risk-AI compliance deadline is pushed to December 2027. This is regulatory softening dressed as administrative simplification, and the "we're delaying the rules to give industry time" framing is the same framing every safety-rule rollback uses.
- China's generative-AI national standards. They took effect November 1, three weeks before the EU softening proposal landed. The contrast is striking: China is moving faster on operational AI standards than the EU is on rules the EU passed last year.
- The Anthropic Claude Opus refresh. Rumored for the week after this one (and confirmed Nov 24 in retrospect). Pre-launch chatter was loud this week. I'll cover the actuals next Sunday.
What this week tells me
Three things. The hyperscaler-frontier-lab fusion is now the structural reality. Anthropic-Google $200B is the largest single concrete instance of a pattern that's been building all year. The frontier labs are no longer "AI companies that buy compute from clouds." They're now strategically inseparable from specific hyperscalers' chip stacks. The competition story is no longer between models on quality, it's between vertically-integrated stacks on compute access. That's a different industry than the one we had in early 2024.
The open-weight frontier is real and the geography is shifting. Kimi K2 Thinking is the proof. The next twelve months will be defined, more than by any other axis, by the question of whether open-weight models can hold pace with the closed frontier. If the answer is yes (and the evidence this week says it might be), the entire AI-platform economic story has to be rewritten. The bet that "you must use a hosted frontier model" is increasingly a bet on a specific business model, not on a technical necessity.
And OpenAI is becoming a consumer-product company. GPT-5.1's tone-picker is the tell. The center of gravity at OpenAI is shifting from "frontier model lab" to "consumer AI product company that also has APIs." Both can be true. The implication for developers is that the alignment between OpenAI's product roadmap and developer needs is going to drift further over time, which is a reason to not build your stack around the assumption that the API will keep prioritizing you.
The synthesis: the principled-AI-practitioner stance for late 2025 is "treat the frontier model as a swappable component, build your data and orchestration layer to be local-first or at minimum portable, and watch the open-weight frontier closely because the lock-in math could change fast." That's where I am. That's what this week reinforced. Next Sunday: Microsoft Ignite actuals, Claude Opus refresh (if Anthropic ships when rumored), Gemini 3 (rumored mid-to-late November), and whatever else lands midweek.
Sources
- GPT-5.1: A smarter, more conversational ChatGPT. OpenAI
- OpenAI Launches Smarter, More Conversational ChatGPT 5.1. MacRumors
- GPT-5.1. Wikipedia
- Anthropic Commits to Spending $200B on Google's Cloud and Chips. The Information
- Anthropic reportedly agrees to pay Google $200B. Engadget
- Anthropic expands partnership with Google and Broadcom. Anthropic
- Kimi K2 Thinking. Moonshot AI
- Moonshot's Kimi K2 Thinking. VentureBeat
- Moonshot launches open-source Kimi K2 Thinking. SiliconAngle
- Microsoft Ignite 2025: From idea to deployment. Microsoft
- AI Cited in More Job Cut News. Bloomberg
- Layoff announcements top 1.1 million this year. CNBC
- Grok 4.1 Model Card. xAI
- Digital Omnibus on AI Regulation Proposal. European Commission
- AI Watch: Global regulatory tracker (China). White & Case