AI in the news: week of October 19, 2025
Anthropic ships Haiku 4.5, then drops Skills 24 hours later, two releases pulling in the same distributed-AI direction. OpenAI and Broadcom go 10 GW custom silicon. NotebookLM gets real chat. The AI-layoff drumbeat keeps building. My take.
What this week actually changed: Anthropic spent 24 hours making the case for distributed AI. Haiku 4.5 on Wednesday, Skills on Thursday, while OpenAI signed another 10 GW of compute on the opposite end of the same week. The cumulative shape: small-model performance is now genuinely good enough to push real work off the frontier tier, and the platform-primitive race is heating up faster than the model-quality race.
Haiku 4.5: small got real
October 15. Anthropic released Claude Haiku 4.5, the small/fast tier in the Claude family, priced at $1 per million input tokens and $5 per million output, with up to 90% prompt-caching discount and 50% batch discount on top. The pitch is "Sonnet 4 coding performance at a third the cost and twice the speed." Available on Claude.ai, the API, Bedrock, Vertex, and Microsoft Foundry from launch day, and in public preview on GitHub Copilot the same day.
Two things make this more interesting than a typical small-model bump. First, the system card shows Haiku 4.5 with a statistically significant reduction in misaligned-behavior rates relative to both Sonnet 4.5 and Opus 4.1 in Anthropic's automated alignment evals. It shipped under ASL-2 rather than ASL-3, which reads, on the safety side, as "small enough not to need the higher containment, but capable enough to actually do real work." That's the right shape for a small-model release: the capability ceiling that gates the higher-risk classification hasn't moved down to the Haiku tier yet, and the model is genuinely useful at the lower tier.
Second, the orchestration framing in Anthropic's launch post is plain: Sonnet 4.5 plans, multiple Haiku 4.5s execute subtasks in parallel. That's the agent architecture they're pitching, a planner-and-workers split where the expensive model does the strategic work and a fleet of cheap models does the throughput. It's been the right architecture for two years. What's new is that the worker tier is now capable enough to do non-trivial coding and tool-use without escalating constantly.
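The planner-and-workers split is easy to sketch. A minimal shape in Python, with stub functions standing in for the actual model calls; the stubs, names, and subtask format are mine for illustration, not Anthropic's API:

```python
# Sketch of the planner-and-workers split: one expensive "planner" call
# decomposes the task, a pool of cheap "worker" calls executes the
# subtasks in parallel. The two call_* functions are stand-ins for real
# API calls (e.g. Sonnet 4.5 planning, Haiku 4.5 executing); the routing
# shape, not the model wiring, is what this illustrates.
from concurrent.futures import ThreadPoolExecutor

def call_planner(task: str) -> list[str]:
    # Stand-in for a frontier-model call. A real planner would return
    # model-generated subtasks, not a fixed template.
    return [f"{task}: subtask {i}" for i in range(4)]

def call_worker(subtask: str) -> str:
    # Stand-in for a cheap small-model call executing one subtask.
    return f"done({subtask})"

def run(task: str) -> list[str]:
    subtasks = call_planner(task)                     # one expensive planning call
    with ThreadPoolExecutor(max_workers=8) as pool:
        return list(pool.map(call_worker, subtasks))  # N cheap calls in parallel

results = run("refactor module")
print(len(results))  # 4 worker results
```

The point of the shape: the planner is called once per task, the workers fan out, and escalation back to the planner is the exception path rather than the default.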
This is the release I'll be using. I've written before about how the small-model tier is where most production work should happen, and Haiku 4.5 is a clean step in that direction. At $1/$5 per million tokens you can run real workloads against it without the cost-discipline conversation dominating every architectural decision. Two years ago a model this capable cost ten times as much, and the workflows that were too expensive then are back on the table.
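To make the cost arithmetic concrete, here is a rough blended-cost sketch using only the numbers quoted above ($1/$5 per million tokens, up to 90% caching discount on cached input, 50% batch discount). Real billing has more moving parts (cache writes, tiering), so treat this as an estimate, not Anthropic's pricing model:

```python
# Rough blended per-request cost for Claude Haiku 4.5, from the launch
# pricing quoted above. The 90% discount is applied to cached input reads
# and the flat 50% batch discount to the total; cache-write surcharges
# and other billing details are deliberately ignored.

INPUT_PER_M = 1.00     # USD per million input tokens
OUTPUT_PER_M = 5.00    # USD per million output tokens
CACHE_DISCOUNT = 0.90  # cached input reads cost 10% of base
BATCH_DISCOUNT = 0.50  # batch API halves the total

def blended_cost(input_tokens, output_tokens, cached_fraction=0.0, batch=False):
    """Estimated USD cost for one request."""
    fresh = input_tokens * (1 - cached_fraction)
    cached = input_tokens * cached_fraction
    cost = (fresh * INPUT_PER_M
            + cached * INPUT_PER_M * (1 - CACHE_DISCOUNT)
            + output_tokens * OUTPUT_PER_M) / 1_000_000
    return cost * (1 - BATCH_DISCOUNT) if batch else cost

# A 10k-token prompt (80% of it cache hits) producing 1k output tokens:
print(f"${blended_cost(10_000, 1_000, cached_fraction=0.8):.4f}")  # $0.0078
```

Under a cent per request for a non-trivial prompt is the regime where per-call cost stops driving the architecture.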
The principled-AI angle is the distributed-versus-concentrated frame. A capable cheap model is a distributed-friendly model. Keep planning at the frontier tier and push execution to a swarm of small models, and the workload distribution maps cleanly onto a workload-sensitivity distribution: sensitive context stays in the smallest possible blast radius, and throughput happens against the cheapest possible foundation. Haiku 4.5 makes that pattern materially more practical than it was last week.
Skills, the platform-primitive move
October 16, the day after Haiku. Anthropic introduced Claude Skills, a way to package a folder of instructions, scripts, and reference material into a reusable, model-discoverable capability. Each Skill is a Markdown file with optional supporting code that Claude loads when the conversation context calls for it. Available across Claude.ai, the API, and Claude Code from launch.
Simon Willison's take, which landed the same day under the title "Claude Skills are awesome, maybe a bigger deal than MCP", is the right starting point. Skills are smaller and more focused than MCP (Model Context Protocol) servers; the model loads them on demand rather than holding them in context the whole time; and the discovery-and-determinism pattern means the model decides when to invoke a Skill while the Skill itself can run deterministic code to produce predictable output. It's a clean primitive.
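To make the shape concrete, here is a toy sketch of a Skill as a folder: a Markdown manifest with a small name/description header, an optional scripts directory, and a discovery pass that reads only the headers. The layout and field names here are illustrative assumptions, not Anthropic's spec:

```python
# Toy sketch of the Skill-as-folder idea described above: a directory
# holding a Markdown manifest (here SKILL.md, with a minimal header) plus
# room for supporting scripts. The exact layout and header fields are
# assumptions for illustration only.
from pathlib import Path
import tempfile

def write_skill(root: Path, name: str, description: str, body: str) -> Path:
    skill_dir = root / name
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(
        f"---\nname: {name}\ndescription: {description}\n---\n\n{body}\n"
    )
    (skill_dir / "scripts").mkdir()  # optional deterministic helpers live here
    return skill_dir

def discover(root: Path) -> list[tuple[str, str]]:
    """Return (name, description) per Skill, mimicking the cheap discovery
    pass: only the header is read, never the full body."""
    found = []
    for manifest in sorted(root.glob("*/SKILL.md")):
        header = manifest.read_text().split("---")[1].strip()
        meta = dict(line.split(": ", 1) for line in header.splitlines())
        found.append((meta["name"], meta["description"]))
    return found

root = Path(tempfile.mkdtemp())
write_skill(root, "weekly-report", "Formats status notes into a report",
            "Steps: gather notes, normalize headings, emit Markdown.")
print(discover(root))  # [('weekly-report', 'Formats status notes into a report')]
```

The property worth noticing is the one Willison highlights: discovery is cheap (a header scan), and the full instructions only enter context when the model actually invokes the Skill.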
The strategic frame matters. Skills, the Agent SDK from two weeks ago, and the Apps SDK that OpenAI shipped at DevDay are all moves in the same game: build the platform layer that the agent tooling runs on. The lab that owns the primitive owns the developer mindshare. Anthropic's bet is that Skills become a portable artifact, a Skill should run against any model that supports the protocol, not just Claude.
The honest take is that I'm cautiously into this. Skills sit at the right level of abstraction for the kinds of workflows I actually want to encode. A folder of Markdown plus a couple of scripts is a much more inspectable and version-controllable artifact than a fine-tune or a prompt template hidden inside a vendor product. The audit trail story is good. If I'm encoding a governance-relevant workflow, I'd rather have it in a Git-tracked Skill folder than in a hosted vendor's prompt-management UI.
The push-back is the same one I had on the Agent SDK. Skills work against hosted Claude today and the design assumes that. Making Skills run against a local model endpoint is a tooling question, not a fundamental impossibility. The open-standard direction Anthropic is signaling makes me cautiously optimistic that this will land in a state where the foundation is genuinely interchangeable. The question I'll watch is whether the Skill-runtime conventions get formalized in a model-agnostic way or stay coupled to Claude's specific tool-use protocol.
The 24-hour Haiku-then-Skills cadence is worth naming. That's a "the platform is the product" week. The labs that win the next two years are going to be the ones whose primitives the developer community adopts as the default way to build with AI, and Anthropic is making the strongest play on that surface right now.
OpenAI and Broadcom go custom silicon
October 13. OpenAI and Broadcom announced a strategic collaboration to deploy 10 gigawatts of OpenAI-designed AI accelerators, with Broadcom on the systems and networking side. OpenAI designs the chips and racks, Broadcom builds and helps deploy them. Ethernet is the fabric. Term sheet signed; rack deployments begin in the second half of 2026 and are targeted to complete by the end of 2029.
This is the third major OpenAI compute announcement in a month. Nvidia 10 GW in late September, AMD 6 GW with the warrant structure in early October, now Broadcom 10 GW custom silicon. The cumulative compute bet is starting to look more like an industrial program than a vendor-procurement strategy. The stack is being diversified across silicon vendors and across architectures, and OpenAI is building chip-design IP in-house as part of the deal.
The reading I'd push: the marginal cost of frontier compute is the bottleneck, and the labs that solve it most cheaply will set the price floor for the rest of the market. The OpenAI play is to vertically integrate enough of the silicon stack to extract the margin Nvidia has been collecting. Whether OpenAI specifically can pull this off is a separate question: chip programs are hard, the timelines are long, and the failure mode is "you spent two years and didn't ship anything competitive." But the direction matters even if this specific bet underperforms.
What this changes for the principled-AI question: not much directly. The compute-supply story is upstream of the workload-architecture conversation. Cheaper inference at the frontier is good for everyone who isn't a frontier model vendor. It puts more downward pressure on the cost of the small/cheap tier, which is where I want most of the work to happen anyway. The indirect effect is the one to watch: as the hyperscaler-and-lab capex commitments compound, the political weight of the AI build-out becomes harder to push back against, and that has governance consequences.
NotebookLM gets a real chat upgrade
Google pushed a substantial NotebookLM update in mid-October, centered on a chat overhaul: saved chat history, custom goals and personas per notebook, an 8x larger context window for chat, 6x longer conversation memory, and a stated 50% boost in response quality. The 1M-token context window is the headline number people are quoting.
NotebookLM is the Google AI product I actually use. It's the right shape: your sources go in, the model works against them, and the affordance is research rather than chat. The upgrades this month are real quality-of-life wins. Saved chat history was a missing piece, and per-notebook personas mean I can configure one notebook for technical-writing critique and another for source-summarization without redoing the prompting every time.
The governance angle: it's still a hosted Google product, the sources you upload still go to Google, and the data-handling story is whatever Google says it is on any given Tuesday. For my own use, NotebookLM gets the public-facing research material and not the client-confidential or PII-bearing material. That split is the right one to maintain even as the product gets more capable. The temptation when a hosted tool gets good is to stop being disciplined about what goes into it. The discipline is the work.
The labor story keeps building
The October layoff numbers are starting to come in and they're heavy. Amazon announced 14,000 corporate job cuts (its largest round ever) explicitly framed around freeing capital for AI investment. Chegg cut 45% of its workforce citing the "new realities of AI" and the Google traffic decline. Google Cloud made smaller cuts to its design and UX teams in service of redirecting headcount to AI engineering. By the IEEE's tally, there have been 184,000 global tech layoffs through October 2025, with about 27% directly tied to AI replacing workers.
I want to keep my read on this calibrated. The displacement is real and it's accelerating faster than I expected. I've spent my career on systems automation, and I'm fine with AI doing the IT and operations work that should have been automated long ago; that's been my entire trajectory. The Chegg case has a somewhat different shape: Chegg's revenue collapsed because Google's AI Overviews ate the top-of-funnel traffic the product depended on, which is "displaced by Google's AI" rather than "we replaced our staff with our own AI." But it still nets out to people losing their jobs because of AI. The framing matters less than the outcome, and the outcome is real.
What I keep coming back to is the pace, and the driver behind it. AI is in a rush phase, driven by short-term market incentives, and those incentives shape a lot of this. Companies aren't cutting at this speed because the AI workflows are mature; they're cutting because the AI narrative is convenient and the markets reward the cuts. Amazon's 14,000 isn't being announced because AI agents have been load-tested against 14,000 jobs' worth of work. It's being announced because freeing capital for AI investment is the story analysts want to hear right now.
The sustainable shape is human+AI collaboration. Companies that figure out the collaboration outperform companies that just cut. To be clear: the headcount still shrinks under collaboration. It just shrinks slower, more deliberately, and the gap-filling pain, which the cut-fast firms are about to discover when the surviving headcount inherits two-and-a-half jobs each, is smaller. The 2027 correction cost is lower under collaboration than under racing-to-cut. That's the reason to do it well rather than fast. I'd rather be wrong about how fast this moves than be caught off guard.
Quieter items
A few smaller pieces worth a line.
- Google's Gemini 2.5 Computer Use shipped to developers via the Gemini API earlier in October, built on Gemini 2.5 Pro and designed for agents that interact with UIs. The computer-use surface is becoming a standard offering across the labs.
- Atlassian's Service Collection continued rolling out following the early-October launch: JSM, CSM, Assets, and Rovo agents bundled, with Virtual Service Agent included in Premium and Enterprise. Atlassian also closed the acquisition of DX (announced September 18), the engineering-intelligence play. The "AI for measuring AI's effect on engineering teams" framing is going to become more prominent as the layoffs narrative collides with actual productivity data.
- California's algorithmic-discrimination employment regulations took effect October 1, with the four-year automated-decision-data retention requirement now in force. Quiet but consequential: if you're using AI in hiring or workplace decisions in California, the audit-trail requirement is real now.
What this week tells me
Three things. The small-model tier just got materially more capable. Haiku 4.5 at $1/$5 with Sonnet-4-class coding performance changes the cost arithmetic for the workloads that should always have been running on smaller models. Every "we run everything against the frontier model because we don't want to think about model selection" architecture got a little less defensible this week. The distributed pattern (small workers, large planners, sensitive context kept in the smallest blast radius) is where the work should be going.
Anthropic is winning the platform-primitive game right now. Skills, the Agent SDK, the orchestration framing in the Haiku launch: they're moving faster than OpenAI on the developer-foundation conversation. OpenAI's response is the Apps SDK and the AgentKit stack from DevDay; the next two months will tell us whether the Anthropic primitives or the OpenAI ones become the tooling default. I'm rooting for the version where the primitives are open enough to run against any model.
And the labor displacement is real and the pace is the problem. Amazon's 14,000 round is the largest single AI-framed cut yet, and Chegg's 45% shows the framing is now load-bearing for executives explaining bad outcomes. The displacement itself I expected. The speed I didn't. Short-term incentives are the driver. The firms that build human+AI collaboration instead of racing to cut will still shrink, just slower, better, and with fewer 2027 surprises. Next Sunday: the late-October Anthropic-Google TPU deal that's been telegraphed, more Skills uptake signal, whatever the layoff drumbeat sounds like in the run-up to month-end earnings.
Sources
- Introducing Claude Haiku 4.5. Anthropic
- Claude Haiku 4.5 system card. Anthropic
- Anthropic's Claude Haiku 4.5 in public preview for GitHub Copilot. GitHub
- Introducing Claude Skills. Anthropic
- Claude Skills are awesome, maybe a bigger deal than MCP. Simon Willison
- OpenAI and Broadcom announce strategic collaboration. OpenAI
- Chat in NotebookLM. Google Blog
- Google AI updates (October). Google Blog
- AI was behind over 50,000 layoffs in 2025. CNBC
- 184K global tech layoffs in 2025. IEEE ComSoc
- Atlassian acquires DX. TechCrunch