AI in the news: week of February 1, 2026

Post-Davos week. The China open-weights flywheel is the most consistent shipping story in AI right now. Kimi K2.5, Qwen wired into Taobao, Tencent dropping cash for Lunar New Year. Plus Yahoo Scout on Claude, three different bets on agentic commerce, and the AI-cited layoff playbook hardening.

What this week actually changed: the China open-weights flywheel is now the most consistent shipping story in AI, and the agentic-commerce stack is being built three different ways at once, only one of which locks your choice of assistant to your choice of platform.

The week after Davos. The CEO statements are off the wire and the labs are back to shipping. Quieter on the US frontier-lab side. Heavier on China, on the agentic-commerce build-out, and on the labor story hardening into a playbook.

The China open-weights flywheel keeps turning

On Tuesday, January 27, Beijing-based Moonshot AI released Kimi K2.5, the next iteration of its open-weights frontier model. It's a mixture-of-experts design: 1 trillion total parameters, 32 billion active per request, trained on 15T multimodal tokens with vision and language co-developed instead of bolted on. The headline feature is "Agent Swarm": coordinated execution across up to 100 specialized agents in a single workload. Moonshot claims benchmark wins over the leading US closed models on agentic coding and long-horizon planning. Weights are up on Hugging Face, and the model is already available via Bedrock.

A few things worth taking seriously. The benchmark claims need time to settle: first-week numbers from any lab tend to look better than month-two reality, and Chinese-lab benchmarks have historically shown more variance than US-lab ones. The agent-swarm framing is the more interesting move. Most of the agent work shipped in 2025 assumed a single primary model orchestrating tools; Kimi K2.5's pitch is that the model itself coordinates a fleet. That's a different bet, and it's one you only make from a position of "compute is cheap enough that we can spend it on coordination overhead."
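To make the contrast concrete, here's a toy sketch of the two patterns: one primary model looping over tool calls itself versus a coordinator fanning work out to many specialized workers. This is illustrative only; none of it reflects Kimi K2.5's actual internals, and the function names are my own.

```python
# Toy contrast between the 2025-style single-orchestrator pattern and a
# swarm-style pattern. Purely illustrative; not Moonshot's architecture.
from concurrent.futures import ThreadPoolExecutor


def single_orchestrator(task: str, tools: dict) -> list[str]:
    """One primary model makes every decision and calls tools sequentially."""
    return [tools[name](task) for name in tools]


def agent_swarm(task: str, workers: list, max_agents: int = 100) -> list[str]:
    """A coordinator splits the task across specialized agents in parallel;
    the bet is that the coordination overhead buys back wall-clock time."""
    with ThreadPoolExecutor(max_workers=max_agents) as pool:
        return list(pool.map(lambda w: w(task), workers[:max_agents]))


# Stand-in "agents": in practice each would be a model call with its own role.
workers = [lambda t, i=i: f"agent-{i} handled {t}" for i in range(4)]
print(agent_swarm("plan-trip", workers))
```

The structural point is in the second function: the coordinator spends extra work splitting and merging, which only pays off when per-agent compute is cheap relative to the parallelism gained.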

The bigger story is the cumulative one. One year ago this week, DeepSeek-R1 dropped and reset the cost curve. The follow-on hasn't been a single model to outdo it; it's been a year of Chinese labs (Alibaba, Moonshot, DeepSeek itself, Zhipu, MiniMax) shipping at a cadence the US frontier labs aren't matching on the open-weights side. The open-weights frontier hasn't caught the closed frontier. It's closed the gap to the point where, for most workloads that don't need the absolute frontier, the open-weights option is the obvious one. That's the part of the DeepSeek story I think most people still haven't internalized.

Tencent piled on the same week with a 1 billion yuan ($140M) cash giveaway distributed through its Yuanbao AI chatbot for Lunar New Year. The pattern mimics the red-envelope campaign that helped WeChat take over mobile payments a decade ago. Distribution-via-incentive is a real growth lever in the Chinese consumer-AI market. The Western labs aren't running anything comparable, partly because the regulatory exposure would be different, partly because the US consumer-AI distribution game runs through app stores and platform integrations instead.

Three different bets on agentic commerce

The same Tuesday as the Kimi release, an update to Alibaba's Qwen mobile app wired the chatbot directly into the company's commerce ecosystem, starting with food delivery via Ele.me and rolling out to Taobao and Fliggy. The pitch is "assign errands to your assistant" (phone calls, document processing, restaurant ordering, travel booking) with the actions executing inside Alibaba's own platforms.

Hold that next to the Mastercard / OpenAI / Microsoft Agent Pay announcements from the previous week, plus Google's Universal Commerce Protocol (UCP) announcement at the National Retail Federation event on Tuesday. Three different bets on the same shape. The Mastercard-led Western bet is that the AI assistant lives outside the commerce platform and the commerce platform exposes payment-and-checkout primitives the assistant can call. Google's bet is to set the protocol layer with UCP and let any assistant talk to any retailer through it. The Alibaba bet is that the assistant lives inside the commerce platform and the platform owns the whole flow end-to-end.

All three work for the platform that wins. Only the first two work for the user who wants to pick their assistant independently of where they shop. The Western model preserves my ability to use Claude or a local model to drive my Mastercard checkout. The Alibaba model couples the assistant choice to the Alibaba platform choice. I'm not predicting one model wins outright (they probably co-exist in their respective markets) but the structural difference is the kind of thing that's invisible until you try to swap one component out and find you can't. Whether UCP becomes a real standard or just Google's preferred way of doing it depends on whether the other big retailers and payment networks pick it up. Worth watching.
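The coupling argument can be sketched in a few lines of code. None of the names below correspond to the real Agent Pay or UCP APIs; this is a hypothetical illustration of the structural difference, nothing more.

```python
# Hypothetical sketch: decoupled (Mastercard/UCP-style) vs platform-internal
# (Alibaba-style) agentic commerce. All names are invented for illustration.
from dataclasses import dataclass


@dataclass
class CheckoutRequest:
    merchant: str
    item_id: str
    amount_cents: int


def platform_checkout(req: CheckoutRequest) -> str:
    """Stand-in for a checkout primitive a commerce platform might expose."""
    return f"charged {req.amount_cents} at {req.merchant} for {req.item_id}"


class ExternalAssistant:
    """Decoupled model: any assistant drives any platform that exposes
    a checkout primitive, so the assistant is swappable."""

    def __init__(self, name: str, checkout_fn):
        self.name = name
        self.checkout = checkout_fn  # injected: swap platforms freely

    def buy(self, merchant: str, item_id: str, amount_cents: int) -> str:
        return self.checkout(CheckoutRequest(merchant, item_id, amount_cents))


class PlatformInternalAssistant:
    """Coupled model: the assistant is a feature of one platform, so
    swapping the assistant means swapping the platform too."""

    PLATFORM = "single-platform"

    def buy(self, item_id: str, amount_cents: int) -> str:
        return platform_checkout(
            CheckoutRequest(self.PLATFORM, item_id, amount_cents)
        )


# The decoupled assistant works against any merchant's primitive:
assistant = ExternalAssistant("any-model", platform_checkout)
print(assistant.buy("some-retailer", "sku-123", 1999))
```

The invisible-until-you-swap problem is right there in the types: `ExternalAssistant` takes the checkout function as a parameter, while `PlatformInternalAssistant` hard-codes its platform and exposes no seam to replace it.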

Yahoo Scout, or what a mid-tier consumer brand looks like in 2026

On Friday, January 30, Yahoo launched Scout in US beta, an AI answer engine pitched at shopping comparisons, stock analysis, weather planning, and fact-checking. The interesting detail isn't the product; the answer-engine category is crowded now. It's the architecture: Scout uses Claude as its primary model and Bing's grounding API for citation-backed search. Yahoo brings hundreds of millions of user profiles and a large entity knowledge graph. Three vendors stitched into one product surface, and none of them is Yahoo's own.

This is what a mid-tier consumer brand looks like when it's building AI into its product without trying to compete on model R&D. License the best model. License the best search-grounding API. Bring your own data and your own brand. The build-vs-buy line keeps moving toward "buy the model, ship the product," and for most consumer surfaces that's the right call. Building a frontier model to power a search experience is a billion-dollar table-stakes investment that doesn't differentiate the product. The differentiation is in the data and the surface.

The Anthropic positioning here is worth flagging. Claude as the brain inside Yahoo's product is the same play Anthropic ran with Amazon Q and the Bedrock-distributed deployments: be the model behind other people's products rather than fighting the consumer-app battle directly. It's a quieter strategy than OpenAI's ChatGPT-first approach, and it's been working well enough that Anthropic closed a $30B funding round at a $380B valuation earlier this month. The "do more with less" pitch is doing real work in the market.

Apple, Gemini, and the question to ask when the announcement actually lands

Bloomberg reported Friday that Apple is preparing to unveil a significantly updated Siri using Google's Gemini as the backing model, with the new Siri expected to debut in an iOS 26.4 beta shortly after the February announcement and a broader release in March or early April. Short take, because the announcement hasn't actually happened yet, but if the reporting is right, this is the biggest platform move of the quarter.

Apple has spent two years trying to make on-device Apple Intelligence the brain behind Siri, has shipped genuine on-device capability (MLX has matured into a real training and inference stack), and is apparently still ceding the conversational layer to a partner. The on-device story matters and stays (that's where the privacy floor lives), but if the conversational brain is Gemini, the user experience is going to feel like Google's product wearing Apple's chrome.

The question I want answered when this lands: does the Gemini call happen on-device, in Apple's Private Cloud Compute, or in Google's cloud? If it's the third one, the privacy story Apple has been telling for two years has a hole in it that's going to get found and prodded immediately. The framing only works if the data routing matches the framing, and "Gemini behind Siri" doesn't tell you which architecture is in play. Worth waiting for the actual announcement before reading too much into it.

The AI-cited layoff playbook hardens

Pinterest's 15% cut and Amazon's expanded 16,000-role elimination both landed this week with explicit "AI-forward strategy" framing (CBS News, InformationWeek). Q1 tech layoffs are already at ~78,000, with trackers attributing about 48% to AI. The structural driver: markets reward an AI-narrative cut more than a plain cost cut, so the "AI-forward" label converts a cost reduction into a strategic-pivot story that gets a better multiple. Pinterest's language is going to show up in every layoff announcement memo for the rest of the quarter. I'd rather be wrong about how fast this moves than be caught flat-footed. The longer take is in the job-security piece.

A few smaller items worth flagging

  • Google Personal Intelligence in Gemini launched this week. The assistant connects across Gmail, Photos, Search history, and YouTube to answer with personal context. Permission-gated. The capability is impressive, and the privacy framing is the same one Google has used for years: "your data stays in your account, used only for your benefit, not for ad targeting." The framing is consistent. Whether you trust it at face value is a separate question.
  • DeepMind Project Genie, interactive AI world-building, went GA for US Google AI Ultra subscribers. The demos are striking. The product question is whether interactive world generation is a feature or a market, and the answer is probably "feature for now, market in 18 months."

What to watch next week

The pattern that keeps holding week to week: when the news is heavy on US-frontier-lab releases, the conversation goes to capability and benchmarks. When the news is heavy on Chinese open-weights and corporate-deployment stories, the conversation goes to deployment patterns and labor effects. The deployment-and-labor weeks are the ones that actually move the world. Capability wins are reversible. Layoffs are not.

The strategic question lurking under the China cadence isn't whether frontier-level open-weights capability exists in China (it clearly does); it's whether the US ends up importing that capability from China rather than building its own. That's the question the chip-export debate is downstream of, and the one that's hardest to talk about cleanly.

Next Sunday: the February model-release calendar is heavy. Mistral, possibly DeepSeek-V4, the rumored Anthropic mid-quarter release. Plus whatever Apple actually announces on Siri. And whatever the EU does with the Digital Omnibus.

Sources