Kir-News.
A daily scan of Chinese AI: open-weight labs, production agents, training economics, and the policy and chip stories behind them. Scored every morning, summarised in plain English, published here at 07:00 Stockholm.
-
DeepSeek released V4 today, and the technical report is unusually detailed. The headline: a 1.6-trillion-parameter model (with only 49 billion parameters active per query, thanks to a sparse "mixture-of-experts" design) that handles a 1-million-token context window while using a fraction of the memory of its predecessor. They swapped the standard AdamW optimizer for Muon, introduced a new residual scheme called mHC to stabilize very deep networks, and replaced reinforcement learning post-training with on-policy distillation from specialist teacher models. Coding benchmarks reportedly beat GPT-5.4 and Claude Sonnet 4.5. Huawei Cloud shipped same-day Ascend support, and PPIO is serving the API. For the Chinese open-weights ecosystem, this is the most consequential release of the quarter.
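To see why a sparse mixture-of-experts model can carry 1.6 trillion parameters while touching only ~3% of them per query, here is a minimal, illustrative top-k routing sketch. This is not DeepSeek's code; the expert count, dimensions, and `k` are made-up numbers chosen to show the mechanism.

```python
# Illustrative top-k mixture-of-experts routing (toy numbers, not V4's).
import numpy as np

rng = np.random.default_rng(0)

n_experts, k, d = 64, 2, 16                        # 64 experts, 2 active per token
experts = rng.standard_normal((n_experts, d, d))   # each expert: a d x d weight matrix
router = rng.standard_normal((d, n_experts))       # router scores every expert per token

def moe_forward(x):
    scores = x @ router                     # routing logits, one per expert
    top = np.argsort(scores)[-k:]           # keep only the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                # softmax over the chosen experts
    # Only k of the n_experts matrices are multiplied for this token;
    # the other experts' parameters sit idle in memory.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

x = rng.standard_normal(d)
y = moe_forward(x)
print(f"active experts per token: {k}/{n_experts} ({k / n_experts:.1%})")
```

Scaled up, the same idea is what lets total parameter count (capacity) grow far faster than per-query compute, which is the 1.6T-total / 49B-active split the report describes.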
Lead story: DeepSeek V4 lands: 1M context, mHC residuals, Muon
-
DeepSeek released V4 today as open weights, with explicit Huawei chip support and same-day adaptation for Hygon's DCU accelerators. Separately, Chinese frontier labs (Zhipu, MiniMax, Moonshot) have quietly dropped OpenAI as their benchmark target and are now chasing Anthropic's Claude on agentic coding, with real revenue to show for it: Zhipu's API business grew 60x year-over-year and raised prices 83% without losing volume. A new inference-only GPU unicorn, Xiwang, is pitching a chip aimed at cutting per-token serving costs by 90%. The throughline: the Chinese stack is consolidating around domestic silicon plus agent-shaped APIs, and the economics are starting to work.
Lead story: DeepSeek V4 lands, Hygon adapts Day-0, Huawei pairing confirmed
-
Chinese AI labs shipped a wave of frontier open models this month. Moonshot's Kimi K2 Thinking (a trillion-parameter model that reportedly cost only about $4.6M to train) can chain 300 tool calls autonomously, beating GPT-5 on some web-browsing benchmarks. GLM-5 from Zhipu and Qwen3.5 from Alibaba debuted architectures that make long-context inference dramatically cheaper — Qwen3.5 is priced at roughly 1/18th the rate of Google's Gemini 3. Underneath it all, ByteDance's Seedance 2.0 is quietly rewriting the economics of AI video, with small teams producing viral 90-second spots for under $700.
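"Chaining 300 tool calls autonomously" boils down to a loop: the model either emits a tool call (which is executed and fed back) or a final answer, up to a call budget. A hedged sketch of that pattern follows; the model, tools, and message format are stand-ins, not Moonshot's actual API.

```python
# Toy agentic tool-call loop (illustrative; not Kimi K2's real interface).
TOOLS = {
    "search": lambda q: f"results for {q!r}",
    "fetch":  lambda url: f"page body of {url}",
}

def fake_model(messages):
    """Stand-in for the LLM: issues two tool calls, then a final answer."""
    n_tool_msgs = sum(1 for m in messages if m["role"] == "tool")
    if n_tool_msgs == 0:
        return {"tool": "search", "args": "open-weight MoE models"}
    if n_tool_msgs == 1:
        return {"tool": "fetch", "args": "https://example.com"}
    return {"answer": "done"}

def run_agent(max_calls=300):
    messages = [{"role": "user", "content": "research task"}]
    for _ in range(max_calls):              # cap on the autonomous chain
        step = fake_model(messages)
        if "answer" in step:                # model decided it is finished
            return step["answer"], messages
        result = TOOLS[step["tool"]](step["args"])
        messages.append({"role": "tool", "content": result})
    return None, messages                   # hit the call budget

answer, transcript = run_agent()
print(answer)
```

The benchmark-relevant part is the budget: a model that stays coherent for hundreds of iterations of this loop can finish multi-step browsing tasks that shorter chains cannot.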
Lead story: Kimi K2 Thinking: 1T params, $4.6M, 300 tool calls