Kir-News.
A daily scan of Chinese AI: open-weight labs, production agents, training economics, and the policy and chip stories behind them. Scored every morning, summarised in plain English, published here at 07:00 Stockholm.
-
DeepSeek released V4-Pro today, a 1.6-trillion-parameter open-weights model with a one-million-token context window, priced at roughly one-seventh of Anthropic's Claude Opus. The model uses a new hybrid attention design that cuts compute to a quarter of the previous version's, and it has been tuned to run on Huawei's Ascend chips, which matters because US chip restrictions still cap what Chinese labs can buy from Nvidia. Meanwhile, Anthropic shipped Claude Opus 4.8 with a feature that orchestrates hundreds of parallel sub-agents via generated JavaScript, and Chinese labs Zhipu, MiniMax, and Moonshot are now openly benchmarking themselves against Claude rather than OpenAI. The throughline: Chinese frontier labs have stopped chasing GPT and started chasing agentic coding revenue, with pricing power to match.
Lead story: DeepSeek V4 ships: 1M context, 7x cheaper than Claude Opus
-
V4 ripple effects, one week in. The cost floor is sinking faster than the labs can absorb it.
• One-week telemetry: V4 cache savings hold at scale across Tencent QClaw, Alibaba ACS and ByteDance Volcano
• Cambricon Siyuan 690 leaks: a V4-tuned inference chip at ~70% of Hopper throughput, with wider availability than Huawei Ascend
• Anthropic responds: Claude Sonnet 4.7 ships with the reliability fixes from the bug saga and a 35% cache-token price cut
• Tsinghua / Shanghai AI Lab paper: verifier-swap RL hits 7× compute efficiency on R1-class results
Lead story: DeepSeek V4 forced Anthropic to cut prices for the first time.
-
DeepSeek V4 is dramatically cheaper to run, but it cost them.
• Inference uses 25% of the compute and 10% of the memory of V3
• A new "Engram" trick crushes long-context RAM from 80 GB to 8 GB
• Tuned for Huawei and Cambricon chips, not Nvidia
• The trade: Huawei migration delayed launch by months, multimodal got cut, senior staff walked to ByteDance, Tencent, Xiaomi
Lead story: DeepSeek V4 runs on a quarter of the compute. The Huawei pivot cost them senior staff.
-
Agent coding just got 83% cheaper. And Claude shipped two months of bugs.
• DeepSeek's V4 cache pricing cut wiped 83% off agent coding bills, since almost all tokens hit the cache
• Tencent's QClaw agent platform already wired V4 in alongside Hunyuan 3
• Anthropic confirmed three bugs that quietly degraded Claude Code for two months
• Combined: the cost floor for agent workloads in China keeps sinking, and Western incumbents are losing the reliability argument too
Lead story: Agent coding bills just dropped 83%. Claude shipped two months of bugs.
-
Product photography just hit an inflection point. The advantage isn't the model; it's the pipeline.
• Frontier engines stopped failing on hands, fabric, edges: Kling and Seedance for video; GPT Image 2.0, Nano Banana, Flux Kontext for stills
• None of them on its own delivers a coherent catalog. Composition is the moat.
• We build per-client pipelines in Weavy: best model per asset type, brand consistency baked in, a Design App the client's team runs themselves
• Receipts: Shein runs AI models in its catalog; ASOS cut production costs 23% in 2024; a four-person team made a $700 spot that hit 5B views
Lead story: Fashion shoots are getting 90% cheaper. Most of the tools come from China.
-
DeepSeek's V4 technical report is the most consequential release of the quarter.
• 1.6 trillion parameters total, with only 49B active per query (sparse MoE)
• 1M token context, on a fraction of the previous memory footprint
• Swapped AdamW for the Muon optimizer; new mHC residuals; on-policy distillation replaces RL post-training
• Coding benchmarks reportedly beat GPT-5.4 and Claude Sonnet 4.5
• Huawei Cloud shipped same-day Ascend support; PPIO is serving the API
Lead story: DeepSeek V4: 1.6 trillion params, 1M context, reportedly beats Claude on code
-
Chinese labs stopped chasing OpenAI. The economics are starting to work.
• DeepSeek V4 ships open weights, with same-day Hygon DCU support and explicit Huawei chip tuning
• Zhipu, MiniMax, Moonshot now benchmark against Claude. Zhipu grew API revenue 60× while raising prices 83%
• New inference-only GPU unicorn Xiwang pitches a chip aimed at cutting per-token serving costs 90%
• Throughline: domestic silicon plus agent-shaped APIs is the consolidation play
Lead story: Chinese labs stopped chasing OpenAI. Their new benchmark: Claude.
-
China shipped four frontier open models in a single month, and the economics are starting to break.
• Moonshot's Kimi K2 Thinking: 1T params, trained for $4.6M, chains 300 tool calls, beats GPT-5 on browsing
• Zhipu's GLM-5 and Alibaba's Qwen3.5 collapse long-context inference cost; Qwen3.5 is priced at 1/18th of Gemini 3
• ByteDance's Seedance 2.0 quietly rewrote AI-video economics: a four-person team made a viral $700 spot that hit 5B views
Lead story: Trained for $4.6M. Chains 300 tool calls. Beats GPT-5 on browsing.