DeepSeek V4 ships: 1M context, 7x cheaper than Claude Opus
V4 lands, Opus 4.8 lands the same week, and the Chinese lab pricing posture has clearly flipped from race-to-zero to value capture. Also: Qwen lead departure, ByteDance Seedance 2.0, and ModelBest's full-stack open release.
DeepSeek V4: hybrid attention, 1M context, Ascend-ready
V4-Pro has 1.6T total / 49B active parameters and a 1M-token window, priced at $1.74 / $3.48 per million tokens. The interesting bit is the attention stack: CSA compresses every 4 tokens for sparse retrieval, HCA compresses every 128 tokens densely, and the two interleave. Result: 27% of V3.2's inference compute and 10% of its KV cache at 1M context. Also in the release: mHC residuals, the Muon optimizer, and explicit Ascend tuning, with volume pricing landing in H2 2026.
Open-weights frontier model that actually runs on Chinese silicon, at a price that breaks the Claude comparison.
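A back-of-envelope sketch of the numbers in this item, not anything from DeepSeek's own materials. It assumes the two quoted prices are input and output respectively, takes "1M" as exactly 1,000,000 tokens, and reads "compresses every N tokens" as one compressed entry per N tokens. The function names are made up for illustration.

```python
# Figures quoted in the item above. Which price is input and which is output
# is an assumption, as is treating "1M" as exactly 1,000,000 tokens.
PRICE_IN_PER_M = 1.74    # USD per million input tokens (assumed: first figure)
PRICE_OUT_PER_M = 3.48   # USD per million output tokens (assumed: second figure)
CSA_STRIDE = 4           # CSA: assumed one compressed entry per 4 tokens
HCA_STRIDE = 128         # HCA: assumed one compressed entry per 128 tokens


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the quoted per-million-token prices."""
    total = input_tokens * PRICE_IN_PER_M + output_tokens * PRICE_OUT_PER_M
    return total / 1_000_000


def compressed_entries(context_tokens: int, stride: int) -> int:
    """Entries left after folding every `stride` tokens into one (ceiling division)."""
    return -(-context_tokens // stride)


if __name__ == "__main__":
    ctx = 1_000_000
    print(f"1M-token prompt + 2k-token reply: ${request_cost(ctx, 2_000):.2f}")
    print(f"CSA entries at 1M context: {compressed_entries(ctx, CSA_STRIDE):,}")
    print(f"HCA entries at 1M context: {compressed_entries(ctx, HCA_STRIDE):,}")
```

The entry counts only show the scale gap between the two compression rates. They do not reproduce the 10% KV-cache figure, which depends on details the item does not give.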
Read at Recode China AI → link
Claude Opus 4.8 ships Dynamic Workflows: JS-orchestrated sub-agents
Opus 4.8 generates a JavaScript orchestration script that spawns hundreds of parallel sub-agents and stores intermediate state in script variables instead of the context window. This is architecturally distinct from prior Claude Code sub-agent loops: orchestration lives in code, not in conversation. Demo: Bun's Zig-to-Rust migration, 750k lines, 11 days, 99.8% test pass rate via overnight runs. The code-defect false-negative rate is a quarter of Opus 4.7's. The token bill is materially higher.
If sub-agents now coordinate via emitted code rather than context-window stitching, that is a real pattern shift for agent stacks.
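To make "orchestration lives in code" concrete, here is a minimal sketch of the pattern. The item describes a JavaScript script; this is a Python analogue, not Anthropic's actual output. `run_subagent` is a hypothetical stub standing in for a real model call, and the file names are invented.

```python
import asyncio


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for a sub-agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"done:{task}"


async def orchestrate(tasks: list[str], max_parallel: int = 8) -> dict[str, str]:
    """Fan tasks out to sub-agents and keep results in a dict, not in a prompt."""
    limit = asyncio.Semaphore(max_parallel)
    results: dict[str, str] = {}  # intermediate state lives in a script variable

    async def worker(task: str) -> None:
        async with limit:  # cap how many sub-agents run at once
            results[task] = await run_subagent(task)

    await asyncio.gather(*(worker(t) for t in tasks))
    return results


if __name__ == "__main__":
    files = [f"src/module_{i}.zig" for i in range(20)]  # invented file names
    state = asyncio.run(orchestrate(files))
    print(len(state), "results held in script state")
```

The point of the design is that `results` never has to be serialized back into a conversation. The orchestrator can hold thousands of entries while each sub-agent sees only its own task.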
Read at QbitAI → link
Chinese labs are now benchmarking against Anthropic, not OpenAI
Zhipu, MiniMax, and Moonshot have all retargeted Claude as the reference. GLM-5.1 claims 58.4 on SWE-Bench Pro versus Opus 4.6 at 57.3. MiniMax M2.7 ran 100+ autonomous scaffold rounds for a self-reported 30% gain. Moonshot's K2.6 revenue over 20 days beat its full-year 2025 revenue. Zhipu raised API prices 83% in Q1 and call volume still went up, which is the opposite of the 2024 race to zero.
Pricing power on Chinese open-weights is the leading indicator that agentic-coding revenue is real, not vaporware.
Read at Recode China AI → link
GLM-5 and Qwen 3.5 close the gap, DeepSeek sits out the cycle
GLM-5 is positioned against Claude Opus 4.5, Qwen 3.5 against Gemini 3.0. The lag behind US releases is now under three months. DeepSeek is conspicuously absent from this round, which fits the V4-delay reporting elsewhere in today's pile.
If you pick open-weights for production, this is the calibration read for which Chinese model to swap in next.
Read at Recode China AI → link
DeepSeek V4 preview: sparsity as the scaling lever
Companion piece to the V4 launch coverage. Thesis: with Nvidia access throttled, DeepSeek is treating sparse activation as the primary way to keep scaling intelligence per FLOP. MoE design choices and activation patterns frame the roadmap. Worth reading alongside the V4 specs.
Sparsity-first is the architectural story that explains both V4's price and its Ascend portability.
Read at Recode China AI → link
ByteDance ships Seedance 2.0 and Seed2.0
Seedance 2.0 is the video generation model, Seed2.0 the foundation model. The framing puts ByteDance in the same tier as OpenAI and Google on capability step-change. For production video pipelines, the question is whether Seedance 2.0 actually beats Kling and the new Alibaba HappyHorse-1.0 in real workflows.
Direct input for your video-gen stack: another viable Chinese vendor with real distribution behind it.
Read at Recode China AI → link
Inside DeepSeek and Moonshot: org structure, departures, chip strategy
V4 missed Chinese New Year. A small-param test build went to compatibility partners in January, with the full release possibly in April. R1 co-author Guo Daya and LLM co-author Wang Bingxuan left, the latter to Tencent. DeepSeek is still resisting pressure to build multimodal and agent products so it can stay focused on foundation research, with its no-overtime policy intact. Moonshot runs flat, with no KPIs and no titles, and is valued at $18B on K2.5 traction.
Talent flows and org culture predict release cadence better than any roadmap deck.
Read at Recode China AI → link
Chinese chips quietly cross the training threshold
Baidu trained a key ERNIE 5.1 variant (800B MoE distilled from 2.4T ERNIE 5.0 at 6% pretraining cost) on Kunlunxin P800, with a 97% effective training rate across 10k-card clusters. Meituan's 1T LongCat 2.0 was also trained domestically. GLM-5.1 and DeepSeek V4 were adapted for domestic inference. Alibaba's Zhenwu M890: 144GB memory, 800GB/s inter-chip bandwidth, 560k units already shipped. Honest caveat: flagship scale yes, true frontier not yet.
The Nvidia dependency story is shifting faster than the model release cycle reveals.
Read at Recode China AI → link
DeepSeek, Moonshot, StepFun all close megarounds
DeepSeek is raising up to $7B at a $50B valuation, its first outside money, with Liang Wenfeng personally in for ~$3B. Moonshot is at $200M ARR and raising $2B at a $20B valuation, led by Meituan. Kimi K2.6 is a trillion-parameter open-weights model supporting 300 concurrent sub-agents. StepFun is closing ~$2.5B.
Capital is flowing on agentic revenue, not vibes, which is what sustains the open-weights pipeline you depend on.
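For scale, a quick sketch of what the quoted round sizes imply. It treats each valuation as post-money, which the item does not state, and uses only the dollar figures quoted above. The helper names are made up for illustration.

```python
def implied_stake(raise_usd: float, valuation_usd: float) -> float:
    """Fraction of the company sold, treating the valuation as post-money (assumption)."""
    return raise_usd / valuation_usd


def arr_multiple(valuation_usd: float, arr_usd: float) -> float:
    """Valuation expressed as a multiple of annual recurring revenue."""
    return valuation_usd / arr_usd


if __name__ == "__main__":
    # DeepSeek: up to $7B at $50B, with ~$3B from Liang Wenfeng
    print(f"DeepSeek implied stake: {implied_stake(7e9, 50e9):.0%}")
    print(f"Liang's share of the round: {3e9 / 7e9:.0%}")
    # Moonshot: $2B at $20B on $200M ARR
    print(f"Moonshot implied stake: {implied_stake(2e9, 20e9):.0%}")
    print(f"Moonshot valuation / ARR: {arr_multiple(20e9, 200e6):.0f}x")
```

If the valuations are pre-money instead, the implied stakes come out slightly lower.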
Read at Recode China AI → link
DeepSeek as the "Huawei of AI": strategic-scarcity premium
A translated Chinese analysis reads DeepSeek's $50B valuation as a national-strategic bet, not a revenue multiple. V4's Ascend inference optimization gets called a historic chip-model coupling. The China IC Big Fund, normally a semiconductor-only investor, is reportedly interested in leading. ChinAI author Jeffrey Ding's caveat: training is still likely on Nvidia, and the Ascend wins are inference-side.
Frames why DeepSeek's valuation isn't comparable to Western lab rounds, and why V4's Ascend story matters politically.
Read at ChinAI Newsletter → link
Alibaba funds PixVerse and Vidu, then competes with both
PixVerse (100M users, $40M ARR, 16M MAU) and Vidu each raised ~$300M, both with Alibaba money, and both compete with Alibaba's own video models. Alibaba's internal ATH unit released HappyHorse-1.0 anonymously and topped Artificial Analysis on both text-to-video and image-to-video before claiming it. PixVerse R1 is a real-time interactive world model with roughly 2s latency at 1080p. Seedance 2.0 is named as the primary competitor.
For creative production: HappyHorse-1.0 is now a quality leader and worth a test pass.
Read at Recode China AI → link