DeepSeek V4 ships: 1M context, 7x cheaper than Claude Opus

27 articles scored today

In plain English

DeepSeek released V4-Pro today, a 1.6 trillion parameter open-weights model with a one million token context window, priced at roughly one-seventh of Anthropic's Claude Opus. The model uses a new hybrid attention design that cuts inference compute to roughly a quarter of the previous version's, and it has been tuned to run on Huawei's Ascend chips, which matters because US chip restrictions still cap what Chinese labs can buy from Nvidia. Meanwhile, Anthropic shipped Claude Opus 4.8 with a feature that orchestrates hundreds of parallel sub-agents via generated JavaScript, and Chinese labs Zhipu, MiniMax, and Moonshot are now openly benchmarking themselves against Claude rather than OpenAI. The throughline: Chinese frontier labs have stopped chasing GPT and started chasing agentic coding revenue, with pricing power to match.

V4 lands, Opus 4.8 lands the same week, and the Chinese lab pricing posture has clearly flipped from race-to-zero to value capture. Also: Qwen lead departure, ByteDance Seedance 2.0, and ModelBest's full-stack open release.

DeepSeek V4: hybrid attention, 1M context, Ascend-ready

Recode China AI · EN · sebmeter 92

V4-Pro is 1.6T total / 49B active with a 1M token window at $1.74 / $3.48 per million tokens. The interesting bit is the attention stack: CSA compresses every 4 tokens for sparse retrieval, HCA compresses every 128 tokens densely, and the two interleave. Result: 27% of V3.2's inference compute and 10% of its KV cache at 1M context. Add mHC residuals, the Muon optimizer, and explicit Ascend tuning with volume pricing landing H2 2026.
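To make the two compression strides concrete, here is a rough back-of-envelope sketch. It assumes each group of 4 tokens (CSA) or 128 tokens (HCA) collapses to a single compressed entry, which is a simplification of the summary above, not DeepSeek's actual implementation. It does not try to reproduce the 10% KV cache figure, which also depends on per-entry size and how the layers interleave.

```python
import math

CONTEXT_TOKENS = 1_000_000  # 1M-token window from the spec
CSA_STRIDE = 4              # CSA: compresses every 4 tokens
HCA_STRIDE = 128            # HCA: compresses every 128 tokens


def compressed_entries(context_tokens: int, stride: int) -> int:
    """Entries left if every `stride` tokens collapse to one (assumed model)."""
    return math.ceil(context_tokens / stride)


csa_entries = compressed_entries(CONTEXT_TOKENS, CSA_STRIDE)
hca_entries = compressed_entries(CONTEXT_TOKENS, HCA_STRIDE)

print(f"CSA entries at 1M tokens: {csa_entries:,}")
print(f"HCA entries at 1M tokens: {hca_entries:,}")
```

Under that assumption, a full window leaves 250,000 entries for the sparse CSA path and 7,813 for the dense HCA path, which gives a feel for why dense attention over the HCA side stays affordable at 1M tokens.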

Open-weights frontier model that actually runs on Chinese silicon, at a price that breaks the Claude comparison.
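For a sense of scale on the pricing, a minimal cost sketch follows. It assumes the two quoted figures are the input and output rates, in that order; the summary above does not label them, so treat that mapping and the helper name `request_cost` as assumptions.

```python
PRICE_IN_PER_M = 1.74   # USD per million tokens (assumed to be the input rate)
PRICE_OUT_PER_M = 3.48  # USD per million tokens (assumed to be the output rate)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the quoted per-million rates."""
    return (input_tokens / 1_000_000) * PRICE_IN_PER_M + (
        output_tokens / 1_000_000
    ) * PRICE_OUT_PER_M


# A request that fills the whole 1M window and returns a 10k-token answer.
full_context_cost = request_cost(1_000_000, 10_000)
print(f"Full-window request: ${full_context_cost:.4f}")
```

Under those assumptions, filling the entire window once costs about $1.77.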

Read at Recode China AI → link

Claude Opus 4.8 ships Dynamic Workflows: JS-orchestrated sub-agents

QbitAI · ZH · sebmeter 91

Opus 4.8 generates a JavaScript orchestration script that spawns hundreds of parallel sub-agents and stores intermediate state in script variables instead of the context window. This is architecturally distinct from prior Claude Code sub-agent loops: orchestration lives in code, not in conversation. Demo: Bun's Zig to Rust migration, 750k lines, 11 days, 99.8% test pass rate via overnight runs. The false-negative rate on code defects is a quarter of Opus 4.7's. The token bill is materially higher.
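The pattern can be sketched in a few lines. This is an illustration in Python rather than the JavaScript that Opus 4.8 reportedly emits, and it is not Anthropic's implementation: `run_sub_agent`, `MAX_PARALLEL`, and the file names are hypothetical stand-ins. The point is only that fan-out and intermediate results live in ordinary program variables.

```python
import asyncio

MAX_PARALLEL = 100  # hypothetical cap on concurrent sub-agents


async def run_sub_agent(task_id: int, file_path: str) -> dict:
    """Hypothetical stub; a real sub-agent would call a model here."""
    await asyncio.sleep(0)  # yield control, standing in for async work
    return {"task_id": task_id, "file": file_path, "status": "ok"}


async def orchestrate(file_paths: list[str]) -> dict[str, dict]:
    # Intermediate state is kept in a plain dict, not in a context window.
    results: dict[str, dict] = {}
    gate = asyncio.Semaphore(MAX_PARALLEL)

    async def guarded(task_id: int, path: str) -> None:
        async with gate:  # limit how many sub-agents run at once
            results[path] = await run_sub_agent(task_id, path)

    await asyncio.gather(*(guarded(i, p) for i, p in enumerate(file_paths)))
    return results


# Hypothetical work list: one sub-agent per source file.
files = [f"src/module_{i}.zig" for i in range(300)]
state = asyncio.run(orchestrate(files))
failed = [path for path, result in state.items() if result["status"] != "ok"]
print(f"{len(state)} sub-agent results, {len(failed)} failures")
```

Because the results sit in `state`, a follow-up step can filter, retry, or aggregate them in code without replaying every sub-agent transcript through the model.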

If sub-agents now coordinate via emitted code rather than context-window stitching, that is a real pattern shift for agent stacks.

Read at QbitAI → link

Chinese labs are now benchmarking against Anthropic, not OpenAI

Recode China AI · EN · sebmeter 85

Zhipu, MiniMax, and Moonshot have all retargeted on Claude as the reference. GLM-5.1 claims 58.4 on SWE-Bench Pro versus Opus 4.6 at 57.3. MiniMax M2.7 ran 100+ autonomous scaffold rounds for a self-reported 30% gain. Moonshot's revenue in the 20 days after K2.6 beat its full-year 2025 total. Zhipu raised API prices 83% in Q1 and call volume still went up, which is the opposite of the 2024 race to zero.

Pricing power on Chinese open-weights is the leading indicator that agentic-coding revenue is real, not vaporware.

Read at Recode China AI → link

GLM-5 and Qwen 3.5 close the gap, DeepSeek sits out the cycle

Recode China AI · EN · sebmeter 85

GLM-5 is positioned against Claude Opus 4.5, Qwen 3.5 against Gemini 3.0. The lag behind US releases is now under three months. DeepSeek is conspicuously absent from this round, which fits the V4-delay reporting elsewhere in today's pile.

If you pick open-weights for production, this is the calibration read for which Chinese model to swap in next.

Read at Recode China AI → link

DeepSeek V4 preview: sparsity as the scaling lever

Recode China AI · EN · sebmeter 82

Companion piece to the V4 launch coverage. Thesis: with Nvidia access throttled, DeepSeek is treating sparse activation as the primary way to keep scaling intelligence per FLOP. MoE design choices and activation patterns frame the roadmap. Worth reading alongside the V4 specs.

Sparsity-first is the architectural story that explains both V4's price and its Ascend portability.

Read at Recode China AI → link

ByteDance ships Seedance 2.0 and Seed2.0

Recode China AI · EN · sebmeter 80

Seedance 2.0 for video generation, Seed2.0 as the foundation model. Framing puts ByteDance in the same tier as OpenAI and Google on capability step-change. For production video pipelines, the question is whether Seedance 2.0 actually beats Kling and the new Alibaba HappyHorse-1.0 in real workflows.

Direct input for your video-gen stack: another viable Chinese vendor with real distribution behind it.

Read at Recode China AI → link

Inside DeepSeek and Moonshot: org structure, departures, chip strategy

Recode China AI · EN · sebmeter 78

V4 missed Chinese New Year. A small-parameter test build went to compatibility partners in January, with the full release possibly in April. R1 co-author Guo Daya and LLM co-author Wang Bingxuan left, the latter to Tencent. DeepSeek is still resisting pressure to build multimodal and agent products so it can stay focused on foundation research, and its no-overtime policy is intact. Moonshot runs flat: no KPIs, no titles, and an $18B valuation on K2.5 traction.

Talent flows and org culture predict release cadence better than any roadmap deck.

Read at Recode China AI → link

Chinese chips quietly cross the training threshold

Recode China AI · EN · sebmeter 78

Baidu trained a key ERNIE 5.1 variant (800B MoE distilled from 2.4T ERNIE 5.0 at 6% pretraining cost) on Kunlunxin P800, with a 97% effective training rate across 10k-card clusters. Meituan's 1T LongCat 2.0 was also trained domestically. GLM-5.1 and DeepSeek V4 were adapted for domestic inference. Alibaba's Zhenwu M890: 144GB memory, 800GB/s inter-chip bandwidth, 560k units already shipped. Honest caveat: flagship scale yes, true frontier not yet.

The Nvidia dependency story is shifting faster than the model release cycle reveals.

Read at Recode China AI → link

DeepSeek, Moonshot, StepFun all close megarounds

Recode China AI · EN · sebmeter 72

DeepSeek is raising up to $7B at a $50B valuation, its first outside money, with Liang Wenfeng personally in for ~$3B. Moonshot is at $200M ARR and raising $2B at a $20B valuation, led by Meituan. Kimi K2.6 is a trillion-parameter open-weights model supporting 300 concurrent sub-agents. StepFun is closing ~$2.5B.

Capital flowing on agentic revenue, not vibes, which is what sustains the open-weights pipeline you depend on.

Read at Recode China AI → link

DeepSeek as the "Huawei of AI": strategic-scarcity premium

ChinAI Newsletter · EN · sebmeter 72

A translated Chinese analysis reads DeepSeek's $50B valuation as a national-strategic bet, not a revenue multiple. V4's Ascend inference optimization gets called a historic chip-model coupling. The China IC Big Fund, normally a semiconductor-only investor, is reportedly interested in leading. Ding's caveat: training is still likely on Nvidia, and the Ascend wins are inference-side.

Frames why DeepSeek's valuation isn't comparable to Western lab rounds, and why V4's Ascend story matters politically.

Read at ChinAI Newsletter → link

Alibaba funds PixVerse and Vidu, then competes with both

Recode China AI · EN · sebmeter 72

PixVerse (100M users, $40M ARR, 16M MAU) and Vidu each pulled ~$300M, both with Alibaba money, both competing with Alibaba's own video models. Alibaba's internal ATH unit released HappyHorse-1.0 anonymously and topped Artificial Analysis on both text-to-video and image-to-video before claiming it. PixVerse R1 is a real-time interactive world model, roughly 2s latency at 1080p. Seedance 2.0 named as primary competitor.

For creative production: HappyHorse-1.0 is now a quality leader and worth a test pass.

Read at Recode China AI → link