DeepSeek V4 runs on roughly a quarter of the inference compute. The Huawei pivot cost them senior staff.
Two solid V4 post-mortems landed the same day. Read them together: ChinAI on the architecture, ChinaTalk on the org rot.
ChinAI #356: DeepSeek as Road Builder [修路人]
V4-Pro, at 1.6T params, needs 27% of V3.2's per-token inference FLOPs and 10% of its KV cache, with a 1M-token context. The Engram architecture delivers the sharper number: tasks that needed 80GB of VRAM are compressed to 8GB. DeepSeek is shipping a new DSL called TileLang and pushing Huawei and Cambricon as launch partners on the inference side, a deliberate end-run around CUDA. Jeff frames the lab as a road builder, not a product shop, which is a useful contrast against Kimi.
Efficiency numbers you can actually plug into a deployment plan, plus the clearest signal yet that domestic-chip inference is becoming a first-class target.
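As a rough illustration of plugging those ratios into a plan, here is a minimal Python sketch. It assumes you already have a measured V3.2 baseline; the function name and the baseline values are hypothetical, and only the 27% and 10% ratios come from the summary above.

```python
# Ratios reported in the summary above (V4-Pro relative to V3.2).
FLOPS_RATIO = 0.27     # per-token inference FLOPs
KV_CACHE_RATIO = 0.10  # KV cache size


def scale_v32_baseline(flops_per_token, kv_cache_gb):
    """Scale a measured V3.2 baseline by the reported V4-Pro ratios."""
    return {
        "flops_per_token": flops_per_token * FLOPS_RATIO,
        "kv_cache_gb": kv_cache_gb * KV_CACHE_RATIO,
    }


# Hypothetical baseline values, for illustration only.
estimate = scale_v32_baseline(flops_per_token=1.0e11, kv_cache_gb=40.0)
print(estimate)
```

Treat the output as a first-pass estimate: the ratios are per-token figures and say nothing about throughput on a specific chip.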
ChinaTalk: DeepSeek V4
The reason V4 slipped: a failed training run during the Nvidia-to-Ascend migration, layered on top of internal disagreements under Liang Wenfeng. Multimodal generation was scoped out for compute and cash reasons. DeepSeek opened an external financing window in mid-April 2026 after losing senior people across its LLM, agents, OCR, and multimodal teams to Tencent, ByteDance, Xiaomi, and DeepRoute. The authors call V4 three to six months behind the frontier; ChinaTalk thinks it's further.
The counterweight to the road-builder narrative. Hardware sovereignty has a price, and DeepSeek is paying it in talent and time.
Skim pile
- AI can really make money now! This company has turned large models into a closed-loop money-making machine · QbitAI (ZH) · Lingxi Technology claims it has fully replaced human sales agents with AI in insurance and finance, billing on a Results-as-a-Service model, with Chery and Yuanfudao as clients · 58