DeepSeek V4 cache pricing cut: 83% off agent coding bills
Pricing floor drops again on V4, Tencent ships Hermes support in QClaw, and Anthropic posts a three-bug Claude Code postmortem.
DeepSeek V4 permanent price cut: cache hits at 1/10 the price
Cache-hit input tokens are now billed at one tenth of the already-reduced rate, stacking on the earlier 75 percent cut to base input/output pricing. A 35M-token agent coding run came in at ¥5.34, down from ¥31.73 before the cut, an 83 percent drop. V4-Pro reportedly hits cache around 96 percent of the time in practice, so almost every billable input token lands in the cheapest bucket. The article also flags DeepSeek's plan to scale Huawei compute in H2, which could leave room for further cuts.
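The arithmetic behind those figures can be sketched in a few lines. This is a rough illustration only: the function names are made up for this example, the model ignores output tokens, and it assumes the cache-hit price is one tenth of the cache-miss input price, which is one reading of the "1/10" claim rather than a confirmed price sheet.

```python
def percent_drop(before: float, after: float) -> float:
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


def blended_input_multiplier(hit_rate: float, hit_price_ratio: float) -> float:
    """Average input cost per token, relative to the cache-miss price.

    hit_rate: fraction of input tokens served from cache (0..1).
    hit_price_ratio: cache-hit price divided by cache-miss price
                     (assumed value, not taken from a price sheet).
    """
    return hit_rate * hit_price_ratio + (1 - hit_rate) * 1.0


# Figures quoted in the item: ¥31.73 before, ¥5.34 after.
drop = percent_drop(before=31.73, after=5.34)

# Reported ~96% hit rate; the 0.10 ratio is the assumption described above.
blend = blended_input_multiplier(hit_rate=0.96, hit_price_ratio=0.10)

print(f"bill reduction: {drop:.1f}%")
print(f"blended input cost vs. cache-miss price: {blend:.3f}x")
```

Under these assumptions, the uncached 4 percent of input tokens accounts for a sizeable share of the blended input cost, so the real bill is sensitive to how stable the hit rate is across runs.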
Direct hit on agency infra economics: the unit cost of running coding agents on Chinese stacks just collapsed again.
Read at QbitAI → qbitai.com
Skim pile
- Tencent QClaw 0.2.14 adds Hermes, DeepSeek V4, Hunyuan 3 · 36Kr AI · agent framework support plus new connectors and Tencent Docs collaboration · 62
- Anthropic confirms three bugs behind Claude Code degradation · QbitAI · silent reasoning downgrade, cache wipe, and a 25-word output cap, all rolled back · 52
- Huawei ADS 5 with WEWA 2.0 world model, RMB 18B 2026 spend · QbitAI · multi-agent cloud world engine, Qiankun OS, online RL with claimed 10x efficiency · 45