DeepSeek V4 cache pricing cut: 83% off agent coding bills

4 articles scored today

Pricing floor drops again on V4, Tencent ships Hermes support in QClaw, and Anthropic posts a three-bug Claude Code postmortem.

DeepSeek V4 permanent price cut: cache hits at 1/10

QbitAI · zh · sebmeter 82

Cache-hit input tokens are now billed at one tenth of the already-reduced rate, stacking on the earlier 75 percent cut to base input/output pricing. A 35M-token agent coding run came in at ¥5.34 versus ¥31.73 before, an 83 percent drop. V4-Pro reportedly hits cache around 96 percent of the time in practice, so almost every billable token lands in the cheapest bucket. The article also flags DeepSeek's plan to scale Huawei compute in H2, which points to further cuts.

Direct hit on agency infra economics: the unit cost of running coding agents on Chinese stacks just collapsed again.

Read at QbitAI → qbitai.com

Skim pile

Send a short note about your brand, your markets, and your stack — we’ll come back with whether this fits and what version of the system makes sense. First call is a scoping conversation, not a pitch.

Kiri Media AB
Kungstensgatan 27
113 57 Stockholm
Sweden
Contact
sebastian@kirimedia.co +46 8 000 00 00
Explore
SEO agency Google Ads agency Meta Ads agency AI marketing Guides