- DeepSeek has made its 75% discount on V4-Pro permanent, lowering prices to $0.435 per million input tokens and $0.87 per million output tokens.
- Xiaomi continues to shock the market, cutting MiMo-V2.5 API prices by up to 99%; cached tokens on the Pro version now cost as little as $0.0036 per million.
- Xiaomi’s $100 Max package now includes 82 billion tokens (equivalent to over 60 billion words), up from 1.6 billion previously.
- According to Xiaomi, KV cache improvements reduced storage and processing costs by about 80%, while the service continues to operate near breakeven.
- DeepSeek V4-Pro scored 80.6% on SWE-bench Verified, close to Claude Opus 4.6 (80.8%), at an output cost roughly 34 times lower.
- GPT-5.5’s output price has risen to $30 per million tokens, while Claude Opus 4.7’s new tokenizer could inflate token counts for the same text by up to 35%.
- Other advanced Chinese models like MiniMax M2.7, Kimi K2.5, and GLM-5.1 are also significantly cheaper than their US competitors.
- The cost gap between leading Chinese and US AI models currently ranges from 15 to 30 times, and is even wider for AI agent applications that rely heavily on caching.
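
The ratios behind these bullets can be checked with a few lines of arithmetic. The sketch below uses only prices quoted above. The one US output price the list gives is GPT-5.5's $30 per million tokens, so that figure stands in for "a leading US model". The function and constant names are illustrative, not from any vendor SDK.

```python
# Figures quoted in the bullets above (USD per million tokens unless noted).
DEEPSEEK_V4_PRO_OUTPUT = 0.87
GPT_5_5_OUTPUT = 30.00

# Xiaomi $100 Max package: token allowance before and after the change.
MAX_PACKAGE_PRICE_USD = 100.0
MAX_TOKENS_OLD = 1.6e9
MAX_TOKENS_NEW = 82e9


def price_ratio(expensive: float, cheap: float) -> float:
    """How many times more expensive one price is than another."""
    return expensive / cheap


def usd_per_million_tokens(package_usd: float, tokens: float) -> float:
    """Effective per-million-token price of a prepaid package."""
    return package_usd / (tokens / 1e6)


output_gap = price_ratio(GPT_5_5_OUTPUT, DEEPSEEK_V4_PRO_OUTPUT)
package_growth = MAX_TOKENS_NEW / MAX_TOKENS_OLD
effective_old = usd_per_million_tokens(MAX_PACKAGE_PRICE_USD, MAX_TOKENS_OLD)
effective_new = usd_per_million_tokens(MAX_PACKAGE_PRICE_USD, MAX_TOKENS_NEW)
effective_cut = 1 - effective_new / effective_old

print(f"GPT-5.5 vs DeepSeek V4-Pro output price: {output_gap:.1f}x")
print(f"Max package token growth: {package_growth:.2f}x")
print(f"Max package effective price: ${effective_old:.4f} -> ${effective_new:.4f} per million tokens")
print(f"Effective price cut on the Max package: {effective_cut:.1%}")
```

On these inputs the output-price gap comes to about 34.5 times, and the Max package's effective per-token price falls by roughly 98%.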
📌 The AI war is shifting from a competition on performance to a competition on cost. DeepSeek and Xiaomi are not trimming prices by a few percent; they are driving AI costs down by as much as 98–99% relative to many leading US models. With performance approaching GPT and Claude at prices dozens of times lower, enterprises deploying AI agents, document processing, and large-scale automation have a strong incentive to switch to open-source or Chinese models to cut operating costs.
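
To see what these prices mean for an operating budget, here is a minimal sketch that estimates a monthly bill from per-million-token list prices. Only the DeepSeek V4-Pro prices come from the figures above. The workload size is a made-up example, and `Pricing`, `monthly_cost`, and `token_inflation` are hypothetical names. The `token_inflation` knob models a tokenizer that splits the same text into more tokens, assuming it affects input and output equally. The 35% figure was reported for Claude Opus 4.7, whose prices are not given here, so it is applied to the DeepSeek bill purely to show the mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """List prices in USD per million tokens."""
    input_per_million: float
    output_per_million: float


def monthly_cost(
    pricing: Pricing,
    input_tokens: float,
    output_tokens: float,
    token_inflation: float = 1.0,
) -> float:
    """Estimate a monthly bill in USD.

    token_inflation scales both token counts, modelling a tokenizer that
    produces more tokens for the same text (1.35 = 35% more tokens).
    """
    millions_in = input_tokens * token_inflation / 1e6
    millions_out = output_tokens * token_inflation / 1e6
    return (
        millions_in * pricing.input_per_million
        + millions_out * pricing.output_per_million
    )


# Prices quoted above for DeepSeek V4-Pro.
deepseek_v4_pro = Pricing(input_per_million=0.435, output_per_million=0.87)

# Hypothetical agent workload (an assumption, not a figure from the text).
INPUT_TOKENS = 2_000_000_000
OUTPUT_TOKENS = 500_000_000

baseline = monthly_cost(deepseek_v4_pro, INPUT_TOKENS, OUTPUT_TOKENS)
inflated = monthly_cost(
    deepseek_v4_pro, INPUT_TOKENS, OUTPUT_TOKENS, token_inflation=1.35
)

print(f"Baseline monthly cost: ${baseline:,.2f}")
print(f"Same text with 35% more tokens: ${inflated:,.2f}")
```

Because the bill is linear in token count, a 35% rise in tokens per unit of text raises the bill by 35% at unchanged list prices. A tokenizer change can therefore act as a price increase even when the price sheet stays the same.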
