DeepSeek cuts V4 Pro pricing by 75% to $0.003625 per million input tokens
DeepSeek has permanently reduced pricing for its V4 Pro model by 75%, bringing input token costs down to $0.003625 per million tokens from $0.0145. The move makes permanent a promotional discount that was set to expire May 31, 2026.
DeepSeek has permanently reduced pricing for its flagship V4 Pro model by 75%, according to an update on the company's website. Input tokens now cost $0.003625 per million, down from $0.0145, while output tokens dropped to $0.87 per million from $3.48.
The price reduction makes permanent a promotional discount that was originally scheduled to end on May 31, 2026. DeepSeek released the V4 Pro and V4 Flash models in April 2026, claiming they would usher in an "era of cost-effective 1M context length."
Pricing comparison
The list below sets the new V4 Pro rates against the model's previous pricing:
- DeepSeek V4 Pro: $0.003625 input / $0.87 output per 1M tokens
- Previous DeepSeek V4 Pro pricing: $0.0145 input / $3.48 output per 1M tokens
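The 75% figure can be checked directly against the listed rates. The snippet below is a minimal sketch that uses only the prices quoted above; the names `OLD`, `NEW`, and `percent_reduction` are illustrative and do not come from DeepSeek.

```python
# Listed prices in USD per 1M tokens, copied from the figures quoted above.
OLD = {"input": 0.0145, "output": 3.48}
NEW = {"input": 0.003625, "output": 0.87}


def percent_reduction(old: float, new: float) -> float:
    """Return the price cut as a percentage of the old price."""
    return (old - new) / old * 100


# Round to two decimals to absorb floating-point noise.
reductions = {kind: round(percent_reduction(OLD[kind], NEW[kind]), 2) for kind in OLD}
print(reductions)
```

Both the input and output rates come out at a 75% cut, so the new prices are one quarter of the old ones.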
The company describes its positioning as the "cost-effective" choice for AI agents, a strategy that could deliver substantial savings for enterprise accounts and power users processing millions of tokens daily.
Market context
DeepSeek's aggressive pricing comes amid increasing competition in the large language model market. The Chinese startup is now positioned as a lower-cost alternative to OpenAI's GPT-5 and Google's recently released Gemini 3.5 Flash, though specific pricing comparisons for those models were not provided.
The pricing strategy follows previous tensions with competitors. Anthropic has accused DeepSeek of "distillation attacks," a practice in which one company's model is improperly trained on the outputs of another's more capable system. The permanent price cuts may intensify these competitive dynamics.
What this means
DeepSeek's permanent 75% price reduction represents a significant escalation in AI model pricing competition. For enterprise users running high-volume workloads, the savings add up: at the listed rates, a workload consuming 1 billion input tokens per day (1,000 million) would now cost about $3.63 per day instead of $14.50, and 1 billion output tokens per day would cost $870 instead of $3,480. However, buyers should evaluate whether DeepSeek's performance matches their requirements, as raw pricing doesn't account for differences in model capabilities, accuracy, or output quality. The move also raises questions about the sustainability of such pricing and whether competitors will respond with their own cuts.
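For readers who want to plug in their own volumes, the per-day cost follows from the per-million rates with one multiplication. This is a rough sketch that assumes the listed prices apply uniformly, with no caching discounts or volume tiers; `daily_cost` and `TOKENS_PER_DAY` are hypothetical names used only for illustration.

```python
def daily_cost(tokens: int, price_per_million: float) -> float:
    """Cost in USD for `tokens` tokens at a price quoted per 1M tokens."""
    return tokens / 1_000_000 * price_per_million


TOKENS_PER_DAY = 1_000_000_000  # 1 billion tokens, i.e. 1,000 million

# Input-token cost per day at the new and previous listed rates.
new_input = daily_cost(TOKENS_PER_DAY, 0.003625)
old_input = daily_cost(TOKENS_PER_DAY, 0.0145)

# Output-token cost per day for the same volume.
new_output = daily_cost(TOKENS_PER_DAY, 0.87)
old_output = daily_cost(TOKENS_PER_DAY, 3.48)

print(f"input:  ${new_input:,.2f} vs ${old_input:,.2f}")
print(f"output: ${new_output:,.2f} vs ${old_output:,.2f}")
```

At these rates, output tokens dominate the bill for any workload that generates a meaningful share of its tokens, so the input price alone understates what a deployment would actually pay.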