changelogDeepSeek

DeepSeek cuts V4 Pro pricing by 75% to $0.003625 per million input tokens

TL;DR

DeepSeek has permanently reduced pricing for its V4 Pro model by 75%, bringing input token costs down to $0.003625 per million tokens from $0.0145. The move makes permanent a promotional discount that was set to expire May 31, 2026.

2 min read
0

DeepSeek V4 Pro — Quick Specs

Context window1000K tokens
Input$0.0036/1M tokens
Output$0.87/1M tokens

DeepSeek cuts V4 Pro pricing by 75% to $0.003625 per million input tokens

DeepSeek has permanently reduced pricing for its flagship V4 Pro model by 75%, according to an update on the company's website. Input tokens now cost $0.003625 per million, down from $0.0145, while output tokens dropped to $0.87 per million from $3.48.

The price reduction makes permanent a promotional discount that was originally scheduled to end on May 31, 2026. DeepSeek released the V4 Pro and V4 Flash models in April 2026, claiming they would usher in an "era of cost-effective 1M context length."

Pricing comparison

The new pricing structure positions DeepSeek V4 Pro significantly below competing models:

  • DeepSeek V4 Pro: $0.003625 input / $0.87 output per 1M tokens
  • Previous DeepSeek V4 Pro pricing: $0.0145 input / $3.48 output per 1M tokens

The company describes its positioning as the "cost-effective" choice for AI agents, a strategy that could deliver substantial savings for enterprise accounts and power users processing millions of tokens daily.

Market context

DeepSeek's aggressive pricing comes amid increasing competition in the large language model market. The Chinese startup is now positioned as a lower-cost alternative to OpenAI's GPT-5 and Google's recently released Gemini 3.5 Flash, though specific pricing comparisons for those models were not provided.

The pricing strategy follows previous tensions with competitors. Anthropic has accused DeepSeek of "distillation attacks" — a practice where one company's model improperly learns from another's more capable system. The permanent price cuts may intensify these competitive dynamics.

What this means

DeepSeek's permanent 75% price reduction represents a significant escalation in AI model pricing competition. For enterprise users running high-volume workloads, the cost difference could be substantial — a workload requiring 1 billion tokens per day would now cost approximately $3,625 for input tokens instead of $14,500. However, buyers should evaluate whether DeepSeek's performance matches their requirements, as raw pricing doesn't account for differences in model capabilities, accuracy, or output quality. The move also raises questions about the sustainability of such pricing and whether competitors will respond with their own cuts.

Related Articles

model release

DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens

Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.

changelog

Google switches Gemini to compute-based limits, cuts AI Ultra to $100/month

Google is replacing Gemini's daily prompt limits with a compute-based system that factors in prompt complexity, features used, and chat length. Limits refresh every five hours until reaching a weekly cap. AI Ultra, aimed at developers and technical leads, now starts at $100/month—down from its previous entry point—with 5x higher usage limits than the Pro plan.

model release

DeepSeek Releases V4 Flash: 284B-Parameter MoE Model with 1M Context Window, Free via OpenRouter

DeepSeek has released V4 Flash, a Mixture-of-Experts model with 284B total parameters and 13B activated parameters per forward pass. The model supports a 1M-token context window and is available free through OpenRouter, targeting high-throughput coding and chat applications.

changelog

Anthropic releases Claude Opus 4.7 Fast with 6x pricing for higher output speed

Anthropic has released Claude Opus 4.7 Fast, a speed-optimized variant of its Opus 4.7 model. The fast-mode version delivers identical capabilities with higher output speed at premium pricing: $30 per 1M input tokens and $150 per 1M output tokens, representing a 6x increase over standard pricing.

Comments

Loading...