DeepSeek V4 Pro

DeepSeek · 🇨🇳 China
active
Context window: 1,000K (1M) tokens
Input / 1M tokens: $0.0036
Output / 1M tokens: $0.87

Version History

v4-pro (major)

DeepSeek permanently reduced V4 Pro pricing by 75%, dropping input tokens to $0.003625 per million and output tokens to $0.87 per million. The rates that were previously promotional are now permanent.
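
As a rough illustration of what these rates mean per request, the sketch below estimates the cost of a single call and backs out the price implied before a 75% cut. The function names and the example token counts are hypothetical, chosen only for illustration; the two prices are the ones quoted above. At 75% off, the $0.87 output rate implies a prior rate of $3.48 per million, which agrees with the launch output price cited in the coverage below.

```python
# Per-million-token prices quoted in the version note above (USD).
INPUT_PRICE_PER_M = 0.003625
OUTPUT_PRICE_PER_M = 0.87


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-million rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def price_before_cut(current_price: float, reduction: float = 0.75) -> float:
    """Back out the price that a fractional reduction was applied to."""
    return current_price / (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical workload: a 200K-token prompt with a 4K-token reply.
    print(f"Estimated cost: ${request_cost(200_000, 4_000):.6f}")
    print(f"Implied pre-cut output price: ${price_before_cut(OUTPUT_PRICE_PER_M):.2f} per 1M tokens")
```

Because output tokens are priced far higher than input tokens here, the reply length dominates the bill even when the prompt is fifty times longer.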

v4-pro-preview (major)

Major release with 1.6T total parameters, 1M token context window, and substantial efficiency improvements over V3.2. DeepSeek claims near-frontier performance at a fraction of the cost.

v4 (major)

DeepSeek V4 introduces two model sizes with major architectural improvements: hybrid attention mechanisms reduce memory usage by 9.5x-13.7x while supporting 1M token contexts, and mixed FP4/FP8 precision further reduces memory footprint. Added Huawei Ascend NPU support.

Benchmark Scores

HumanEval: 76.8%
LiveCodeBench: 93.5%
MATH: 64.5%
MMLU: 90.1%
MMLU-Pro: 87.5%

Coverage

model release · DeepSeek

DeepSeek V4 cuts inference costs with 1.6T parameter model using 13.7x less memory than V3

DeepSeek released V4 in two versions: a 284 billion parameter Flash model and a 1.6 trillion parameter Pro model with 49 billion active parameters. According to DeepSeek, the models use 9.5x-13.7x less memory than V3 through compressed attention mechanisms and FP4/FP8 mixed precision, while supporting a 1 million token context window.

2 min read
model release · DeepSeek

DeepSeek V4 Pro launches with 1.6 trillion parameters, 1M token context at $0.145 per million input tokens

Chinese AI lab DeepSeek has released preview versions of DeepSeek V4 Flash and V4 Pro, mixture-of-experts models with 1 million token context windows. The V4 Pro has 1.6 trillion total parameters (49 billion active), making it the largest open-weight model available, while both models significantly undercut frontier model pricing.

2 min read
model release · DeepSeek

DeepSeek V4 Pro launches with 1.6T parameters at $1.74/M tokens, undercutting Claude Sonnet 4.6 by 42%

DeepSeek released two preview models: V4 Pro (1.6T total parameters, 49B active) and V4 Flash (284B total, 13B active), both with 1 million token context windows. V4 Pro is priced at $1.74/M input tokens and $3.48/M output—42% cheaper than Claude Sonnet 4.6—while V4 Flash at $0.14/$0.28 per million tokens undercuts all small frontier models.

2 min read
model release · DeepSeek

DeepSeek Releases V4-Pro: 1.6T Parameter MoE Model with 1M Token Context

DeepSeek released two new Mixture-of-Experts models: DeepSeek-V4-Pro with 1.6 trillion parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting one million token context length. The models achieve 27% of inference FLOPs and 10% of KV cache compared to DeepSeek-V3.2 at 1M context through a hybrid attention architecture combining Compressed Sparse Attention and Heavily Compressed Attention.

2 min read
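
The total and activated parameter counts quoted in the coverage above can be turned into an active fraction, which shows how sparse each mixture-of-experts model is per token. This is a minimal sketch using only those quoted figures; the `active_fraction` helper and the `MODELS` table are hypothetical names introduced for illustration.

```python
def active_fraction(active_params: float, total_params: float) -> float:
    """Return the share of total parameters that is activated per token."""
    return active_params / total_params


# (activated, total) parameter counts as quoted in the coverage summaries.
MODELS = {
    "V4 Pro": (49e9, 1.6e12),
    "V4 Flash": (13e9, 284e9),
}

for name, (active, total) in MODELS.items():
    share = active_fraction(active, total)
    print(f"{name}: {share:.1%} of parameters active per token")
```

By these figures, roughly 3% of V4 Pro's parameters and under 5% of V4 Flash's are active for any given token.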