model releaseDeepSeek

DeepSeek Releases V4 Pro: 1.6T Parameter MoE Model with 1M Token Context at $1.74/M Input Tokens

TL;DR

DeepSeek has released V4 Pro, a Mixture-of-Experts model with 1.6 trillion total parameters and 49 billion activated parameters. The model supports a 1-million-token context window and costs $1.74 per million input tokens and $3.48 per million output tokens.

April 24, 2026 · 4:21 AM2 min read

DeepSeek V4 Pro — Quick Specs

Context window1049K tokens

Input$1.74/1M tokens

Output$3.48/1M tokens

Compare DeepSeek V4 Pro with other models →

DeepSeek Releases V4 Pro: 1.6T Parameter MoE Model with 1M Token Context

DeepSeek has released V4 Pro, a large-scale Mixture-of-Experts model with 1.6 trillion total parameters and 49 billion activated parameters, supporting a 1-million-token context window. The model is priced at $1.74 per million input tokens and $3.48 per million output tokens.

Architecture and Capabilities

According to DeepSeek, V4 Pro is built on the same architecture as DeepSeek V4 Flash and introduces a hybrid attention system designed for efficient long-context processing. The model supports multiple reasoning modes that allow users to balance speed and depth depending on task requirements.

The company claims the model delivers strong performance across knowledge, mathematics, and software engineering benchmarks, though specific benchmark scores have not been disclosed.

Target Use Cases

DeepSeek positions V4 Pro for complex workloads including:

Full-codebase analysis
Multi-step automation
Large-scale information synthesis
Advanced reasoning tasks
Long-horizon agent workflows

The 1-million-token context window enables processing of entire codebases or lengthy documents in a single inference call.

Pricing and Availability

V4 Pro is available through OpenRouter as of April 24, 2026. At $1.74 per million input tokens, it sits in the mid-range pricing tier for frontier models. The 2:1 output-to-input pricing ratio ($3.48 vs $1.74) is standard for models with generation-heavy workloads.

Technical Details

The Mixture-of-Experts architecture activates 49 billion parameters per forward pass while maintaining 1.6 trillion total parameters. This approach aims to provide capabilities comparable to dense models of similar active parameter count while reducing computational costs.

OpenRouter integration includes support for DeepSeek's reasoning modes, with developers able to access step-by-step thinking processes through the reasoning_details array in API responses.

What This Means

V4 Pro represents DeepSeek's entry into the ultra-long-context market dominated by models like Anthropic's Claude and Google's Gemini. The 1M token context window and competitive pricing make it viable for enterprise use cases requiring analysis of large documents or codebases. However, without published benchmark scores, direct performance comparisons to established models remain unclear. The MoE architecture suggests DeepSeek is prioritizing inference efficiency alongside capability, a trend across Chinese AI labs competing with Western frontier model providers.

Source: openrouter.ai ↗

DeepSeek V4 Pro MoE Mixture-of-Experts long context 1M tokens reasoning models coding models

model releaseApril 24, 2026

DeepSeek V4 Flash Released: 284B Parameter MoE Model with 1M Context Window at $0.14 per Million Tokens

DeepSeek has released V4 Flash, a Mixture-of-Experts model with 284B total parameters and 13B activated parameters per request. The model supports a 1,048,576-token context window and is priced at $0.14 per million input tokens and $0.28 per million output tokens.

model releaseApril 23, 2026

Tencent Releases Hy3 Preview MoE Model with 262K Context and Three Reasoning Modes

Tencent has released Hy3 Preview, a Mixture-of-Experts model offering 262,144 token context window and three configurable reasoning modes (disabled, low, high) for production agentic workflows. The model is available for free through OpenRouter.

model releaseApril 24, 2026

DeepSeek Releases V4-Flash: 284B-Parameter MoE Model With 1M Token Context at 27% Inference Cost

DeepSeek released two Mixture-of-Experts models: V4-Flash with 284B total parameters (13B activated) and V4-Pro with 1.6T parameters (49B activated). Both models support one million token context windows and use a hybrid attention architecture that requires only 27% of the inference FLOPs compared to DeepSeek-V3.2 at 1M token context.

model releaseApril 24, 2026