model release · DeepSeek

DeepSeek Releases V4-Pro-Base with 1.6 Trillion Parameters

TL;DR

DeepSeek has released DeepSeek-V4-Pro-Base, a 1.6 trillion parameter foundation model now available on Hugging Face. The weights are distributed in BF16 precision, and the checkpoint also uses the F8_E4M3, I64, and F32 tensor types.

April 24, 2026 · 4:50 AM · 1 min read

DeepSeek has released DeepSeek-V4-Pro-Base, a 1.6 trillion parameter foundation model now available on Hugging Face. The base model weights are distributed in BF16 precision format.

Technical Specifications

The model has 1.6 trillion parameters. Its checkpoint uses four tensor types: BF16 (bfloat16, a 16-bit floating-point format), I64 (64-bit integer), F32 (32-bit floating point), and F8_E4M3 (8-bit floating point with 4 exponent bits and 3 mantissa bits). The model files are distributed in the Safetensors format.
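To see how a per-dtype breakdown like the one above can be read from a Safetensors file, here is a minimal sketch using only the Python standard library. It relies on the published Safetensors layout (an 8-byte little-endian header length, followed by a JSON header describing each tensor). The tensor names `demo.weight` and `demo.scale` and the helper names are hypothetical and exist only to build a tiny in-memory file. Nothing here is taken from the DeepSeek-V4-Pro-Base checkpoint itself.

```python
import json
import struct
from collections import Counter
from math import prod


def tally_safetensors_header(raw: bytes) -> Counter:
    """Count tensor elements per dtype from the header of a Safetensors file."""
    # The first 8 bytes hold the header length as a little-endian unsigned 64-bit int.
    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8 : 8 + header_len])
    counts = Counter()
    for name, entry in header.items():
        if name == "__metadata__":  # optional free-form metadata, not a tensor
            continue
        # Element count is the product of the shape (1 for a scalar with shape []).
        counts[entry["dtype"]] += prod(entry["shape"])
    return counts


def build_demo_file() -> bytes:
    """Build a tiny, hypothetical Safetensors file in memory for illustration."""
    header = {
        "__metadata__": {"format": "pt"},
        "demo.weight": {"dtype": "BF16", "shape": [4, 8], "data_offsets": [0, 64]},
        "demo.scale": {"dtype": "F32", "shape": [4], "data_offsets": [64, 80]},
    }
    header_bytes = json.dumps(header).encode("utf-8")
    # 80 zero bytes stand in for the tensor data (32 * 2 bytes + 4 * 4 bytes).
    return struct.pack("<Q", len(header_bytes)) + header_bytes + bytes(80)


demo_counts = tally_safetensors_header(build_demo_file())
print(dict(demo_counts))  # {'BF16': 32, 'F32': 4}
```

For a real multi-file checkpoint, the same function could be applied to the leading bytes of each shard and the counters summed. That only requires reading each header, not the tensor data.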

As a base model, DeepSeek-V4-Pro-Base is designed for fine-tuning rather than direct deployment. No inference providers have yet added support for hosting the model.

Availability

The model is part of a 4-item collection on Hugging Face that has received 225 interactions. DeepSeek has not disclosed context window size, benchmark scores, or pricing information at the time of release.

The model card on Hugging Face does not include training cutoff dates, architecture details, or performance metrics. Download statistics for the first month are not yet available.

What This Means

At 1.6 trillion parameters, DeepSeek-V4-Pro-Base is one of the largest openly available foundation models. The "Pro-Base" designation suggests this is an untuned variant intended for research and custom fine-tuning rather than production use. The lack of inference provider support at launch and the limited documentation indicate an early-stage release aimed at the research community and at developers who will build instruction-tuned or task-specific variants. By parameter count it sits alongside other frontier-scale models, but no performance comparison is possible until benchmarks are published.
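To put the parameter count in concrete terms, the following back-of-the-envelope sketch estimates raw weight storage. It assumes, purely for illustration, that every one of the 1.6 trillion parameters is stored in a single dtype. The actual checkpoint mixes several tensor types, so its real size will differ. The helper name `weight_bytes` is hypothetical.

```python
# Bytes per element for the tensor types named in the article.
BYTES_PER_ELEMENT = {"BF16": 2, "F8_E4M3": 1, "F32": 4, "I64": 8}

PARAMS = 1_600_000_000_000  # 1.6 trillion, as stated in the article


def weight_bytes(num_params: int, dtype: str) -> int:
    """Raw storage for num_params elements, assuming one uniform dtype."""
    return num_params * BYTES_PER_ELEMENT[dtype]


for dtype in ("BF16", "F8_E4M3"):
    size = weight_bytes(PARAMS, dtype)
    print(f"{dtype}: {size / 1e12:.1f} TB")
# BF16: 3.2 TB
# F8_E4M3: 1.6 TB
```

Under these assumptions the weights alone occupy terabytes before any activations or optimizer state are counted.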

Source: huggingface.co

DeepSeek base-model open-source foundation-model 1.6T-parameters

Related Articles

model release · April 24, 2026

DeepSeek Releases V4-Flash-Base: 292B Parameter Base Model

DeepSeek has released V4-Flash-Base, a 292 billion parameter base model now available on Hugging Face. The model uses BF16, I64, F32, and F8_E4M3 tensor types and is distributed in Safetensors format.

model release · April 24, 2026

DeepSeek releases V4 model preview with agent optimization, pricing undisclosed

DeepSeek released a preview of its V4 large language model on April 24, 2026, available in 'pro' and 'flash' versions. The Hangzhou-based company claims the open-source model achieves strong performance on agent-based tasks and has been optimized for tools like Anthropic's Claude Code and OpenClaw.

model release · April 24, 2026

DeepSeek Releases V4-Flash: 284B-Parameter MoE Model With 1M Token Context at 27% Inference Cost

DeepSeek released two Mixture-of-Experts models: V4-Flash with 284B total parameters (13B activated) and V4-Pro with 1.6T total parameters (49B activated). Both models support one-million-token context windows and use a hybrid attention architecture that, at 1M-token context, requires only 27% of the inference FLOPs of DeepSeek-V3.2.

model release · April 24, 2026

DeepSeek Releases V4-Pro: 1.6T Parameter MoE Model with 1M Token Context

DeepSeek released two new Mixture-of-Experts models: DeepSeek-V4-Pro with 1.6 trillion parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated), both supporting a one-million-token context length. At 1M context, the models require 27% of the inference FLOPs and 10% of the KV cache of DeepSeek-V3.2, through a hybrid attention architecture that combines Compressed Sparse Attention and Heavily Compressed Attention.
