model release

DeepSeek Releases V4-Flash-Base: 292B Parameter Base Model

TL;DR

DeepSeek has released V4-Flash-Base, a 292 billion parameter base model now available on Hugging Face. The model uses BF16, I64, F32, and F8_E4M3 tensor types and is distributed in Safetensors format.


DeepSeek V4-Flash-Base: 292B Parameter Base Model Released

DeepSeek has released V4-Flash-Base, a 292 billion parameter base model now available on Hugging Face. The model represents the base version of DeepSeek's V4-Flash series.

Technical Specifications

The model contains 292 billion parameters, and its checkpoint stores tensors in four dtypes: BF16 (bfloat16), I64 (64-bit integer), F32 (32-bit float), and F8_E4M3 (8-bit float in E4M3 format). Files are distributed in the Safetensors format, which provides safer serialization than traditional pickle-based formats.
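For readers unfamiliar with the format: a Safetensors file is just an 8-byte little-endian header length, a JSON header mapping each tensor name to its dtype, shape, and byte offsets, followed by the raw tensor data. Because the header is plain JSON, the dtypes of a checkpoint can be listed without loading any weights. The sketch below builds and parses a minimal file in memory using only the standard library; the tensor name and values are made up for illustration, not taken from the actual V4-Flash-Base checkpoint.

```python
import json
import struct

# Build a minimal Safetensors blob: 8-byte little-endian header length,
# then a JSON header describing each tensor, then the raw tensor bytes.
# "model.embed.weight" is a hypothetical tensor name for illustration.
header = {
    "model.embed.weight": {
        "dtype": "F32",
        "shape": [2, 2],
        "data_offsets": [0, 16],  # 4 floats * 4 bytes each
    },
}
header_bytes = json.dumps(header).encode("utf-8")
data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)
blob = struct.pack("<Q", len(header_bytes)) + header_bytes + data

# Parse it back: read the header length, decode the JSON, list dtypes.
(n,) = struct.unpack("<Q", blob[:8])
parsed = json.loads(blob[8 : 8 + n].decode("utf-8"))
dtypes = {name: info["dtype"] for name, info in parsed.items()}
print(dtypes)  # {'model.embed.weight': 'F32'}
```

The same header-only read is what the Hugging Face file viewer uses to report dtypes like BF16 and F8_E4M3 without downloading the full 292B-parameter checkpoint.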

Availability and Deployment

The model weights are available for download on Hugging Face as part of a collection containing 4 items. According to the Hugging Face listing, no inference providers currently support deployment of this model. The collection was last updated approximately 4 hours ago and has 307 downloads.

Missing Information

DeepSeek has not yet published a model card with detailed information about training data, benchmark performance, capabilities, or pricing. Context window size, training cutoff date, and specific use cases remain undisclosed. As a base model, V4-Flash-Base typically requires fine-tuning for specific tasks, unlike instruction-tuned variants.

What This Means

The release of a 292B parameter base model signals DeepSeek's continued development of large-scale models, though the lack of documentation makes technical evaluation impossible at this stage. The "Flash" designation suggests optimization for speed, consistent with other models in the industry using similar naming conventions. The use of F8_E4M3 tensor types indicates potential support for efficient inference through quantization. Without benchmark scores or a detailed model card, organizations should wait for complete documentation before considering deployment.
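To make the F8_E4M3 point concrete: an E4M3 value packs a sign bit, 4 exponent bits, and 3 mantissa bits into one byte, trading precision for a 4x memory saving over F32. The decoder below is a sketch of the commonly used OCP FP8 E4M3 (FN) layout, with exponent bias 7 and a single NaN pattern instead of infinities; it is an illustration of the format, not code from DeepSeek's stack.

```python
def decode_e4m3(byte: int) -> float:
    """Decode one FP8 E4M3 (FN variant) byte: 1 sign, 4 exponent, 3 mantissa bits."""
    s = (byte >> 7) & 0x1
    e = (byte >> 3) & 0xF
    m = byte & 0x7
    sign = -1.0 if s else 1.0
    if e == 0xF and m == 0x7:
        return float("nan")  # E4M3FN reserves only this pattern for NaN
    if e == 0:
        return sign * (m / 8) * 2.0 ** -6  # subnormal
    return sign * (1 + m / 8) * 2.0 ** (e - 7)  # normal, bias 7

print(decode_e4m3(0x38))  # 1.0  (sign 0, exponent 7, mantissa 0)
print(decode_e4m3(0x7E))  # 448.0, the largest finite E4M3FN value
```

With only 3 mantissa bits, values near 1.0 are spaced 0.125 apart, which is why FP8 weights are typically paired with per-tensor or per-block scaling factors during inference.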

Related Articles

model release

DeepSeek V4 Pro launches with 1.6T parameters at $1.74/M tokens, undercutting Claude Sonnet 4.6 by 42%

DeepSeek released two preview models: V4 Pro (1.6T total parameters, 49B active) and V4 Flash (284B total, 13B active), both with 1 million token context windows. V4 Pro is priced at $1.74/M input tokens and $3.48/M output—42% cheaper than Claude Sonnet 4.6—while V4 Flash at $0.14/$0.28 per million tokens undercuts all small frontier models.

model release

DeepSeek releases V4 model preview with agent optimization, pricing undisclosed

DeepSeek released a preview of its V4 large language model on April 24, 2026, available in 'pro' and 'flash' versions. The Hangzhou-based company claims the open-source model achieves strong performance on agent-based tasks and has been optimized for tools like Anthropic's Claude Code and OpenClaw.

model release

DeepSeek Releases V4-Pro-Base with 1.6 Trillion Parameters

DeepSeek has released DeepSeek-V4-Pro-Base, a 1.6 trillion parameter foundation model now available on Hugging Face. The base model uses BF16 precision for weights and includes support for F8_E4M3, I64, and F32 tensor types.

model release

DeepSeek Releases V4 Pro: 1.6T Parameter MoE Model with 1M Token Context at $1.74/M Input Tokens

DeepSeek has released V4 Pro, a Mixture-of-Experts model with 1.6 trillion total parameters and 49 billion activated parameters. The model supports a 1-million-token context window and costs $1.74 per million input tokens and $3.48 per million output tokens.
