Mistral Releases Mistral Large 3 with 675B Parameters and Three Ministral 3 Models Under Apache 2.0
Mistral AI has released Mistral 3, consisting of Mistral Large 3—a sparse mixture-of-experts model with 675B total parameters and 41B active parameters—and three Ministral 3 models at 3B, 8B, and 14B parameters. All models are released under the Apache 2.0 license with multimodal capabilities including image understanding.
Mistral Large 3 Technical Specifications
Mistral Large 3 is a sparse mixture-of-experts (MoE) model trained from scratch on 3,000 NVIDIA H200 GPUs. It has 675B total parameters, of which 41B are active for any given token, and it is Mistral's first MoE model since the Mixtral series.
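To put the two parameter counts in perspective, the short Python sketch below computes the share of weights that is active per token and a rough per-token compute estimate. The 2-FLOPs-per-active-parameter figure is a common rule of thumb for a forward pass, not a number from Mistral, and the function names are illustrative only.

```python
# Parameter counts are taken from the announcement; everything derived
# from them below is a back-of-envelope estimate, not a Mistral figure.
TOTAL_PARAMS = 675e9   # total parameters
ACTIVE_PARAMS = 41e9   # parameters active for a given token


def active_fraction(active: float, total: float) -> float:
    """Share of the model's weights that take part in one token's forward pass."""
    return active / total


def forward_flops_per_token(active: float) -> float:
    """Assumed rule of thumb: about 2 FLOPs per active parameter per token."""
    return 2.0 * active


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    flops = forward_flops_per_token(ACTIVE_PARAMS)
    print(f"active share of weights: {frac:.1%}")
    print(f"approx. forward FLOPs per token: {flops:.2e}")
```

Under these assumptions, roughly 6% of the weights do the work for each token, so per-token compute is closer to that of a 41B dense model, while all 675B parameters still have to be held in memory.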
According to Mistral AI, the model ranks #2 among open-source non-reasoning models on the LMArena leaderboard and #6 among open-source models overall. The company claims the instruction-tuned version achieves parity with the best instruction-tuned open-weight models on general prompts while demonstrating what it calls "best-in-class performance" on multilingual conversations in languages other than English and Chinese.
Both base and instruction fine-tuned versions are available under Apache 2.0. Mistral says a reasoning variant is coming soon.
Ministral 3 Series Details
The Ministral 3 series comes in three sizes: 3B, 8B, and 14B parameters. Each size is released in base, instruct, and reasoning variants, all with image understanding and all under Apache 2.0.
Mistral AI claims the 14B reasoning variant achieves 85% accuracy on AIME 2025. The company states that the instruct models "match or exceed the performance of comparable models while often producing an order of magnitude fewer tokens."
Infrastructure and Deployment
All Mistral 3 models were trained on NVIDIA Hopper GPUs with HBM3e memory. Mistral collaborated with NVIDIA, vLLM, and Red Hat to optimize deployment:
- Mistral Large 3 can run on a single 8×A100 or 8×H100 node using vLLM
- A checkpoint in NVFP4 format built with llm-compressor is available
- NVIDIA integrated Blackwell attention and MoE kernels for the sparse architecture
- Prefill/decode disaggregated serving and speculative decoding are supported on GB200 NVL72
- Ministral models are optimized for NVIDIA DGX Spark, RTX PCs, and Jetson edge devices
Inference support is enabled through TensorRT-LLM and SGLang for the complete model family.
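As a rough check on the single-node claim in the list above, the Python sketch below estimates how much memory the weights alone would need at different precisions and compares that with the capacity of an eight-GPU node. Several inputs are assumptions rather than figures from the announcement: 80 GB per GPU (A100 also ships in a 40 GB version), and about 4.5 effective bits per weight for NVFP4 (4-bit values plus one 8-bit scale per 16 weights). KV cache, activations, and runtime overhead are ignored, so real headroom is smaller.

```python
TOTAL_PARAMS = 675e9   # from the announcement
GPUS_PER_NODE = 8      # "8×A100 or 8×H100"
GPU_MEMORY_GB = 80     # assumption: 80 GB cards

# Assumed effective bits per weight, including scale overhead where relevant.
BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "fp8": 8.0,
    "nvfp4": 4.5,  # assumption: 4-bit values + one 8-bit scale per 16 weights
}


def weight_memory_gb(params: float, bits_per_weight: float) -> float:
    """Memory needed for the weights alone, in decimal gigabytes."""
    return params * bits_per_weight / 8 / 1e9


def fits_on_node(params: float, bits_per_weight: float,
                 gpus: int = GPUS_PER_NODE,
                 gb_per_gpu: float = GPU_MEMORY_GB) -> bool:
    """True if the weights alone fit in the node's combined GPU memory."""
    return weight_memory_gb(params, bits_per_weight) < gpus * gb_per_gpu


if __name__ == "__main__":
    capacity = GPUS_PER_NODE * GPU_MEMORY_GB
    print(f"node capacity: {capacity} GB")
    for name, bits in BITS_PER_WEIGHT.items():
        gb = weight_memory_gb(TOTAL_PARAMS, bits)
        print(f"{name}: {gb:.0f} GB, fits: {fits_on_node(TOTAL_PARAMS, bits)}")
```

Under these assumptions, only the 4-bit format leaves the weights comfortably under the node's 640 GB, which is one plausible reason the NVFP4 checkpoint matters for single-node deployment. With 40 GB A100 cards the same estimate would not fit.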
Availability
Mistral 3 is available immediately on Mistral AI Studio, Amazon Bedrock, Azure Foundry, Hugging Face, Modal, IBM WatsonX, OpenRouter, Fireworks, Unsloth AI, and Together AI. NVIDIA NIM and AWS SageMaker availability is listed as coming soon.
Pricing information has not been disclosed. Model documentation and research papers are available through Mistral AI's documentation hub and Hugging Face.
What This Means
With 675B total parameters and 41B active, Mistral Large 3 is one of the largest openly licensed MoE models available. The Apache 2.0 license carries none of the commercial-use restrictions that limit some other "open" models. The simultaneous release of smaller Ministral variants (3B to 14B) with reasoning capabilities addresses the growing demand for edge deployment and cost-efficient inference. Mistral's claims about multilingual performance and token efficiency still need independent verification before the models' competitive position can be confirmed.