Mistral Medium 3 launches at $0.4/$2 per million tokens, matching 90% of Claude 3.7 Sonnet performance
Mistral AI launched Mistral Medium 3 on May 7, 2025, priced at $0.4 per million input tokens and $2 per million output tokens. According to Mistral, the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks while costing 8x less, and it surpasses Llama 4 Maverick and Cohere Command A.
Pricing and deployment
Mistral Medium 3 is available immediately on Mistral La Plateforme and Amazon SageMaker, with support for IBM watsonx, NVIDIA NIM, Azure AI Foundry, and Google Cloud Vertex AI coming soon. The model can be self-hosted in environments with four or more GPUs.
The company claims the model beats DeepSeek V3 on cost for both API access and self-deployed systems, though it did not provide specific comparative figures.
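To make the per-token pricing concrete, the following is a minimal sketch of a cost estimate for a given workload. The two rates come from the announcement; the helper name `estimate_cost_usd` and the example token counts are illustrative assumptions and are not part of any Mistral SDK.

```python
from decimal import Decimal

# Rates from the announcement, in USD per one million tokens.
INPUT_USD_PER_MTOK = Decimal("0.4")
OUTPUT_USD_PER_MTOK = Decimal("2")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate API cost in USD for one workload (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Scale each token count to millions, then apply the matching rate.
    input_cost = Decimal(input_tokens) / TOKENS_PER_MTOK * INPUT_USD_PER_MTOK
    output_cost = Decimal(output_tokens) / TOKENS_PER_MTOK * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative workload (an assumption): 50M input and 10M output tokens.
    print(estimate_cost_usd(50_000_000, 10_000_000))
```

Because output tokens cost five times as much as input tokens, the output side can dominate the bill even when a workload is input-heavy. `Decimal` is used here only to avoid floating-point rounding in currency arithmetic.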
Performance claims
According to Mistral's internal benchmarks, Medium 3:
- Surpasses Llama 4 Maverick (open-weight model)
- Surpasses Cohere Command A (enterprise model)
- Reaches 90% or more of Claude Sonnet 3.7's performance in most categories
Mistral states the model "comes close to its very large and much slower competitors" in coding and STEM tasks, and that third-party human evaluations show "much better performance" in coding compared to larger models. Specific benchmark scores were not disclosed in the announcement.
Enterprise features
Mistral Medium 3 supports:
- Hybrid, on-premises, or in-VPC deployment
- Custom post-training and continuous pretraining
- Full fine-tuning capabilities
- Integration with enterprise knowledge bases
Beta customers in financial services, energy, and healthcare are testing the model for customer service, business process personalization, and complex dataset analysis, according to Mistral.
Model architecture
Mistral did not disclose parameter count, context window size, training data cutoff date, or detailed architecture specifications for Medium 3.
What this means
Mistral Medium 3 is priced aggressively for the enterprise AI model market. At $0.4/$2 per million tokens, it undercuts Claude Sonnet 3.7 by roughly 8x on Mistral's own account. If the claimed near-parity on benchmarks holds, that gap could pressure other providers to cut prices or to differentiate on capabilities beyond benchmarks.
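One way to quantify the size of the price gap is to divide a competitor's per-token rates by Medium 3's. The sketch below is illustrative only: the article does not state competitor prices, so the comparison rates of $3 input and $15 output per million tokens (Claude 3.7 Sonnet's published list price at the time) are an assumption, and `price_ratio` is a hypothetical helper.

```python
from fractions import Fraction

# Mistral Medium 3 rates from the announcement (USD per million tokens).
MEDIUM_3 = {"input": Fraction("0.4"), "output": Fraction(2)}

# ASSUMPTION: not stated in the article. Claude 3.7 Sonnet list prices
# as published at the time (USD per million tokens).
COMPARISON = {"input": Fraction(3), "output": Fraction(15)}


def price_ratio(ours: dict, theirs: dict) -> dict:
    """Return how many times more expensive `theirs` is, per token kind."""
    return {kind: theirs[kind] / ours[kind] for kind in ours}


ratios = price_ratio(MEDIUM_3, COMPARISON)
for kind, ratio in ratios.items():
    print(f"{kind}: {float(ratio)}x")
```

Under these assumed rates both ratios come out to 7.5, which is consistent with the "8x" figure being a rounded value.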
The emphasis on self-hosting and enterprise customization positions Medium 3 for organizations that require on-premises deployment or extensive fine-tuning, use cases where API-only models face adoption barriers. Mistral's teaser about a "large" model launch in the coming weeks suggests a three-tier lineup (Small, Medium, Large) to compete across market segments.
Without disclosed benchmark scores, parameter count, or context window, the claims cannot be independently verified at launch. Performance relative to Claude Sonnet 3.7 and Llama 4 Maverick will need external validation.