Mistral Medium 3 launches at $0.4/$2 per million tokens, matching 90% of Claude 3.7 Sonnet performance

TL;DR

Mistral AI launched Mistral Medium 3 on May 7, 2025, priced at $0.4 per million input tokens and $2 per million output tokens. The company claims the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks while being significantly less expensive, and surpasses Llama 4 Maverick and Cohere Command A.

May 28, 2026 · 9:38 AM2 min read

Mistral Medium 3 — Quick Specs

Compare Mistral Medium 3 with other models →

Mistral Medium 3 launches at $0.4/$2 per million tokens, matching 90% of Claude 3.7 Sonnet performance

Mistral AI launched Mistral Medium 3 on May 7, 2025, priced at $0.4 per million input tokens and $2 per million output tokens. According to Mistral, the model performs at or above 90% of Claude Sonnet 3.7 on benchmarks while costing 8x less.

Pricing and deployment

Mistral Medium 3 is available immediately on Mistral La Plateforme and Amazon Sagemaker, with IBM WatsonX, NVIDIA NIM, Azure AI Foundry, and Google Cloud Vertex support coming soon. The model can be self-hosted on environments with four GPUs or more.

The company claims the model beats DeepSeek v3 on pricing for both API and self-deployed systems, though specific comparative figures were not provided.

Performance claims

According to Mistral's internal benchmarks, Medium 3 surpasses:

Llama 4 Maverick (open source)
Cohere Command A (enterprise model)
Claude Sonnet 3.7 at 90% of performance in most categories

Mistral states the model "comes close to its very large and much slower competitors" in coding and STEM tasks, and that third-party human evaluations show "much better performance" in coding compared to larger models. Specific benchmark scores were not disclosed in the announcement.

Enterprise features

Mistral Medium 3 supports:

Hybrid, on-premises, or in-VPC deployment
Custom post-training and continuous pretraining
Full fine-tuning capabilities
Integration with enterprise knowledge bases

Beta customers in financial services, energy, and healthcare are testing the model for customer service, business process personalization, and complex dataset analysis, according to Mistral.

Model architecture

Mistral did not disclose parameter count, context window size, training data cutoff date, or detailed architecture specifications for Medium 3.

What this means

Mistral Medium 3 represents aggressive pricing in the enterprise AI model market, undercutting competitors by significant margins if the performance claims hold. At $0.4/$2 per million tokens with claimed near-parity to Claude Sonnet 3.7, this pricing could pressure other providers to reduce costs or differentiate on capabilities beyond benchmarks.

The emphasis on self-hosting and enterprise customization positions Medium 3 for organizations requiring on-premises deployment or extensive fine-tuning—use cases where API-only models face adoption barriers. Mistral's teaser about a "large" model launch in coming weeks suggests a three-tier lineup (Small, Medium, Large) to compete across market segments.

The lack of disclosed benchmark scores, parameter count, and context window makes independent verification of claims impossible at launch. Performance relative to Claude 3.7 and Llama 4 Maverick will need external validation.

Source: mistral.ai ↗

mistral-ai mistral-medium-3 model-release pricing enterprise-ai self-hosting benchmarks

model releaseJuly 9, 2026

OpenAI releases GPT-5.6 family in three sizes: Luna at $1/$6, Terra at $2.50/$15, Sol at $5/$30 per 1M tokens

OpenAI released its GPT-5.6 flagship model family in three sizes: Luna ($1/$6 per 1M tokens), Terra ($2.50/$15), and Sol ($5/$30). The company claims GPT-5.6 Sol scores 53.6 on the Agents' Last Exam benchmark, outperforming Claude Fable 5's score by 13.1 points.

model releaseJuly 9, 2026

OpenAI announces GPT-5.6 with three models (Sol, Terra, Luna) and ChatGPT Work agent tool

OpenAI released GPT-5.6 in three model tiers—Sol (flagship reasoning), Terra (mainstream), and Luna (instant)—positioning them against Anthropic's Claude models. The company claims GPT-5.6 Sol scores 53.6 on Agents' Last Exam, 13.1 points above Claude Fable 5, while completing tasks 61% faster. ChatGPT Work, a desktop productivity agent similar to Claude Cowork, launches simultaneously for Pro, Enterprise, and Edu users.

model releaseJuly 9, 2026

OpenAI Releases GPT-5.6 Luna Pro with Extended Reasoning Mode at $1/$6 Per Million Tokens

OpenAI has released GPT-5.6 Luna Pro, a reasoning-enhanced variant of GPT-5.6 Luna with a 1 million token context window. The model is priced at $1 per million input tokens and $6 per million output tokens, with a knowledge cutoff date of February 2026.

model releaseJuly 9, 2026

OpenAI Releases GPT-5.6 Sol Pro with Extended Reasoning Mode at $5 Input/$30 Output per 1M Tokens

OpenAI has released GPT-5.6 Sol Pro, a reasoning-enhanced variant of GPT-5.6 Sol designed for complex tasks. The model features a 1 million token context window, February 2026 knowledge cutoff, and is priced at $5 per 1M input tokens and $30 per 1M output tokens.

Mistral Medium 3 launches at $0.4/$2 per million tokens, matching 90% of Claude 3.7 Sonnet performance

Mistral Medium 3 — Quick Specs

Mistral Medium 3 launches at $0.4/$2 per million tokens, matching 90% of Claude 3.7 Sonnet performance

Pricing and deployment

Performance claims

Enterprise features

Model architecture

What this means

Related Articles

OpenAI releases GPT-5.6 family in three sizes: Luna at $1/$6, Terra at $2.50/$15, Sol at $5/$30 per 1M tokens

OpenAI announces GPT-5.6 with three models (Sol, Terra, Luna) and ChatGPT Work agent tool

OpenAI Releases GPT-5.6 Luna Pro with Extended Reasoning Mode at $1/$6 Per Million Tokens

OpenAI Releases GPT-5.6 Sol Pro with Extended Reasoning Mode at $5 Input/$30 Output per 1M Tokens

Comments