Mistral Releases Codestral Embed, Code-Specialized Embedding Model at $0.15 Per Million Tokens

TL;DR

Mistral AI has released Codestral Embed, its first code-specialized embedding model, priced at $0.15 per million tokens. The model features an 8192-token context window and claims to outperform Voyage Code 3, Cohere Embed v4.0, and OpenAI's large embedding model on code retrieval benchmarks.

June 18, 2026 · 8:40 AM · 2 min read

Codestral Embed — Quick Specs

Context window: 8K tokens

Input: $0.15/1M tokens

Mistral AI has released Codestral Embed, its first embedding model specialized for code retrieval and semantic understanding. The model is available via API under the identifier codestral-embed-2505 at $0.15 per million tokens, with a 50% discount available through Mistral's batch API.
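The pricing arithmetic can be sketched in a few lines. This is a rough estimate only: it uses the $0.15 per million tokens and 50% batch discount quoted above, while the function name and the corpus size are hypothetical, and actual billing may differ.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.15  # list price quoted in the article
BATCH_DISCOUNT = 0.50                # batch API discount quoted in the article


def estimate_embedding_cost(total_tokens: int, use_batch_api: bool = False) -> float:
    """Return the estimated cost in USD for embedding `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    cost = total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD
    if use_batch_api:
        # The batch API is billed at half the list price.
        cost *= 1 - BATCH_DISCOUNT
    return cost


if __name__ == "__main__":
    corpus_tokens = 200_000_000  # hypothetical corpus size, for illustration only
    print(f"standard: ${estimate_embedding_cost(corpus_tokens):.2f}")
    print(f"batch:    ${estimate_embedding_cost(corpus_tokens, use_batch_api=True):.2f}")
```

At these rates, a hypothetical 200-million-token corpus would cost about $30 through the standard API and about $15 through the batch API.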

Technical Specifications

Codestral Embed supports an 8192-token context window and offers flexible output configurations. The model can generate embeddings at different dimensions and precisions. Because the dimensions are ordered by relevance, keeping only the first n dimensions of a vector trades some retrieval quality for lower storage cost. According to Mistral, even at a reduced dimension of 256 with int8 precision, the model outperforms competing code embedding models.
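To make the dimension and precision trade-off concrete, here is a minimal client-side sketch of truncating a vector to its leading dimensions and storing it as int8. It is an illustration under assumptions, not Mistral's method: the announcement does not describe the quantization scheme, so the symmetric scaling to [-127, 127] and both function names are hypothetical.

```python
import math
from typing import List


def truncate_and_normalize(vec: List[float], dim: int) -> List[float]:
    """Keep the first `dim` values and rescale the result to unit length."""
    if not 0 < dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # all-zero vector: nothing to rescale
    return [x / norm for x in head]


def quantize_int8(vec: List[float]) -> List[int]:
    """Map values in [-1, 1] to integers in [-127, 127] (assumed scheme)."""
    return [max(-127, min(127, round(x * 127))) for x in vec]


if __name__ == "__main__":
    full = [0.3, -0.4, 0.12, 0.05]  # toy vector standing in for a real embedding
    small = quantize_int8(truncate_and_normalize(full, 2))
    print(small)
```

Under this scheme a 256-dimension int8 vector occupies 256 bytes, compared with 1,024 bytes for the same 256 dimensions stored as 32-bit floats.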

For retrieval use cases, Mistral recommends chunking datasets into 3000-character segments with 1000-character overlap rather than using the full context window, as larger chunks reportedly degrade retrieval performance.
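The recommended chunking can be sketched as a sliding window over characters. The 3000 and 1000 defaults come from the recommendation above; the function name and the handling of the final, shorter chunk are assumptions made for this example.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 3000, overlap: int = 1000) -> List[str]:
    """Split `text` into character chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    stride = chunk_size - overlap  # 2000 characters with the default settings
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += stride
    return chunks


if __name__ == "__main__":
    source = "x = 1\n" * 1500  # 9000 characters of placeholder code
    pieces = chunk_text(source)
    print(len(pieces), [len(p) for p in pieces])
```

With the defaults, each new chunk starts 2000 characters after the previous one, so every chunk repeats the last 1000 characters of its predecessor. Splitting on raw character offsets can cut through a function or statement; a production pipeline might prefer to align boundaries with lines or syntax nodes.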

Benchmark Performance

Mistral claims Codestral Embed outperforms Voyage Code 3, Cohere Embed v4.0, and OpenAI's large embedding model across multiple code retrieval benchmarks. The company highlights particularly strong performance on:

SWE-Bench Lite: Real-world GitHub issue resolution, identifying files requiring modification
CodeSearchNet: Code-to-code and documentation-to-code retrieval from GitHub repositories
Text2SQL tasks: Spider, WikiSQL, and synthetic SQL generation benchmarks
Algorithm challenges: APPS, CodeChef, MBPP+, and competitive programming problems
Data science: DS-1000 matching questions to implementations

Mistral evaluated the model across categories including SWE-Bench, code-to-code retrieval, Text2Code from GitHub data, Text2SQL, algorithmic problems, and data science tasks. Specific numerical scores were not disclosed in the announcement.

Use Cases

The company positions Codestral Embed for four primary applications:

Retrieval-augmented generation: Context retrieval for code completion, editing, and explanation in coding assistants and agent frameworks
Semantic code search: Natural language or code-based queries across documentation and development tools
Duplicate detection: Identifying functionally similar code segments for reuse optimization and license enforcement
Code analytics: Unsupervised clustering for repository analysis and architecture pattern identification
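Semantic search and duplicate detection both reduce to comparing embedding vectors. The sketch below ranks stored snippets by cosine similarity to a query vector. The three-dimensional vectors and snippet names are made-up stand-ins for real embeddings, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def rank_snippets(
    query_vec: List[float], index: Dict[str, List[float]], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return the `top_k` snippet names with the highest similarity to the query."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy index: in practice each vector would come from the embedding API.
    index = {
        "parse_json": [1.0, 0.0, 0.0],
        "load_yaml": [0.8, 0.6, 0.0],
        "draw_chart": [0.0, 0.0, 1.0],
    }
    for name, score in rank_snippets([1.0, 0.1, 0.0], index):
        print(f"{name}: {score:.3f}")
```

For duplicate detection, the same similarity score could be compared against a threshold chosen on labeled examples; for larger repositories, a vector database or approximate nearest-neighbor index would usually replace the linear scan.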

Availability

Codestral Embed is accessible through Mistral's API and batch API. On-premises deployments require direct contact with Mistral's applied AI team. The company has published documentation and a cookbook with examples for code agent retrieval implementations.

What This Means

Codestral Embed is Mistral's first embedding model specialized for code, and it competes directly with established embedding models from Voyage AI, Cohere, and OpenAI. The $0.15 per million token price is competitive on its face, though direct cost comparisons depend on what competitors charge for comparable dimensions and precision levels. The model's flexible dimension reduction could cut storage costs for applications where storage efficiency matters more than maximum retrieval accuracy. Its claimed performance on SWE-Bench Lite, a benchmark built from real GitHub issues, suggests potential utility for coding agents and automated software engineering tools, although the announcement did not include numerical scores.

Source: mistral.ai