model releaseMistral AI

Mistral releases Leanstral, 6B-parameter open-source model for Lean 4 formal proof verification

TL;DR

Mistral AI released Leanstral, the first open-source code agent designed specifically for Lean 4 formal proof verification. The model uses 6B active parameters in a sparse 120B architecture and is available under Apache 2.0 license with free API access.

3 min read
0

Mistral releases Leanstral, 6B-parameter open-source model for Lean 4 formal proof verification

Mistral AI released Leanstral, the first open-source code agent designed specifically for Lean 4 formal proof verification. The model uses 6B active parameters in a sparse 120B total parameter architecture and is available under Apache 2.0 license with free API access.

Lean 4 is a proof assistant capable of expressing complex mathematical objects and software specifications. According to Mistral, Leanstral is trained for operating in realistic formal repositories rather than isolated mathematical problems.

Architecture and availability

Leanstral uses a highly sparse architecture with 120B total parameters and 6B active parameters. The model is available through three channels:

  • Open-source weights under Apache 2.0 license
  • Integration in Mistral Vibe (activated with /leanstall command)
  • Free API endpoint labs-leanstral-2603

Mistral states the free API endpoint will remain "highly accessible for a limited period" to gather feedback data. The company will release a technical report detailing training approaches and FLTEval, a new evaluation suite.

Benchmark performance

Mistral evaluated Leanstral on FLTEval, which tests completing formal proofs and defining mathematical concepts in pull requests to the Fermat's Last Theorem (FLT) project. The benchmark compares against Claude Opus 4.6, Sonnet 4.6, Haiku 4.5, and open-source models including Qwen3.5 397B-A17B, Kimi-K2.5 1T-A32B, and GLM5 744B-A40B.

According to Mistral:

  • Leanstral single pass: 21.9 score, $18 cost
  • Leanstral pass@2: 26.3 score, $36 cost (beats Sonnet 4.6's 23.7 at $549)
  • Leanstral pass@16: 31.9 score, $290 cost
  • Claude Opus 4.6: 39.6 score, $1,650 cost
  • Qwen3.5 397B-A17B: 25.4 score at pass@4
  • GLM5 744B-A40B: 16.6 score cap
  • Kimi-K2.5 1T-A32B: 20.1 score cap

Mistral used Mistral Vibe as the evaluation scaffold with no benchmark-specific modifications.

Demonstrated capabilities

Mistral provided two case studies. In the first, Leanstral diagnosed a breaking change in Lean 4.29.0-rc6 involving the rw tactic failing with type aliases. The model identified that def creates rigid definitions requiring explicit unfolding, blocking the tactic, and correctly proposed switching to abbrev for transparent aliases.

In the second case study, Leanstral converted program definitions from Rocq (from a Princeton CS441 course) to Lean 4, implementing custom notation and proving properties about programs in the language.

Integration features

Leanstral supports Model Context Protocol (MCP) through Mistral Vibe and was trained for optimal performance with lean-lsp-mcp. Users can access the model in Vibe by pressing Shift+Tab to cycle to Leanstral or using vibe --agent lean command.

What this means

Leanstral represents a significant efficiency advance in formal verification tooling. A 6B active parameter model matching or exceeding models with 17B-40B active parameters on formal proof tasks suggests architectural optimizations specific to proof verification may be more impactful than raw parameter count. The Apache 2.0 license and free API access lower barriers to formal verification adoption, though the model still trails Claude Opus 4.6 by 7.7 points at comparable cost levels. The shift from isolated competition math problems to full repository PR completion in FLTEval provides a more realistic benchmark for production formal verification workflows.

Related Articles

model release

Mistral AI Releases Small 4: 119B Parameter Open-Source Model with 256K Context Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B total parameter mixture-of-experts model with 256K context window and native multimodal capabilities. The model uses 128 experts with 4 active per token (6B active parameters) and is released under the Apache 2.0 license, marking Mistral's first unified model combining reasoning, multimodal, and coding capabilities.

model release

Mistral Releases Mistral Large 3 with 675B Parameters and Three Ministral 3 Models Under Apache 2.0

Mistral AI has released Mistral 3, consisting of Mistral Large 3—a sparse mixture-of-experts model with 675B total parameters and 41B active parameters—and three Ministral 3 models at 3B, 8B, and 14B parameters. All models are released under the Apache 2.0 license with multimodal capabilities including image understanding.

model release

Mistral AI Releases Voxtral: Apache 2.0 Speech Models with 32K Token Context at $0.001/Minute

Mistral AI released Voxtral, a family of open-source speech understanding models available in 24B and 3B parameter variants under Apache 2.0 license. The models support up to 32K token context (30 minutes of audio for transcription, 40 minutes for understanding) and are priced at $0.001 per minute via API—less than half the cost of comparable proprietary systems according to Mistral.

changelog

Mistral Releases Codestral 25.08 with 30% Higher Completion Acceptance, Ships Full Enterprise Coding Stack

Mistral AI released Codestral 25.08, showing 30% more accepted code completions and 10% higher retention rates. The company also shipped Devstral Small, a 24B-parameter agentic coding model scoring 53.6% on SWE-Bench Verified, alongside new embedding and IDE integration tools aimed at enterprise deployment.

Comments

Loading...