Mistral AI fine-tunes Pixtral-12B on satellite imagery, boosting classification accuracy from 56% to 91%

TL;DR

Mistral AI reports that fine-tuning its Pixtral-12B vision model on satellite imagery increased classification accuracy from 56% to 91% on the Aerial Image Dataset. The company used LoRA (Low-Rank Adaptation) to train on 8,000 samples for under $10, reducing hallucinations from 5% to 0.1%.

May 28, 2026 · 9:53 AM2 min read

Mistral AI fine-tunes Pixtral-12B on satellite imagery, boosting classification accuracy from 56% to 91%

Mistral AI reports that fine-tuning its Pixtral-12B vision model on satellite imagery increased classification accuracy from 56% to 91% on the Aerial Image Dataset, demonstrating how domain-specific adaptation can dramatically improve model performance on specialized tasks.

The experiment

Mistral used the publicly available Aerial Image Dataset (AID), splitting it into 8,000 training samples and 2,000 test samples across 30 scene categories. The categories included challenging distinctions like dense residential versus medium residential areas, and ambiguous terms like "center."

The base Pixtral-12B model, using only prompt engineering with a system prompt listing all target classes, achieved 56% accuracy. The model also hallucinated invalid class names 5% of the time, producing labels not in the original set.

Fine-tuning approach

Mistral applied Low-Rank Adaptation (LoRA), which injects small trainable matrices into the model's weights rather than retraining the entire model. According to Mistral, the fine-tuning required no extensive hyperparameter tuning and cost under $10 to complete.

The company used its fine-tuning API, training the model by providing correct labels for system prompts and input images. Mistral recommends starting with a single epoch and monitoring for overfitting risk.

Results

After fine-tuning on the 8,000 training samples:

Overall accuracy: 91% (up from 56%)
Hallucination rate: 0.1% (down from 5%)
Performance became more consistent across all 30 classes

Mistral highlighted a specific example where the base model confused "Playground" and "Stadium" categories, both classified incorrectly as "Stadium." The fine-tuned model learned to distinguish them based on features like the presence of seating surrounding sports fields.

Technical details

Mistral's LoRA implementation allows developers to adapt models without modifying full model weights. The technique proves particularly useful when prompt engineering produces inconsistent results or when dealing with complex, domain-specific visual patterns.

The company offers two fine-tuning options: direct API calls for granular hyperparameter control, or the LaPlateforme UI, which automatically computes optimal batch size based on dataset size.

What this means

This research demonstrates that vision-language models can achieve significant performance gains on specialized visual domains through relatively inexpensive fine-tuning. The 1.6x accuracy improvement on satellite imagery with just 8,000 samples suggests similar approaches could work for other underrepresented visual domains in VLM training sets, such as medical imaging, surveillance analysis, or manuscript transcription.

The under-$10 cost and single-epoch training make this approach accessible for organizations with limited budgets working on specialized computer vision tasks. Mistral has published a cookbook with full implementation details at github.com/mistralai/cookbook.

Source: mistral.ai ↗

Mistral AI Pixtral-12B fine-tuning LoRA computer vision satellite imagery vision-language models

researchJuly 6, 2026

AWS introduces rDPO unlearning technique to reduce false content moderation in Amazon Nova models by 53 percentage point

AWS has developed Reverse Direct Preference Optimization (rDPO), a novel unlearning technique that reduces over-deflection in Amazon Nova models by up to 53 percentage points. The approach allows organizations to selectively adjust content moderation safeguards while preserving general model capabilities through LoRA adapters.

model releaseJuly 3, 2026

Mistral Releases Leanstral 1.5: 6B-Parameter Model Achieves 100% on miniF2F, Solves 587/672 PutnamBench Problems

Mistral AI released Leanstral 1.5, a free Apache-2.0 licensed model with 119B total parameters and 6B active parameters specialized for formal verification in Lean 4. The model achieves 100% on miniF2F benchmark, solves 587 of 672 PutnamBench problems at $4 per problem (versus $300+ for competitors), and reaches state-of-the-art 87% on FATE-H and 34% on FATE-X benchmarks.

product updateJune 18, 2026

Mistral Adds 20+ MCP Connectors and Memory Features to Le Chat, All Free

Mistral released 20+ MCP-powered connectors for Le Chat, integrating tools like Databricks, Snowflake, GitHub, Stripe, and Asana. The update includes a memory feature that saves user preferences across conversations, with one-click import from ChatGPT. All features are available on the free plan.

researchJuly 8, 2026

NVIDIA Releases 10 Trillion Tokens of Open Agentic Training Data, Launches Interactive Prompt Atlas

NVIDIA has released over 10 trillion pre-training tokens and millions of post-training samples as part of its Nemotron open data initiative for building AI agents. The release includes the Nemotron Post-Training v3 Prompt Atlas, an interactive visualization tool, and Nemotron-Personas dataset representing 2.4 billion people across 10 countries.

Mistral AI fine-tunes Pixtral-12B on satellite imagery, boosting classification accuracy from 56% to 91%

Mistral AI fine-tunes Pixtral-12B on satellite imagery, boosting classification accuracy from 56% to 91%

The experiment

Fine-tuning approach

Results

Technical details

What this means

Related Articles

AWS introduces rDPO unlearning technique to reduce false content moderation in Amazon Nova models by 53 percentage point

Mistral Releases Leanstral 1.5: 6B-Parameter Model Achieves 100% on miniF2F, Solves 587/672 PutnamBench Problems

Mistral Adds 20+ MCP Connectors and Memory Features to Le Chat, All Free

NVIDIA Releases 10 Trillion Tokens of Open Agentic Training Data, Launches Interactive Prompt Atlas

Comments