MiniMax releases M2.7, a 229B parameter model with self-evolving capabilities and agent teams
MiniMax has released MiniMax-M2.7, a 229-billion parameter model that uniquely participates in its own evolution during development. The model achieves 66.6% medal rate on MLE Bench Lite and 56.22% on SWE-Pro benchmarks, with native support for multi-agent collaboration and complex tool orchestration.
MiniMax Releases M2.7: Open-Source 229B Model With Self-Evolution Capability
MiniMax has released MiniMax-M2.7, a 229-billion parameter open-source model that introduces a self-evolution cycle, allowing the model to autonomously improve its own learning process during development.
Model Self-Evolution
M2.7's defining feature is its participation in its own development. During training, the model autonomously updated its memory, built dozens of complex skills for reinforcement learning experiments, and improved its learning process based on experimental results. In one internal benchmark, an M2.7 instance optimized a programming scaffold over 100+ iterations—analyzing failure trajectories, modifying code, running evaluations, and deciding whether to keep or revert changes—achieving a 30% performance improvement.
Software Engineering and System-Level Reasoning
M2.7 demonstrates strong capabilities in production engineering tasks spanning log analysis, bug troubleshooting, refactoring, and security analysis. The model shows system-level reasoning across monitoring metrics, trace analysis, and root-cause verification. MiniMax reports reducing live production incident recovery time to under three minutes on multiple occasions using M2.7.
On software engineering benchmarks:
- SWE-Pro: 56.22% (matching GPT-5.3-Codex)
- SWE Multilingual: 76.5%
- Multi SWE Bench: 52.7%
- Terminal Bench 2: 57.0%
- NL2Repo: 39.8%
- VIBE-Pro: 55.6% (near Opus 4.6 parity)
On machine learning competitions, M2.7 achieved 66.6% medal rate on MLE Bench Lite (22 competitions), second only to Opus-4.6 and GPT-5.4 according to MiniMax's claims.
Multi-Agent Capabilities and Professional Work
M2.7 natively supports Agent Teams for multi-agent collaboration with stable role identity and autonomous decision-making. The model demonstrates capability in document editing (Word, Excel, PowerPoint) with high-fidelity multi-round editing and produces editable deliverables.
On professional work benchmarks:
- GDPval-AA: 1495 ELO score (highest among open-source models)
- Toolathon: 46.3% accuracy
- MM Claw end-to-end: 62.7% (close to Sonnet 4.6)
- MM Claw skill compliance: 97% across 40+ complex skills
Deployment and Access
MiniMax-M2.7 is available as open-source weights on Hugging Face (229B parameters, supporting F32, BF16, and F8_E4M3 tensor formats). The model can be deployed via SGLang, vLLM, Transformers, or NVIDIA NIM endpoints.
MiniMax also provides API access through MiniMax Agent (agent.minimax.io) and the MiniMax API platform (platform.minimax.io). The company offers an interactive demo called OpenRoom (openroom.ai) featuring real-time visual feedback and scene interactions.
Recommended inference parameters: temperature=1.0, top_p=0.95, top_k=40.
What This Means
M2.7 represents a shift toward models that can improve themselves during development—a capability that could accelerate iteration cycles for specialized tasks. The strong SWE and system engineering benchmarks position it competitively against closed-source reasoning models on production engineering work. The open-source release makes these capabilities accessible for self-hosted deployments, though claims about self-evolution achieving 30% improvements and MLE Bench performance warrant independent verification. The multi-agent framework and tool compliance metrics suggest MiniMax is targeting enterprise automation workflows alongside development tooling.
Related Articles
Tencent Releases Hy-MT2 Translation Models: 1.8B, 7B, and 30B-A3B Support 33 Languages
Tencent released Hy-MT2, a family of multilingual translation models available in 1.8B, 7B, and 30B-A3B (MoE) sizes. All models support translation among 33 languages and follow translation instructions in multiple languages. The 1.8B model can be compressed to 440MB using 1.25-bit AngelSlim quantization.
Tencent Releases Hy-MT2: 1.8B Translation Model Compressed to 440MB With 1.25-Bit Quantization
Tencent has open-sourced Hy-MT2, a family of multilingual translation models available in 1.8B, 7B, and 30B-A3B parameter sizes. The models support translation across 33 languages and include extreme quantization down to 1.25-bit, reducing the 1.8B model to 440MB storage while increasing inference speed by 1.5x.
Cohere Releases Command A+ Open Source Model with 25B Active Parameters, 128K Context
Cohere has released Command A+ as an open source model under Apache 2.0 license. The sparse mixture-of-experts architecture features 25 billion active parameters out of 218B total parameters, supports 128K input context length, and includes vision capabilities alongside tool use and reasoning features.
Cohere Releases Command A+: 218B-Parameter MoE Model With 4-Bit Quantization Runs on Single B200 GPU
Cohere has released Command A+, an open-source sparse mixture-of-experts model with 218 billion total parameters and 25 billion active parameters. The model features W4A4 quantization allowing deployment on a single Nvidia B200 GPU, supports 128K input context, and includes built-in chain-of-thought reasoning with vision capabilities.
Comments
Loading...