Xiaomi releases MiMo-V2-Pro with 1M context window and 1T+ parameters
Xiaomi released MiMo-V2-Pro on March 18, 2026, a flagship foundation model with over 1 trillion total parameters and a 1,048,576 token context window. The model is priced at $1 per million input tokens and $3 per million output tokens, positioning it as an agent-focused system comparable to top-tier models.
Technical Specifications
MiMo-V2-Pro is positioned as Xiaomi's flagship foundation model, designed primarily for agentic scenarios and complex workflow orchestration. The model features:
- Context window: 1,048,576 tokens (1M)
- Parameter count: Over 1 trillion total parameters
- Input pricing: $1 per million tokens
- Output pricing: $3 per million tokens
- Release date: March 18, 2026
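The published rates make per-request costs easy to estimate. A minimal sketch, using the listed $1/M input and $3/M output pricing; the token counts below are illustrative, not measured:

```python
# Estimate the USD cost of a single MiMo-V2-Pro request from the
# published OpenRouter-style pricing: $1 per million input tokens,
# $3 per million output tokens.

INPUT_PRICE_PER_M = 1.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 3.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A request that fills the full 1,048,576-token context window and
# generates 4,096 output tokens:
print(f"${request_cost(1_048_576, 4_096):.4f}")  # → $1.0609
```

Filling the entire 1M-token window therefore costs just over a dollar on the input side alone, which is why long-context agent runs are dominated by prompt-token spend.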
Benchmarks and Performance Claims
According to Xiaomi, MiMo-V2-Pro "ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6." The company does not publish specific benchmark scores, instead relying on comparative claims against existing models.
OpenRouter's usage data shows the model handling 319 billion prompt tokens, 1.03 billion completion tokens, and 476 million reasoning tokens in recent tracking periods.
Agentic Focus and Integration
MiMo-V2-Pro is explicitly optimized for agent frameworks, with Xiaomi citing OpenClaw compatibility as a key feature. The company describes the model as "designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably."
The model supports reasoning-enabled inference through OpenRouter, allowing access to step-by-step thinking processes via the reasoning parameter in API requests.
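A minimal sketch of what such a request could look like, assuming OpenRouter's documented `reasoning` request parameter and a hypothetical model slug (`xiaomi/mimo-v2-pro` is an assumption, not a confirmed identifier):

```python
import json

# Hedged example: constructing an OpenRouter chat-completions payload
# that requests reasoning (step-by-step thinking) output.
# The model slug below is assumed for illustration.
payload = {
    "model": "xiaomi/mimo-v2-pro",  # hypothetical slug
    "messages": [
        {"role": "user", "content": "Plan a three-step data pipeline."}
    ],
    # OpenRouter's reasoning parameter; exact options a provider
    # honors may vary.
    "reasoning": {"enabled": True},
}

# Sending it would require an API key, e.g.:
#   POST https://openrouter.ai/api/v1/chat/completions
#   Authorization: Bearer <OPENROUTER_API_KEY>
print(json.dumps(payload, indent=2))
```

Providers that support reasoning then return the model's intermediate thinking alongside the final completion, which is how OpenRouter's tracking distinguishes the 476 million reasoning tokens from ordinary completion tokens.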
Availability and Distribution
MiMo-V2-Pro is accessible through OpenRouter, which handles provider routing and fallback mechanisms to maximize uptime. OpenRouter normalizes API requests and responses across multiple provider implementations.
What This Means
Xiaomi enters the large foundation model market with competitive context window sizing (matching or exceeding most current offerings) and aggressive pricing on the output tier ($3/M output tokens). The focus on agent-optimized design suggests Xiaomi is targeting enterprise automation workflows rather than general-purpose chat applications. The 1M context window places MiMo-V2-Pro in the extended-context category used by models handling document analysis and multi-turn agent reasoning, though specific benchmark data remains unavailable for direct performance comparison. Xiaomi's entry signals continued fragmentation in the foundation model market, with regional players (Chinese tech companies like Xiaomi, Alibaba/Qwen, Baidu, Tencent) establishing independent model lines alongside US-based competitors.
Related Articles
Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens
Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576 token context window priced at $0.25 per million input tokens and $1.50 per million output tokens. The model supports text, image, video, audio, and PDF inputs with four thinking levels for cost-performance optimization.
Tencent Releases Hy3 Preview: Mixture-of-Experts Model with 262K Context and Configurable Reasoning
Tencent has released Hy3 preview, a Mixture-of-Experts model with a 262,144 token context window priced at $0.066 per million input tokens and $0.26 per million output tokens. The model features three configurable reasoning modes—disabled, low, and high—designed for agentic workflows and production environments.
Allen Institute releases EMO, 14B parameter MoE model with selective 12.5% expert use
Allen Institute for AI released EMO, a 1B-active, 14B-total-parameter mixture-of-experts model trained on 1 trillion tokens. The model uses 8 active experts per token from a pool of 128 total experts, and can maintain near full-model performance while using just 12.5% of its experts for specific tasks.
InclusionAI Releases Ring-2.6-1T: 1 Trillion Parameter Thinking Model with 63B Active Parameters
InclusionAI has released Ring-2.6-1T, a 1 trillion parameter-scale model with 63 billion active parameters and a 262,144-token context window. The model features adaptive reasoning modes and is designed for coding agents, tool use, and long-horizon task execution.