Xiaomi releases MiMo-V2-Pro with 1M context window and 1T+ parameters
Xiaomi released MiMo-V2-Pro on March 18, 2026, a flagship foundation model with over 1 trillion total parameters and a 1,048,576 token context window. The model is priced at $1 per million input tokens and $3 per million output tokens, positioning it as an agent-focused system comparable to top-tier models.
Technical Specifications
MiMo-V2-Pro is positioned as Xiaomi's flagship foundation model, designed primarily for agentic scenarios and complex workflow orchestration. The model features:
- Context window: 1,048,576 tokens (1M)
- Parameter count: Over 1 trillion total parameters
- Input pricing: $1 per million tokens
- Output pricing: $3 per million tokens
- Release date: March 18, 2026
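The published rates make per-request costs easy to estimate. A minimal sketch, using the listed $1/M input and $3/M output pricing; the token counts below are illustrative, not measured:

```python
# Estimate the USD cost of a single MiMo-V2-Pro request from the
# published OpenRouter-style pricing: $1 per million input tokens,
# $3 per million output tokens.

INPUT_PRICE_PER_M = 1.00   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 3.00  # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# A request that fills the full 1,048,576-token context window and
# generates 4,096 output tokens:
print(f"${request_cost(1_048_576, 4_096):.4f}")  # → $1.0609
```

Filling the entire 1M-token window therefore costs just over a dollar on the input side alone, which is why long-context agent runs are dominated by prompt-token spend.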
Benchmarks and Performance Claims
According to Xiaomi, MiMo-V2-Pro "ranks among the global top tier in the standard PinchBench and ClawBench benchmarks, with perceived performance approaching that of Opus 4.6." The company does not publish specific benchmark scores, instead relying on comparative claims against existing models.
OpenRouter's usage data shows the model handling 319 billion prompt tokens, 1.03 billion completion tokens, and 476 million reasoning tokens in recent tracking periods.
Agentic Focus and Integration
MiMo-V2-Pro is explicitly optimized for agent frameworks, with Xiaomi citing OpenClaw compatibility as a key feature. The company describes the model as "designed to serve as the brain of agent systems, orchestrating complex workflows, driving production engineering tasks, and delivering results reliably."
The model supports reasoning-enabled inference through OpenRouter, allowing access to step-by-step thinking processes via the reasoning parameter in API requests.
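A minimal sketch of what such a request could look like, assuming OpenRouter's documented `reasoning` request parameter and a hypothetical model slug (`xiaomi/mimo-v2-pro` is an assumption, not a confirmed identifier):

```python
import json

# Hedged example: constructing an OpenRouter chat-completions payload
# that requests reasoning (step-by-step thinking) output.
# The model slug below is assumed for illustration.
payload = {
    "model": "xiaomi/mimo-v2-pro",  # hypothetical slug
    "messages": [
        {"role": "user", "content": "Plan a three-step data pipeline."}
    ],
    # OpenRouter's reasoning parameter; exact options a provider
    # honors may vary.
    "reasoning": {"enabled": True},
}

# Sending it would require an API key, e.g.:
#   POST https://openrouter.ai/api/v1/chat/completions
#   Authorization: Bearer <OPENROUTER_API_KEY>
print(json.dumps(payload, indent=2))
```

Providers that support reasoning then return the model's intermediate thinking alongside the final completion, which is how OpenRouter's tracking distinguishes the 476 million reasoning tokens from ordinary completion tokens.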
Availability and Distribution
MiMo-V2-Pro is accessible through OpenRouter, which handles provider routing and fallback mechanisms to maximize uptime. OpenRouter normalizes API requests and responses across multiple provider implementations.
What This Means
Xiaomi enters the large foundation model market with competitive context window sizing (matching or exceeding most current offerings) and aggressive pricing on the output tier ($3/M output tokens). The focus on agent-optimized design suggests Xiaomi is targeting enterprise automation workflows rather than general-purpose chat applications. The 1M context window places MiMo-V2-Pro in the extended-context category used by models handling document analysis and multi-turn agent reasoning, though specific benchmark data remains unavailable for direct performance comparison. Xiaomi's entry signals continued fragmentation in the foundation model market, with regional players (Chinese tech companies like Xiaomi, Alibaba/Qwen, Baidu, Tencent) establishing independent model lines alongside US-based competitors.
Related Articles
Google releases Gemini 3.1 Flash Lite with 1M context at $0.25 per million input tokens
Google has released Gemini 3.1 Flash Lite, a high-efficiency multimodal model with a 1,048,576 token context window priced at $0.25 per million input tokens and $1.50 per million output tokens. The model supports text, image, video, audio, and PDF inputs with four thinking levels for cost-performance optimization.
Tencent Releases Hy3 Preview: Mixture-of-Experts Model with 262K Context and Configurable Reasoning
Tencent has released Hy3 preview, a Mixture-of-Experts model with a 262,144 token context window priced at $0.066 per million input tokens and $0.26 per million output tokens. The model features three configurable reasoning modes—disabled, low, and high—designed for agentic workflows and production environments.
Allen Institute releases EMO, 14B parameter MoE model with selective 12.5% expert use
Allen Institute for AI released EMO, a 1B-active, 14B-total-parameter mixture-of-experts model trained on 1 trillion tokens. The model uses 8 active experts per token from a pool of 128 total experts, and can maintain near full-model performance while using just 12.5% of its experts for specific tasks.
InclusionAI Releases Ring-2.6-1T: 1 Trillion Parameter Thinking Model with 63B Active Parameters
InclusionAI has released Ring-2.6-1T, a 1 trillion parameter-scale model with 63 billion active parameters and a 262,144-token context window. The model features adaptive reasoning modes and is designed for coding agents, tool use, and long-horizon task execution.