model releaseXiaomi

Xiaomi Launches MiMo-V2.5 With 1M Context Window at $0.40 per Million Input Tokens

TL;DR

Xiaomi released MiMo-V2.5 on April 22, 2026, a native omnimodal model with a 1,048,576 token context window. The model is priced at $0.40 per million input tokens and $2 per million output tokens, positioning it as a cost-efficient alternative for agentic applications requiring multimodal perception across image and video understanding.

2 min read
1

Xiaomi Launches MiMo-V2.5 With 1M Context Window at $0.40 per Million Input Tokens

Xiaomi released MiMo-V2.5 on April 22, 2026, a native omnimodal model featuring a 1,048,576 token context window priced at $0.40 per million input tokens and $2 per million output tokens.

Specifications and Pricing

MiMo-V2.5 offers:

  • Context window: 1,048,576 tokens (1M)
  • Input pricing: $0.40 per million tokens
  • Output pricing: $2 per million tokens
  • Release date: April 22, 2026

According to Xiaomi, the model delivers "Pro-level agentic performance at roughly half the inference cost" compared to unspecified alternatives, though the company has not provided independent benchmark scores to verify these claims.

Technical Capabilities

Xiaomi describes MiMo-V2.5 as a "native omnimodal model" designed for multimodal perception across image and video understanding tasks. The company claims the model surpasses its predecessor, MiMo-V2-Omni, in multimodal perception, though specific benchmark comparisons were not disclosed.

The 1M context window is designed to handle complete documents, extended conversations, and complex task contexts in a single inference pass. Xiaomi positions this capability as particularly suited for integration with agent frameworks.

Availability

The model is currently available through OpenRouter, which routes requests across multiple providers to optimize uptime and handle varying prompt sizes. OpenRouter supports the model's reasoning capabilities through a dedicated reasoning parameter that exposes step-by-step thinking processes via a reasoning_details array in API responses.

What This Means

MiMo-V2.5 enters an increasingly competitive omnimodal model market with a clear value proposition: extended context at lower input pricing than many enterprise-tier alternatives. At $0.40 per million input tokens, it undercuts several comparable models while offering a 1M context window—a specification typically reserved for premium tiers.

The focus on agentic workflows suggests Xiaomi is targeting developers building autonomous systems that require sustained reasoning across multimodal inputs. However, without published benchmark scores on standard evaluation sets like MMLU, VQAv2, or video understanding benchmarks, independent assessment of the model's claimed performance advantages remains difficult. The model's effectiveness will ultimately be determined by real-world deployment results in production agent systems.

Related Articles

model release

Alibaba's Qwen Releases Qwen3.7 Plus: 1M Context Window at $0.40 Per Million Input Tokens

Alibaba's Qwen has released Qwen3.7 Plus, a multimodal model with a 1 million token context window. The model accepts text and image input with text output, priced at $0.40 per million input tokens and $1.60 per million output tokens through OpenRouter's API.

model release

Ideogram 4: 9.3B parameter open-weight text-to-image model with native 2K resolution and structured JSON prompting

Ideogram has released Ideogram 4, its first open-weight text-to-image model with 9.3 billion parameters. The model supports native 2K resolution, structured JSON prompting with bounding-box layout controls, and is available in nf4 and fp8 quantizations under a non-commercial license.

model release

Microsoft releases MAI-Thinking-1, its first reasoning AI model trained without third-party distillation

Microsoft announced MAI-Thinking-1, its first advanced reasoning AI model, at Build 2026. The company claims it's a medium-sized model matching leading models on key software engineering benchmarks, trained from scratch without distillation from third-party models.

model release

NVIDIA Releases Nemotron-3-Ultra: 550B Parameter Model with 1M Token Context and Configurable Reasoning

NVIDIA released Nemotron-3-Ultra-550B-A55B-NVFP4, a 550B parameter model with 55B active parameters, featuring a 1M token context window and configurable reasoning mode. The model uses a hybrid LatentMoE architecture combining Mamba-2, Mixture-of-Experts, and Attention layers with Multi-Token Prediction, trained with NVIDIA's NVFP4 quantization-aware approach.

Comments

Loading...