MiniMax Launches M3 Model With 1M Context Window at $0.30 Per Million Input Tokens

TL;DR

MiniMax has released M3, a multimodal foundation model supporting text, image, and video inputs with a 1-million-token context window. The model is available through OpenRouter at $0.30 per million input tokens and $1.20 per million output tokens, both at a 50% launch discount.

MiniMax M3: Quick Specs

Context window: 1M tokens
Input: $0.30 per 1M tokens
Output: $1.20 per 1M tokens

MiniMax has released M3, a multimodal foundation model that processes text, image, and video inputs with a 1-million-token context window. It is available through OpenRouter at $0.30 per million input tokens and $1.20 per million output tokens; both rates reflect a 50% launch discount.

Technical Architecture

M3 is built on MiniMax Sparse Attention (MSA), which replaces full attention mechanisms with KV-block selection. According to MiniMax, this reduces per-token compute costs to approximately 1/20th of the previous generation at 1M tokens while maintaining quality across most tasks. The company claims substantially faster prefill and decode speeds compared to traditional full attention models.
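
MiniMax has not published the details of MSA in this announcement, so the following is only a generic sketch of what KV-block selection can look like, not the company's implementation. It assumes a single query vector, scores each block of cached keys by the block's mean key (an illustrative choice), and runs softmax attention over the top-scoring blocks only. The function name, the mean-key scoring rule, and the `block_size` and `top_k` parameters are all hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def block_sparse_attention(q, keys, values, block_size=2, top_k=1):
    """Attend from one query to only the top_k highest-scoring KV blocks.

    Illustrative only: block scoring by mean key is an assumption,
    not MiniMax's published method.
    """
    d = len(q)
    n_blocks = len(keys) // block_size  # a trailing partial block is ignored

    # Step 1: score each block cheaply using the mean of its keys.
    block_scores = []
    for b in range(n_blocks):
        block = keys[b * block_size:(b + 1) * block_size]
        mean_key = [sum(col) / block_size for col in zip(*block)]
        block_scores.append(dot(q, mean_key))

    # Step 2: keep the top_k blocks, restored to their original order.
    ranked = sorted(range(n_blocks), key=lambda b: block_scores[b], reverse=True)
    chosen = sorted(ranked[:top_k])
    idx = [b * block_size + i for b in chosen for i in range(block_size)]

    # Step 3: ordinary scaled softmax attention, restricted to selected tokens.
    scores = [dot(q, keys[i]) / math.sqrt(d) for i in idx]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    out = [
        sum(w * values[i][j] for w, i in zip(weights, idx))
        for j in range(len(values[0]))
    ]
    return out, idx


# Four cached tokens in two blocks; the query matches the second block.
q = [1.0, 0.0]
keys = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]
out, idx = block_sparse_attention(q, keys, values, block_size=2, top_k=1)
```

In this toy setup the attention step touches `top_k * block_size` tokens instead of all of them, which is the general reason block selection lowers per-token compute as context grows. The block-scoring pass still scales with the number of blocks.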

The model was trained as a native multimodal system on interleaved data and fine-tuned using an interactive user-simulator framework designed for multi-turn, production-like collaboration.

Target Use Cases

MiniMax positions M3 for:

  • Long-horizon agentic workflows requiring sustained context
  • Coding tasks with large codebases
  • Tool use and function calling
  • Multi-step tasks requiring extended reasoning chains

The model outputs text only, despite accepting multimodal inputs.

Pricing Context

At current 50% discounted rates, M3's input pricing of $0.30 per million tokens undercuts most long-context competitors. For comparison:

  • Claude 3.5 Sonnet (200K context): $3.00 input / $15.00 output per million tokens
  • GPT-4 Turbo (128K context): $10.00 input / $30.00 output per million tokens
  • Gemini 1.5 Pro (2M context): $1.25 input / $5.00 output per million tokens (for prompts under 128K)

The model's full-price rates ($0.60 input / $2.40 output) would still position it as cost-competitive for extended context applications.
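
To make the comparison concrete, the short sketch below applies the per-million-token rates quoted above to one hypothetical request. The workload (100,000 input tokens and 2,000 output tokens) is an assumption chosen to fit inside every listed context window and under Gemini 1.5 Pro's 128K pricing tier; the function and dictionary names are illustrative.

```python
def request_cost(input_tokens, output_tokens, input_rate, output_rate):
    """Return the cost in USD; rates are USD per 1M tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# (input rate, output rate) in USD per 1M tokens, as quoted in the article.
RATES = {
    "MiniMax M3 (launch discount)": (0.30, 1.20),
    "MiniMax M3 (full price)": (0.60, 2.40),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "GPT-4 Turbo": (10.00, 30.00),
    "Gemini 1.5 Pro (prompts under 128K)": (1.25, 5.00),
}

# Hypothetical workload: one long prompt with a short answer.
INPUT_TOKENS = 100_000
OUTPUT_TOKENS = 2_000

costs = {
    name: request_cost(INPUT_TOKENS, OUTPUT_TOKENS, in_rate, out_rate)
    for name, (in_rate, out_rate) in RATES.items()
}

# Print cheapest first.
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.4f}")
```

Under these assumptions the request costs about $0.03 on M3 at the discounted rate and about $0.06 at full price, against roughly $0.14 on Gemini 1.5 Pro, $0.33 on Claude 3.5 Sonnet, and $1.06 on GPT-4 Turbo. Output-heavy workloads would shift the ratios, since each provider prices output tokens at a different multiple of input tokens.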

What This Means

M3 represents MiniMax's entry into the ultra-long-context market with a focus on cost efficiency through architectural innovation. The sparse attention approach directly addresses the computational bottleneck that has made million-token contexts prohibitively expensive for most applications. However, the model's quality at scale and its real-world performance on complex multimodal tasks remain to be validated by independent benchmarks. The emphasis on multi-step agentic workflows suggests MiniMax is targeting enterprise automation and developer tooling markets rather than consumer chatbot applications.
