OpenAI releases GPT-4o mini with 128K context at $0.15/$0.60 per 1M tokens
OpenAI released GPT-4o mini on July 18, 2024, a compact multimodal model with 128,000 token context window priced at $0.15 per million input tokens and $0.60 per million output tokens. The model achieves 82% on MMLU and claims to rank higher than GPT-4 on chat preference leaderboards while costing 60% less than GPT-3.5 Turbo.
GPT-4o mini — Quick Specs
OpenAI Releases GPT-4o mini with 128K Context and Aggressive Pricing
OpenAI introduced GPT-4o mini on July 18, 2024, positioning it as the company's most capable small model and direct successor to GPT-3.5 Turbo. The model arrives with significant cost reduction and expanded context handling.
Model Specifications
GPT-4o mini supports multimodal inputs, accepting both text and images while producing text outputs. The model features a 128,000 token context window—a 4x increase over GPT-3.5 Turbo's 32K limit.
Pricing starts at $0.15 per million input tokens and $0.60 per million output tokens. OpenAI claims this represents a 60% cost reduction compared to GPT-3.5 Turbo, making it significantly cheaper than other recent frontier models.
Performance Claims
GPT-4o mini achieves 82% on MMLU, OpenAI's benchmark of choice for measuring broad knowledge. According to the company, the model "presently ranks higher than GPT-4 on chat preferences common leaderboards," though specific leaderboard names and methodologies are not detailed in the launch materials.
OpenAI characterizes GPT-4o mini as maintaining "SOTA intelligence"—state-of-the-art reasoning—while delivering dramatic cost efficiency gains. The model represents a clear positioning strategy: maintain competitive performance on standard benchmarks while underpricing alternatives in the small-to-medium model category.
Market Context
GPT-4o mini arrives as major AI labs compete for developer adoption through aggressive pricing. The model sits between ultra-lightweight models (like GPT-3.5 Turbo) and OpenAI's flagship offerings, addressing the significant market segment where cost sensitivity and capability requirements intersect.
By July 2024, this pricing tier had become increasingly crowded. The aggressive unit economics suggest OpenAI prioritizes market share and API adoption over near-term margin optimization in this segment.
Deployment Status
GPT-4o mini is available through OpenAI's API and multiple third-party providers including OpenRouter, which routes requests across multiple backends for redundancy.
What This Means
GPT-4o mini signals OpenAI's confidence in its ability to scale multimodal models efficiently while maintaining performance parity with flagship systems. The 128K context window and aggressive 60% cost reduction versus GPT-3.5 Turbo create a compelling value proposition for production applications where both capability and cost matter. The MMLU benchmark alone (82%) does not definitively prove superiority over competitors' models at similar price points—additional benchmarks like HumanEval, GPQA, or math-specific tests would provide clearer differentiation. The claim that it "ranks higher than GPT-4 on chat preferences" requires scrutiny regarding methodology and whether those preference benchmarks correlate with real-world application quality.
Related Articles
Anthropic releases Fable 5, bringing capabilities of restricted Mythos model to public with $10/$50 per 1M token pricing
Anthropic has released Fable 5, making capabilities from its previously restricted Mythos model available to the public. The company claims Fable 5 beats GPT-5.5, Gemini 3.1 Pro, and its own Opus 4.8 in internal testing, with pricing set at $10 per million input tokens and $50 per million output tokens after a free trial period ending June 22.
MiniMax Releases M3: 428B-Parameter Multimodal Model with 1M Context Window and 15× Decode Speedup
MiniMax has released M3, a multimodal model with approximately 428 billion parameters and 23 billion activated parameters. The model supports a 1 million token context window and uses MiniMax Sparse Attention to achieve 9× prefill and 15× decode speedups compared to its predecessor M2.
Anthropic releases Claude Fable 5, first public Mythos-class model at $10/$50 per million tokens
Anthropic has released Claude Fable 5, its first publicly available Mythos-class model, at $10 per million input tokens and $50 per million output tokens—less than half the price of Claude Mythos Preview. The model includes safeguards that redirect sensitive queries to Claude Opus 4.8 in less than 5% of sessions.
Anthropic releases Claude Fable 5 with Mythos-class capabilities at $10/$50 per million tokens
Anthropic released Claude Fable 5, a Mythos-class model, to enterprise customers and paid subscribers two months after limiting its advanced Mythos model to select users. The new model costs $10 per million input tokens and $50 per million output tokens—twice the price of Claude Opus 4.8—and includes safeguards that block responses in high-risk areas like cybersecurity and biology.
Comments
Loading...