
OpenAI releases GPT-4o mini with 128K context at $0.15/$0.60 per 1M tokens

TL;DR

OpenAI released GPT-4o mini on July 18, 2024, a compact multimodal model with a 128,000-token context window, priced at $0.15 per million input tokens and $0.60 per million output tokens. The model scores 82% on MMLU, and OpenAI claims it ranks higher than GPT-4 on chat preference leaderboards while costing 60% less than GPT-3.5 Turbo.


GPT-4o mini — Quick Specs

Context window: 128K tokens
Input: $0.15/1M tokens
Output: $0.60/1M tokens

OpenAI Releases GPT-4o mini with 128K Context and Aggressive Pricing

OpenAI introduced GPT-4o mini on July 18, 2024, positioning it as the company's most capable small model and the direct successor to GPT-3.5 Turbo. The model arrives with a significant cost reduction and an expanded context window.

Model Specifications

GPT-4o mini supports multimodal inputs, accepting both text and images while producing text outputs. The model features a 128,000-token context window, roughly an 8x increase over GPT-3.5 Turbo's 16K limit.
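A quick sketch of what the 128K window means in practice for prompt budgeting. The chars-per-token ratio below is a rough heuristic for English prose, not an exact tokenizer (use a real tokenizer such as tiktoken for precise counts); the reserved output budget is an illustrative assumption.

```python
# Rough context-budget check for GPT-4o mini's 128K window.
# CHARS_PER_TOKEN is a heuristic (~4 chars/token for English text),
# not an exact tokenizer count.

CONTEXT_WINDOW = 128_000  # tokens, per the launch specs
CHARS_PER_TOKEN = 4       # rough heuristic, assumption for illustration

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(prompt: str, reserved_for_output: int = 4_000) -> bool:
    """True if the prompt plus a reserved output budget fits the window."""
    return estimate_tokens(prompt) + reserved_for_output <= CONTEXT_WINDOW
```

By this estimate, roughly 500,000 characters of English text (around 350 pages) fit in a single request with room left for the reply.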

Pricing starts at $0.15 per million input tokens and $0.60 per million output tokens. OpenAI claims this represents a 60% cost reduction compared to GPT-3.5 Turbo, making it significantly cheaper than other recent frontier models.
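At these rates, per-request cost is simple arithmetic. A minimal sketch using the published prices:

```python
# Per-request cost at GPT-4o mini's published rates:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.

INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single API call at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 10,000-token prompt with a 1,000-token reply
# costs $0.0015 + $0.0006 = $0.0021 — about 476 such calls per dollar.
```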

Performance Claims

GPT-4o mini scores 82% on MMLU, a standard benchmark of broad knowledge across academic subjects. According to the company, the model currently ranks higher than GPT-4 on chat preference leaderboards, though the launch materials do not name the specific leaderboards or detail their methodologies.

OpenAI characterizes GPT-4o mini as maintaining "SOTA intelligence"—state-of-the-art reasoning—while delivering dramatic cost efficiency gains. The model represents a clear positioning strategy: maintain competitive performance on standard benchmarks while underpricing alternatives in the small-to-medium model category.

Market Context

GPT-4o mini arrives as major AI labs compete for developer adoption through aggressive pricing. The model sits between budget models such as GPT-3.5 Turbo and OpenAI's flagship offerings, addressing the large market segment where cost sensitivity and capability requirements intersect.

By July 2024, this pricing tier had become increasingly crowded. The aggressive unit economics suggest OpenAI prioritizes market share and API adoption over near-term margin optimization in this segment.

Deployment Status

GPT-4o mini is available through OpenAI's API and multiple third-party providers including OpenRouter, which routes requests across multiple backends for redundancy.
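For developers trying it out, a minimal sketch of a text+image request body follows OpenAI's documented Chat Completions message format. The prompt and image URL here are placeholders; actually sending the request requires an API key and an HTTP client or the `openai` Python package.

```python
# Sketch of a multimodal (text + image) request body for gpt-4o-mini,
# following the Chat Completions content-parts format. The URL is a
# placeholder for illustration.

def build_request(prompt: str, image_url: str) -> dict:
    """Assemble a text+image chat request for gpt-4o-mini."""
    return {
        "model": "gpt-4o-mini",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# To send (assumes `pip install openai` and OPENAI_API_KEY set):
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       **build_request("Describe this image.", "https://example.com/img.png"))
```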

What This Means

GPT-4o mini signals OpenAI's confidence in its ability to scale multimodal models efficiently while maintaining performance parity with flagship systems. The 128K context window and aggressive 60% cost reduction versus GPT-3.5 Turbo create a compelling value proposition for production applications where both capability and cost matter.

The MMLU benchmark alone (82%) does not definitively prove superiority over competitors' models at similar price points; additional benchmarks like HumanEval, GPQA, or math-specific tests would provide clearer differentiation. The claim that it ranks higher than GPT-4 on chat preferences also requires scrutiny regarding methodology and whether those preference benchmarks correlate with real-world application quality.
