OpenAI releases GPT-4o mini with 128K context at $0.15/$0.60 per 1M tokens
OpenAI released GPT-4o mini on July 18, 2024, a compact multimodal model with a 128,000-token context window, priced at $0.15 per million input tokens and $0.60 per million output tokens. The model achieves 82% on MMLU, and OpenAI claims it ranks higher than GPT-4 on chat-preference leaderboards while costing 60% less than GPT-3.5 Turbo.
OpenAI introduced GPT-4o mini on July 18, 2024, positioning it as the company's most capable small model and direct successor to GPT-3.5 Turbo. The model arrives with significant cost reduction and expanded context handling.
Model Specifications
GPT-4o mini supports multimodal inputs, accepting both text and images while producing text outputs. The model features a 128,000-token context window, an 8x increase over GPT-3.5 Turbo's 16K limit.
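As a rough illustration of what a 128K window accommodates, the sketch below estimates whether a prompt plus reserved output fits, using an assumed average of ~4 characters per token (a heuristic only; exact counts require the model's tokenizer):

```python
# Rough token-budget check for a 128K-context model.
# The 4-chars-per-token ratio is an assumed heuristic, not a real tokenizer.
CONTEXT_WINDOW = 128_000
CHARS_PER_TOKEN = 4  # assumed average for English prose

def fits_in_context(prompt: str, max_output_tokens: int) -> bool:
    """Estimate whether prompt + reserved output fits in the window."""
    estimated_prompt_tokens = len(prompt) // CHARS_PER_TOKEN
    return estimated_prompt_tokens + max_output_tokens <= CONTEXT_WINDOW

# A 400,000-character document (~100K estimated tokens) plus 16K reserved
# output tokens still fits; a 500,000-character document does not.
print(fits_in_context("x" * 400_000, 16_000))  # True
print(fits_in_context("x" * 500_000, 16_000))  # False
```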
Pricing starts at $0.15 per million input tokens and $0.60 per million output tokens. OpenAI claims this represents a 60% cost reduction compared to GPT-3.5 Turbo, undercutting comparable small models from competitors.
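At those rates, per-request cost is simple arithmetic; a minimal sketch (the token counts in the example are illustrative, not from the launch materials):

```python
# Cost estimate at GPT-4o mini's listed rates:
# $0.15 per 1M input tokens, $0.60 per 1M output tokens.
INPUT_RATE = 0.15 / 1_000_000   # dollars per input token
OUTPUT_RATE = 0.60 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the dollar cost of a single API request."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Example: a 10,000-token prompt with a 1,000-token reply
# costs $0.0015 + $0.0006 = $0.0021.
print(f"${request_cost(10_000, 1_000):.4f}")  # $0.0021
```

At these prices, even a million such requests per month lands near $2,100, which is the unit-economics story the launch leans on.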
Performance Claims
GPT-4o mini achieves 82% on MMLU, OpenAI's benchmark of choice for measuring broad knowledge. The company also says the model currently ranks higher than GPT-4 on common chat-preference leaderboards, though the launch materials do not detail which leaderboards or what methodologies are involved.
OpenAI characterizes GPT-4o mini as maintaining "SOTA intelligence"—state-of-the-art reasoning—while delivering dramatic cost efficiency gains. The model represents a clear positioning strategy: maintain competitive performance on standard benchmarks while underpricing alternatives in the small-to-medium model category.
Market Context
GPT-4o mini arrives as major AI labs compete for developer adoption through aggressive pricing. The model sits between ultra-lightweight models and OpenAI's flagship offerings, replacing GPT-3.5 Turbo in the sizable market segment where cost sensitivity and capability requirements intersect.
By July 2024, this pricing tier had become increasingly crowded. The aggressive unit economics suggest OpenAI prioritizes market share and API adoption over near-term margin optimization in this segment.
Deployment Status
GPT-4o mini is available through OpenAI's API and multiple third-party providers including OpenRouter, which routes requests across multiple backends for redundancy.
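Calling the model through OpenAI's API is a standard chat-completions request. The sketch below builds one using only the standard library; the prompt, `max_tokens` value, and the `OPENAI_API_KEY` environment variable are illustrative assumptions, and actually sending the request requires a valid key and network access:

```python
import json
import os

# Build a chat-completions request body for gpt-4o-mini.
payload = {
    "model": "gpt-4o-mini",
    "messages": [
        {"role": "user", "content": "Summarize this release in one sentence."}
    ],
    "max_tokens": 100,  # illustrative cap on the reply length
}
headers = {
    "Authorization": f"Bearer {os.environ.get('OPENAI_API_KEY', '')}",
    "Content-Type": "application/json",
}
body = json.dumps(payload).encode()

# To actually send it (uncomment with a valid key and network access):
# import urllib.request
# req = urllib.request.Request(
#     "https://api.openai.com/v1/chat/completions", data=body, headers=headers
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
print(payload["model"])  # gpt-4o-mini
```

Routers like OpenRouter accept the same request shape with a different base URL and model identifier, which is what makes multi-backend redundancy straightforward.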
What This Means
GPT-4o mini signals OpenAI's confidence in its ability to scale multimodal models efficiently while maintaining performance parity with flagship systems. The 128K context window and the claimed 60% cost reduction versus GPT-3.5 Turbo create a compelling value proposition for production applications where both capability and cost matter.

That said, the 82% MMLU score alone does not definitively prove superiority over competitors' models at similar price points; additional benchmarks such as HumanEval, GPQA, or math-specific tests would provide clearer differentiation. The claim that it ranks higher than GPT-4 on chat preferences also warrants scrutiny regarding methodology and whether those preference leaderboards correlate with real-world application quality.