Cursor releases Composer 2 at $0.50/$2.50 per 1M tokens, undercutting Claude and GPT-4 on pricing
Cursor released Composer 2, a code-specialized model priced at $0.50 per million input tokens and $2.50 per million output tokens—roughly 90% cheaper than Claude Opus 4.6 ($5.00/$25.00) and about 80% cheaper than GPT-5.4 ($2.50/$15.00). The model scores 61.3 on Cursor's internal CursorBench, competitive with Claude Opus 4.6 (58.2) but below GPT-5.4 Thinking (63.9).
Cursor Launches Code-Only Model to Break Pricing Dependency
Cursor released Composer 2, its second-generation code model, priced at $0.50 per million input tokens and $2.50 per million output tokens for the standard version. A faster variant costs $1.50/$7.50. Both versions undercut rival API pricing by substantial margins.
Pricing Comparison
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Composer 2 | $0.50 | $2.50 |
| Composer 2 Fast | $1.50 | $7.50 |
| Claude Opus 4.6 | $5.00 | $25.00 |
| GPT-5.4 (short context) | $2.50 | $15.00 |
| GPT-5.4 (long context) | $5.00 | $22.50 |
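The per-token prices above translate directly into workload costs. A minimal sketch of the arithmetic, using the table's prices (the 10M-input / 2M-output workload mix is a hypothetical example, not from the article):

```python
# Prices in USD per 1M tokens, taken from the pricing table above.
PRICES = {
    "Composer 2": (0.50, 2.50),
    "Composer 2 Fast": (1.50, 7.50),
    "Claude Opus 4.6": (5.00, 25.00),
    "GPT-5.4 (short context)": (2.50, 15.00),
}

def workload_cost(model: str, input_mtok: float, output_mtok: float) -> float:
    """Cost in USD for a workload measured in millions of tokens."""
    price_in, price_out = PRICES[model]
    return price_in * input_mtok + price_out * output_mtok

# Hypothetical agentic-coding workload: 10M input tokens, 2M output tokens.
composer = workload_cost("Composer 2", 10, 2)       # 0.50*10 + 2.50*2 = $10
opus = workload_cost("Claude Opus 4.6", 10, 2)      # 5.00*10 + 25.00*2 = $100
print(f"Composer 2: ${composer:.2f}  Opus 4.6: ${opus:.2f}  "
      f"savings: {1 - composer / opus:.0%}")        # 90% on this token mix
```

Note that the relative savings depend on the input/output mix: because both of Composer 2's prices are exactly 10% of Opus 4.6's, the discount against Opus is 90% at any mix, while against GPT-5.4 it varies between 80% (input-heavy) and about 83% (output-heavy).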
Performance Metrics
Composer 2 scores 61.3 on CursorBench, Cursor's internal coding benchmark—a 38% jump from Composer 1.5 (44.2) and competitive with Claude Opus 4.6 (58.2), though below GPT-5.4 Thinking (63.9).
Additional benchmarks show continued improvement across multiple evaluation frameworks:
| Model | CursorBench | Terminal Bench 2.0 | SWE-bench Multilingual |
|---|---|---|---|
| Composer 2 | 61.3 | 61.7 | 73.7 |
| Composer 1.5 | 44.2 | 47.9 | 65.9 |
| Claude Opus 4.6 | 58.2 | 58.0 | 77.8 |
| GPT-5.4 Thinking | 63.9 | 75.1 | N/A |
Cursor co-founder Aman Sanger told Bloomberg the model was trained exclusively on code data, enabling a smaller, cost-effective architecture. "It won't help you do your taxes. It won't be able to write poems," he said.
Training Approach
Quality gains came from stronger continued pretraining followed by reinforcement learning on long-horizon coding tasks—multi-step programming challenges requiring hundreds of individual actions. This approach drove the significant benchmark improvements over Composer 1.5 and Composer 1 (38.0 on CursorBench).
Strategic Necessity for Cursor
Building its own model addresses a structural dilemma: Cursor competes directly with Anthropic and OpenAI while depending on their APIs. As long as Cursor purchases third-party models, it faces pricing constraints its competitors don't—Anthropic and OpenAI can heavily subsidize their own products.
Cursor reportedly estimates a single Claude Code subscription at $200/month generates approximately $5,000 in compute costs for Anthropic. Consumer subscriptions at Cursor currently run at negative margins, with enterprise contracts providing profitability.
With over 1 million daily users and 50,000 enterprise customers, Cursor is discussing funding at a ~$50 billion valuation. As AI coding agents improve, the risk persists that users could bypass the IDE entirely and work directly with model providers—making Composer 2 essential to Cursor's long-term independence.
What This Means
Composer 2 represents a deliberate shift toward self-sufficiency. Cursor's pricing advantage is real but performance remains competitive rather than dominant. The code-only approach is pragmatic: narrower focus enables cheaper training and faster inference. Cursor's bet hinges on whether pricing and adequate performance can retain users against providers with deeper resources and broader models. The benchmark gap with GPT-5.4 Thinking suggests room for improvement, but SWE-bench performance (73.7) demonstrates practical engineering capability.