Status: deprecated

OpenAI's flagship reasoning model with extended chain-of-thought. Strong on math and science.

Context window: 200K tokens
Input / 1M tokens: $15
Output / 1M tokens: $60
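
As a rough illustration of how the per-token rates above translate into a per-request cost, here is a minimal sketch. The helper name `estimate_cost_usd` and the constant names are hypothetical, and the calculation assumes a flat linear rate with no caching discounts, batch pricing, or other adjustments.

```python
# Rates taken from the pricing lines above (USD per 1M tokens).
INPUT_USD_PER_MILLION = 15
OUTPUT_USD_PER_MILLION = 60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated USD cost of a single request.

    Assumes simple linear pricing; any discounts or surcharges are ignored.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Sum in integer units first, then scale down by one million tokens.
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Example: 10,000 input tokens and 2,000 output tokens.
print(estimate_cost_usd(10_000, 2_000))  # 0.27
```

At these rates output tokens cost four times as much as input tokens, so long responses tend to dominate the bill.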

Version History

o1-2025-09-12 (minor)

o1 September update with more reliable tool calling and broader coverage of edge cases in mathematical reasoning.

o1-2024-12-17 (major)

o1 GA release with vision, function calling, and structured outputs. Sets new records on GPQA Diamond and AIME 2024.

Benchmark Scores

AIME 2024: 83.3%
AIME 2025: 83.3%
GPQA: 78.0%
HumanEval: 92.4%
LiveCodeBench: 72.3%
MATH: 94.8%
MMLU: 92.3%
MMLU-Pro: 75.7%
MMMU: 77.9%
Speed: 22.0 tok/s
SWE-bench Verified: 48.9%