Grok 4

Provider: xAI (🇺🇸 United States)
Status: active

xAI's most capable model, with major reasoning and multimodal improvements over Grok 3 and an expanded 256K-token context window.

Context window: 256K tokens
Input price: $3 per 1M tokens
Output price: $15 per 1M tokens
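
As a rough illustration of what the listed prices mean per request, the following minimal sketch estimates cost from token counts. It assumes flat per-token billing at the rates shown ($3 per 1M input tokens, $15 per 1M output tokens) with no caching discounts, tiers, or other fees; the function and constant names are hypothetical and not part of any xAI API.

```python
# Assumed flat rates, copied from the prices listed on this card (USD per 1M tokens).
INPUT_USD_PER_MILLION = 3.00
OUTPUT_USD_PER_MILLION = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # A long prompt (200,000 input tokens) with a short answer (4,000 output tokens).
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")  # prints $0.66
```

Because output tokens cost five times as much as input tokens at these rates, long generations tend to dominate the bill even when the prompt is large.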

Version History

grok-4-beta (major)

Grok 4 launches as xAI's flagship model, with a 256K context window and a major leap in multimodal capability over Grok 3. The release claims top-tier performance on coding and reasoning benchmarks.

Benchmark Scores

AIME 2024: 93.0%
AIME 2025: 95.1%
DocVQA: 91.5%
GPQA: 83.0%
HumanEval: 96.7%
LiveCodeBench: 79.3%
MATH: 98.1%
MMLU: 94.0%
MMLU-Pro: 87.0%
MMMU: 79.8%
SWE-bench Verified: 81.0%
Speed: 28.0 tokens/sec