Grok 4
xAI, 🇺🇸 United States
xAI's most capable model. Major reasoning and multimodal improvements over Grok 3 with an expanded 256K context window.
Context window: 256K tokens
Input / 1M tokens: $3
Output / 1M tokens: $15
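As a rough illustration of how the listed prices ($3 per 1M input tokens, $15 per 1M output tokens) translate into a per-request cost, the arithmetic can be sketched as below. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the sketch assumes flat per-token pricing with no caching discounts, tiering, or other fees, none of which this listing describes.

```python
# Minimal sketch: estimate request cost from the listed per-1M-token prices.
# Assumption: flat pricing; the helper name and token counts are illustrative.

INPUT_USD_PER_MTOK = 3.00    # listed input price per 1M tokens
OUTPUT_USD_PER_MTOK = 15.00  # listed output price per 1M tokens


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 4,000-token reply.
    print(f"${estimate_cost_usd(200_000, 4_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these prices, long generations can dominate the bill even when the prompt is much larger than the reply.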
Version History
grok-4-beta (major)
Grok 4 launches as xAI's flagship with a 256K context window and a major leap in multimodal capability over Grok 3. xAI claims top-tier performance on coding and reasoning benchmarks.
Benchmark Scores
AIME 2024: 93.0%
AIME 2025: 95.1%
DocVQA: 91.5%
GPQA: 83.0%
HumanEval: 96.7%
LiveCodeBench: 79.3%
MATH: 98.1%
MMLU: 94.0%
MMLU-Pro: 87.0%
MMMU: 79.8%
Speed (tok/s): 28.0
SWE-bench Verified: 81.0%