NVIDIA Nemotron-3-Super-120B-A12B

NVIDIA (🇺🇸 United States)
active
Context window: 1000K tokens
Input / 1M tokens: $0.2
Output / 1M tokens: $0.2
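
As a rough illustration of how the listed per-1M-token prices translate into the cost of a single request, the sketch below assumes strictly linear per-token billing with no minimum charge, caching discount, or tiering; none of that is confirmed by the listing. The helper name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens, taken from the listing above.
INPUT_PRICE_PER_M = Decimal("0.2")
OUTPUT_PRICE_PER_M = Decimal("0.2")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: assumes linear per-token billing, no discounts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # A prompt of 200,000 tokens with a 4,000-token completion.
    print(estimate_cost_usd(200_000, 4_000))
```

Under these assumptions, a prompt that fills the entire 1000K-token context window would cost about $0.20 in input tokens.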

Version History

A12B-BF16 (major)

NVIDIA releases Nemotron-3-Super-120B-A12B-BF16, a 120-billion-parameter model with a latent mixture-of-experts (MoE) architecture for efficient text generation across 8 languages.

Benchmark Scores

AIME 2025: 90.0%
GPQA: 79.2%
MMLU-Pro: 83.7%
SWE-bench Verified: 60.5%

Coverage