Nemotron-3-Ultra-550B-A55B

Name: Nemotron-3-Ultra-550B-A55B
Author: NVIDIA

NVIDIA 🇺🇸 United States

active


Context window: 1000K tokens

Version History

550B-A55B-BF16 · major · June 4, 2026

Initial release of Nemotron-3-Ultra, a 550B-parameter model with a hybrid LatentMoE architecture, a 1M-token context window, and configurable reasoning capabilities, trained from December 2025 to April 2026.

Benchmark Scores


GPQA: 87.0%
MMLU-Pro: 86.8%
SWE-bench Verified: 71.9%

Coverage

model release · NVIDIA

NVIDIA releases Nemotron-3-Ultra: 550B parameter model with 1M token context and configurable reasoning

NVIDIA released Nemotron-3-Ultra-550B, a frontier-scale model with 550B total parameters (55B active) and a context window of up to 1M tokens. The model uses a hybrid LatentMoE architecture that combines Mamba-2, MoE, and attention layers with Multi-Token Prediction. It was trained from December 2025 to April 2026 using NVFP4 quantization-aware methods.

June 5, 2026 · 4:51 AM · 2 min read

nvidia nemotron moe