Nemotron 3 Super

NVIDIA (United States)
Status: active
Context window: 1M tokens
Input: $0.10 / 1M tokens
Output: $0.50 / 1M tokens

Version History

120B (major)

NVIDIA releases Nemotron 3 Super, a 120B hybrid MoE model with a 1M-token context window, latent expert routing, and multi-token prediction. Fully open-weight under the NVIDIA Open License.
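The entry names latent expert routing without describing it. For orientation, here is a generic top-k MoE routing sketch in Python; the dimensions, router, and top-k rule are stock MoE assumptions, not NVIDIA's actual latent routing scheme.

```python
# Generic top-k mixture-of-experts routing sketch (illustrative only;
# the latent routing used by Nemotron 3 Super is not specified here).
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 16, 8, 2

router_w = rng.normal(size=(d_model, n_experts))
experts = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]

def route(token):
    """Score the token against every expert, keep the top_k, softmax the gates."""
    logits = token @ router_w                 # (n_experts,)
    chosen = np.argsort(logits)[-top_k:]      # indices of the selected experts
    gates = np.exp(logits[chosen] - logits[chosen].max())
    return chosen, gates / gates.sum()

token = rng.normal(size=d_model)
chosen, gates = route(token)

# Only the chosen experts execute, so the active parameters per token are a
# small fraction of the total: the same idea behind 12B active of 120B.
output = sum(g * (token @ experts[i]) for g, i in zip(gates, chosen))
print(chosen, output.shape)
```

The point of the pattern is that only the selected experts run for each token, which is how a 120B-parameter model can activate roughly 12B parameters at inference time.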

120B-A12B-NVFP4 (major)

NVIDIA releases Nemotron-3-Super-120B, a 120B-parameter model with a latent MoE architecture optimized for conversational tasks across 8 languages.

Coverage

model release · NVIDIA

NVIDIA releases Nemotron 3 Super: 120B MoE model with 1M-token context

NVIDIA has released Nemotron 3 Super, a 120-billion-parameter hybrid Mamba-Transformer Mixture-of-Experts model that activates only 12 billion parameters during inference. The open-weight model features a 1-million-token context window and multi-token prediction, and is priced at $0.10 per million input tokens and $0.50 per million output tokens.
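As a quick illustration of the listed rates, the sketch below computes the cost of a single request; the token counts are hypothetical.

```python
# Cost estimate at the listed Nemotron 3 Super rates (USD per 1M tokens).
INPUT_RATE = 0.10
OUTPUT_RATE = 0.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at the published per-million-token rates."""
    return input_tokens / 1e6 * INPUT_RATE + output_tokens / 1e6 * OUTPUT_RATE

# Hypothetical long-context request filling half the 1M-token window:
print(f"${request_cost(500_000, 2_000):.4f}")  # -> $0.0510
```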

model release · NVIDIA

NVIDIA releases Nemotron-3-Super-120B, a 120B-parameter model with a latent MoE architecture

NVIDIA has released Nemotron-3-Super-120B-A12B-NVFP4, a 120-billion-parameter text-generation model featuring a latent Mixture-of-Experts (MoE) architecture. The model supports 8 languages including English, French, Spanish, Italian, German, Japanese, and Chinese, and is available on Hugging Face with NVFP4 quantization support through NVIDIA's ModelOpt toolkit.
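A minimal loading sketch follows, assuming the checkpoint is published under the nvidia organization on Hugging Face and loads through the standard transformers API; the repository id and generation settings are assumptions, not confirmed details of the release.

```python
# Minimal sketch: loading the checkpoint from Hugging Face (repo id assumed).
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "nvidia/Nemotron-3-Super-120B-A12B-NVFP4"  # assumed, verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    device_map="auto",       # shard across available GPUs
    trust_remote_code=True,  # hybrid/custom architectures often need this
)

prompt = "Summarize the Nemotron 3 Super release in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```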
