Nemotron-3-Nano-Omni-30B-A3B

Name: Nemotron-3-Nano-Omni-30B-A3B
Author: NVIDIA

NVIDIA (🇺🇸 United States)

Status: active


Context window: 256K tokens

Version History

3-nano-omni · major · April 28, 2026

First omni-modal release in the Nemotron 3 line, adding audio and video capabilities to the previous vision-language model. It uses a new 30B-A3B MoE architecture with a hybrid Mamba-Transformer design.

30b-a3b-reasoning · major · April 28, 2026

Initial release of Nemotron 3 Nano Omni, a 30B-parameter multimodal MoE model designed as a perception sub-agent for enterprise systems. It features a hybrid Transformer-Mamba architecture with specialized video processing and extended reasoning capabilities.

Nemotron-3-Nano-Omni-30B-A3B-Reasoning · major · April 28, 2026

Initial release of Nemotron-3-Nano-Omni-30B-A3B, a multimodal MoE model with 31B parameters combining video, audio, image, and text understanding with reasoning capabilities.

30B A3B · major · April 28, 2026

Initial release of Nemotron 3 Nano Omni, a multimodal MoE model with 30B total parameters (3B active) combining video, audio, image, and text understanding in a single inference pass with 131K token context.

Benchmark Scores


GPQA: 46.9%

Speed: 323.0 tok/s

Coverage

model release · NVIDIA

NVIDIA releases Nemotron-3-Nano-Omni-30B, a 31B-parameter multimodal model with 256K context and reasoning mode

NVIDIA released Nemotron-3-Nano-Omni-30B-A3B, a multimodal large language model with 31 billion parameters that processes video, audio, images, and text with a context window of up to 256K tokens. The model uses a hybrid Mamba2-Transformer Mixture-of-Experts architecture and supports a chain-of-thought reasoning mode.

May 2, 2026 · 9:06 PM · 2 min read

NVIDIA Nemotron multimodal

model release · NVIDIA

NVIDIA Nemotron 3 Nano Omni: 30B-parameter multimodal model launches on AWS SageMaker with 131K token context

NVIDIA has launched Nemotron 3 Nano Omni on Amazon SageMaker JumpStart. It is a multimodal model with 30 billion total parameters (3 billion active) that processes video, audio, images, and text in a single inference pass. The model features a 131K token context window and uses a hybrid Mamba2-Transformer MoE architecture combining three specialized encoders.

April 28, 2026 · 4:51 PM · 2 min read

NVIDIA Nemotron multimodal

model release · NVIDIA

NVIDIA Releases Nemotron 3 Nano Omni: 30B-A3B Multimodal Model With 100+ Page Document Support

NVIDIA released Nemotron 3 Nano Omni, a 30B-A3B Mixture-of-Experts model that processes text, images, video, and audio. The model uses a hybrid Mamba-Transformer architecture with 128 experts and achieves 65.8 on OCRBenchV2-En and 72.2 on Video-MME, while delivering up to 9x higher throughput on multimodal tasks compared to alternatives.

April 28, 2026 · 4:06 PM · 2 min read

NVIDIA Nemotron multimodal