Nemotron 3 Nano Omni

NVIDIA 🇺🇸 United States
active
Context window: 256K tokens

Version History

3-nano-omni (major)

First omni-modal release in the Nemotron 3 line, adding audio and video capabilities to the previous vision-language model. Uses a new 30B-A3B MoE architecture with a hybrid Mamba-Transformer design.

30b-a3b-reasoning (major)

Initial release of Nemotron 3 Nano Omni, a 30B-parameter multimodal MoE model designed as a perception sub-agent for enterprise systems. Features a hybrid Mamba-Transformer architecture with specialized video processing and extended reasoning capabilities.

Coverage

model release · NVIDIA

NVIDIA Releases Nemotron 3 Nano Omni: 30B-A3B Multimodal Model With 100+ Page Document Support

NVIDIA released Nemotron 3 Nano Omni, a 30B-A3B Mixture-of-Experts model that processes text, images, video, and audio. The model uses a hybrid Mamba-Transformer architecture with 128 experts and achieves 65.8 on OCRBenchV2-En and 72.2 on Video-MME, while delivering up to 9x higher throughput on multimodal tasks compared to alternatives.
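
The "30B-A3B" tag above follows a naming convention commonly used for Mixture-of-Experts models: the first number is the total parameter count and the number after "A" is the approximate count active per token. A minimal Python sketch of that reading follows. The helper name `parse_moe_tag` is hypothetical, and the assumption that the tag means 30B total with about 3B active is the conventional interpretation rather than something the summary spells out.

```python
import re


def parse_moe_tag(tag: str) -> dict:
    """Split a size tag such as '30B-A3B' into total and active parameters.

    Hypothetical helper for illustration only. Assumes the convention
    '<total>B-A<active>B', with both numbers in billions of parameters.
    """
    pattern = r"(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B"
    match = re.fullmatch(pattern, tag.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized MoE size tag: {tag!r}")

    total_b = float(match.group(1))   # total parameters, in billions
    active_b = float(match.group(2))  # parameters active per token, in billions
    return {
        "total_b": total_b,
        "active_b": active_b,
        "active_fraction": active_b / total_b,
    }


info = parse_moe_tag("30B-A3B")
print(info)
```

Under this reading, roughly one tenth of the weights take part in any single forward pass, which is the usual reason sparse MoE models are cheaper to run than dense models of the same total size.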
