speech transcription

2 articles tagged with speech transcription

May 2, 2026
model releaseNVIDIA

NVIDIA releases Nemotron-3-Nano-Omni-30B, a 31B-parameter multimodal model with 256K context and reasoning mode

NVIDIA released Nemotron-3-Nano-Omni-30B-A3B, a multimodal large language model with 31 billion parameters that processes video, audio, images, and text with up to 256K token context. The model uses a Mamba2-Transformer hybrid Mixture of Experts architecture and supports chain-of-thought reasoning mode.

April 29, 2026
model releaseNVIDIA+1

NVIDIA Releases Nemotron 3 Nano Omni: 31B-Parameter Multimodal Model with 256K Context and Reasoning Mode

NVIDIA has released Nemotron 3 Nano Omni 30B-A3B, a multimodal large language model with 31 billion parameters using a Mamba2-Transformer hybrid Mixture of Experts architecture. The model supports video, audio, image, and text inputs with a 256K token context window and includes a dedicated reasoning mode with chain-of-thought capabilities.