NVIDIA

24 articles tagged with NVIDIA

July 10, 2026

product update · NVIDIA

AWS Adds NVIDIA Nemotron 3 Nano (30B) and Super (120B) to SageMaker Serverless Fine-Tuning

Amazon SageMaker AI now supports serverless fine-tuning for NVIDIA Nemotron 3 Nano (30B parameters, 3B active) and Nemotron 3 Super (120B parameters, 12B active). The integration includes supervised fine-tuning, reinforcement learning with verifiable rewards (RLVR), and reinforcement learning from AI feedback (RLAIF).

July 10, 2026 · 3:50 PM

July 9, 2026

model release · NVIDIA

NVIDIA releases Nemotron-Labs-3-Puzzle-75B, compressed from 120B to 75B parameters with 2× throughput

NVIDIA has released Nemotron-Labs-3-Puzzle-75B-A9B, a compressed variant of Nemotron-3-Super that reduces the model from 120.7B total/12.8B active parameters to 75.3B total/9.3B active parameters. According to NVIDIA, the model achieves approximately 2× higher server throughput on a single 8×B200 node and increases sustainable 1M-token single-H100 concurrency from 1 request to 8 requests while maintaining strong accuracy across benchmarks.

July 9, 2026 · 5:06 AM

model release · NVIDIA

NVIDIA Releases Audex-30B-A3B: Unified Audio-Text Model With 1M Token Context and Speech Generation

NVIDIA released Audex-30B-A3B, a unified audio-text model built on the Nemotron-Cascade-2-30B-A3B backbone. The model handles audio understanding, speech recognition and translation, text-to-speech, audio generation, and speech-to-speech while supporting up to 1M token context length.

July 9, 2026 · 3:21 AM

July 4, 2026

model release · NVIDIA

NVIDIA releases Nemotron-Labs-TwoTower-30B: block-wise diffusion model claims 2.42× faster generation at 98.7% baseline

NVIDIA released Nemotron-Labs-TwoTower-30B-A3B-Base-BF16, a block-wise diffusion language model that generates text by denoising blocks of tokens in parallel rather than sequentially. According to NVIDIA, the model achieves 2.42× the wall-clock generation throughput of its autoregressive baseline while retaining 98.7% of aggregate benchmark quality.

July 4, 2026 · 7:51 AM

July 2, 2026

product update · Anthropic

Anthropic launches Claude Science beta with NVIDIA BioNeMo integration for life sciences research

Anthropic has launched the public beta of Claude Science, an AI workbench for scientific research that integrates NVIDIA's BioNeMo Agent Toolkit. The platform allows scientists to execute end-to-end research workflows using natural language commands to interact with digital agents.

July 2, 2026 · 2:50 PM

July 1, 2026

product update · NVIDIA

AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads

Amazon Bedrock now supports NVIDIA Nemotron and OpenAI GPT OSS models in AWS GovCloud (US) Regions. The launch includes OpenAI's GPT OSS models (120B and 20B parameters, 128K context) and the NVIDIA Nemotron 3 family (9B to 120B parameters, 1M context), providing government agencies with FedRAMP High and DoD SRG Level 5-compliant AI inference on U.S. soil.

July 1, 2026 · 6:21 PM

June 17, 2026

model release · Google DeepMind

NVIDIA Releases Quantized DiffusionGemma 26B: 1,100+ Tokens/Second with 256K Context Window

NVIDIA released a quantized version of Google DeepMind's DiffusionGemma 26B A4B IT, a multimodal model with 25.2B total parameters (3.8B active) that processes text, image, and video inputs. The NVFP4-quantized model achieves generation speeds exceeding 1,100 tokens per second on NVIDIA H100 GPUs while supporting a 256K token context window.

June 17, 2026 · 12:06 PM

June 4, 2026

model release · NVIDIA

NVIDIA Releases Nemotron 3.5 Content Safety: 4B-Parameter Multimodal Model with Custom Policy Enforcement and 140-Langua

NVIDIA has released Nemotron 3.5 Content Safety, a 4B-parameter model built on Google Gemma 3 4B IT that provides multimodal safety classification across approximately 140 languages. The model includes a 128K context window, custom enterprise policy enforcement, and auditable reasoning traces. NVIDIA is also releasing its training dataset.

June 4, 2026 · 7:06 PM

model release · NVIDIA

NVIDIA Nemotron 3 Ultra launches on AWS SageMaker with 550B parameters, 1M token context window

NVIDIA Nemotron 3 Ultra is now available on Amazon SageMaker JumpStart with 550 billion total parameters and 55 billion active parameters. The model features a hybrid Transformer-Mamba Mixture-of-Experts architecture and supports context windows up to 1 million tokens, targeting agentic AI workloads.

June 4, 2026 · 5:06 PM

model release · NVIDIA

NVIDIA Releases Nemotron 3.5 ASR: 600M-Parameter Streaming Speech Model for 40 Languages

NVIDIA released Nemotron 3.5 ASR, a 600M-parameter speech-to-text model supporting 40 language-locales from a single checkpoint. The model delivers its final transcript 0.07 seconds after speech ends and ranks 2nd in latency among streaming ASR models according to Artificial Analysis benchmarks.

June 4, 2026 · 1:06 PM

June 2, 2026

model release · NVIDIA

NVIDIA Releases Cosmos3-Super-Text2Image: 64B Parameter Model for Physical AI Applications

NVIDIA released Cosmos3-Super-Text2Image, a 64-billion parameter text-to-image generation model as part of its Cosmos3 collection of omnimodal world models. The model uses a Mixture-of-Transformers architecture combining autoregressive and diffusion transformers, designed for Physical AI applications including robotics and autonomous vehicles.

June 2, 2026 · 5:51 PM

May 22, 2026

model release · NVIDIA

NVIDIA releases Nemotron-Labs-Diffusion-14B with tri-mode decoding achieving 3.3x speed-up on GB200

NVIDIA released Nemotron-Labs-Diffusion-14B, a 14-billion parameter language model that supports three decoding modes by switching attention patterns during inference. The model achieves 850 tokens per second on GB200 hardware at concurrency 1, representing a 3.3x speed-up over standard autoregressive decoding and outperforming Qwen3-8B-Eagle3 by 2.2x in self-speculation mode.

May 22, 2026 · 6:51 PM

May 2, 2026

model release · NVIDIA

NVIDIA releases Nemotron-3-Nano-Omni-30B, a 31B-parameter multimodal model with 256K context and reasoning mode

NVIDIA released Nemotron-3-Nano-Omni-30B-A3B, a multimodal large language model with 31 billion parameters that processes video, audio, images, and text with up to 256K token context. The model uses a Mamba2-Transformer hybrid Mixture of Experts architecture and supports chain-of-thought reasoning mode.

May 2, 2026 · 9:06 PM

April 29, 2026

model release · NVIDIA

NVIDIA Releases Nemotron 3 Nano Omni: 31B Multimodal Model With 256K Context and Reasoning Mode

NVIDIA released Nemotron 3 Nano Omni, a 31B-parameter multimodal model (30B-A3B, with roughly 3B parameters active per token) supporting video, audio, image, and text inputs. The model features a 256K token context window, reasoning mode with chain-of-thought, and tool calling capabilities.

April 29, 2026 · 5:36 PM

model release · OpenAI

OpenAI releases GPT-5.5 with 82.7% Terminal-Bench score, API priced at $5/$30 per million tokens

OpenAI released GPT-5.5 on April 23, its first retrained base model since GPT-4.5, scoring 82.7% on Terminal-Bench 2.0 versus GPT-5.4's 75.1% and Claude Opus 4.7's 69.4%. API pricing is set at $5 per million input tokens and $30 per million output tokens, exactly double GPT-5.4 rates.

April 29, 2026 · 9:21 AM

model release · NVIDIA +1

NVIDIA Releases Nemotron 3 Nano Omni: 31B-Parameter Multimodal Model with 256K Context and Reasoning Mode

NVIDIA has released Nemotron 3 Nano Omni 30B-A3B, a multimodal large language model with 31 billion parameters using a Mamba2-Transformer hybrid Mixture of Experts architecture. The model supports video, audio, image, and text inputs with a 256K token context window and includes a dedicated reasoning mode with chain-of-thought capabilities.

April 29, 2026 · 2:51 AM

April 28, 2026

model release · NVIDIA

NVIDIA Nemotron 3 Nano Omni: 30B-parameter multimodal model launches on AWS SageMaker with 131K token context

NVIDIA has launched Nemotron 3 Nano Omni on Amazon SageMaker JumpStart, a multimodal model with 30 billion total parameters (3 billion active) that processes video, audio, images, and text in a single inference pass. The model features a 131K token context window and uses a Mamba2 Transformer Hybrid MoE architecture combining three specialized encoders.

April 28, 2026 · 4:51 PM

model release · NVIDIA

NVIDIA Releases Nemotron 3 Nano Omni: 30B-A3B Multimodal Model With 100+ Page Document Support

NVIDIA released Nemotron 3 Nano Omni, a 30B-A3B Mixture-of-Experts model that processes text, images, video, and audio. The model uses a hybrid Mamba-Transformer architecture with 128 experts and achieves 65.8 on OCRBenchV2-En and 72.2 on Video-MME, while delivering up to 9x higher throughput on multimodal tasks compared to alternatives.

April 28, 2026 · 4:06 PM

April 23, 2026

model release · OpenAI

OpenAI GPT-5.5 Powers Codex Coding Agent on NVIDIA GB200 Infrastructure

OpenAI has released GPT-5.5, its latest frontier model, according to NVIDIA. The model powers Codex, OpenAI's agentic coding application, running on NVIDIA GB200 NVL72 rack-scale systems.

April 23, 2026 · 7:05 PM

April 22, 2026

model release

Gemma 4 VLA runs locally on NVIDIA Jetson Orin Nano Super with 8GB RAM, autonomous webcam tool-calling

NVIDIA engineer Asier Arranz demonstrated Gemma 4 running as a vision-language agent (VLA) on a Jetson Orin Nano Super with 8GB RAM. The model autonomously decides when to access a webcam based on user queries, with no hardcoded triggers, and performs speech-to-text, vision analysis, and text-to-speech entirely locally.

April 22, 2026 · 3:51 PM

April 21, 2026

product update · NVIDIA

NVIDIA Releases 7 Million Synthetic Korean Personas Dataset for AI Agent Localization

NVIDIA released Nemotron-Personas-Korea, a dataset containing 7 million demographically accurate synthetic personas grounded in official Korean statistics from KOSIS, the Supreme Court of Korea, and the National Health Insurance Service. The dataset includes 26 fields per persona covering demographics, geography, and occupation across all 17 Korean provinces. It contains no personally identifiable information and is released under the CC BY 4.0 license.

April 21, 2026 · 12:51 AM

April 17, 2026

model release · NVIDIA

NVIDIA Releases GR00T N1.7, 3B-Parameter Open-Source Humanoid Robot Model Trained on 20,854 Hours of Human Video

NVIDIA released GR00T N1.7, a 3-billion parameter open-source Vision-Language-Action model for humanoid robots with commercial licensing. The model was trained on 20,854 hours of human egocentric video data and demonstrates the first documented scaling law for robot dexterity, where increasing human video data from 1,000 to 20,000 hours more than doubles task completion rates.

April 17, 2026 · 3:51 PM

March 23, 2026

model release · NVIDIA +1

NVIDIA releases Nemotron-3-Nano-4B, a 4B parameter model for edge AI with 262K context window

NVIDIA released Nemotron-3-Nano-4B-GGUF on March 16, 2026, a 4-billion parameter small language model (SLM) designed for edge deployment on devices like Jetson Thor and GeForce RTX. The model features a hybrid Mamba-2 and Transformer architecture with a 262K token context window and supports both reasoning and non-reasoning modes via system prompts.

March 23, 2026 · 3:36 PM

March 10, 2026

product update

ABB and NVIDIA partnership shows physical AI simulation driving factory automation ROI

ABB and NVIDIA have partnered to deploy physical AI simulation in factory automation, addressing the critical sim-to-real gap that has limited intelligent robotics deployment. The approach uses digital physics simulation to train models that transfer reliably to actual factory floors, reducing production hurdles and securing measurable ROI.

March 10, 2026 · 5:36 PM
