product updateNVIDIA

AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads

TL;DR

Amazon Bedrock now supports NVIDIA Nemotron and OpenAI GPT OSS models in AWS GovCloud (US) Regions. The launch includes OpenAI's GPT OSS models (120B and 20B parameters, 128K context) and NVIDIA Nemotron 3 family (9B to 120B parameters, 1M context), providing government agencies FedRAMP High and DoD SRG Level 5-compliant AI inference on U.S. soil.

July 1, 2026 · 6:21 PM2 min read

AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads

Amazon Web Services expanded Amazon Bedrock's model selection in AWS GovCloud (US) with NVIDIA Nemotron and OpenAI GPT OSS models, providing government agencies access to frontier open-weight models within FedRAMP High and DoD-compliant infrastructure.

Models and specifications

The launch includes two model families. OpenAI's GPT OSS models consist of a 120-billion parameter variant for production and high-reasoning tasks, and a 20-billion parameter model optimized for lower latency. Both provide 128K-token context windows and generate up to 16K output tokens.

NVIDIA Nemotron 3 family includes four variants:

Nemotron 3 Super 120B: 120 billion total parameters with mixture-of-experts architecture activating 12 billion parameters per token, claims 5x higher throughput than previous generation
Nemotron 3 Nano 30B: 30 billion parameters activating approximately 3 billion per token, claims 4x higher throughput
Nemotron 3 Nano 12B v2 and Nano 9B v2: smaller variants for efficient deployment

All Nemotron models support 1-million-token context windows.

Compliance and infrastructure

The models run on Amazon Bedrock's next-generation inference engine with zero operator access architecture. According to AWS, no operator from AWS, customers, or model providers can access inference prompts or completions. The infrastructure operates exclusively on U.S. soil, staffed by U.S. citizens.

AWS GovCloud (US) supports compliance frameworks including:

FedRAMP High (Provisional Authority to Operate)
DoD Cloud Computing Security Requirements Guide Impact Levels 2, 4, and 5
International Traffic in Arms Regulations (ITAR)
Criminal Justice Information Services (CJIS)

Deployment options

In-Region inference is available in us-gov-west-1 (AWS GovCloud US-West), keeping all requests within a single Region. Geographic Cross-Region inference routes requests across AWS GovCloud (US) Regions for higher throughput while maintaining data residency within the GovCloud boundary.

The service provides two API endpoints: bedrock-mantle offers OpenAI-compatible API access through the Chat Completions format, while bedrock-runtime uses AWS native Converse and InvokeModel APIs with Amazon Bedrock Guardrails integration.

Pricing and availability

AWS did not disclose pricing per million tokens. The models are available now in AWS GovCloud (US) Regions through Amazon Bedrock's serverless inference, requiring no GPU provisioning or infrastructure management.

What this means

This marks the first availability of OpenAI's open-weight models and NVIDIA's latest Nemotron family in a government-dedicated cloud environment. For defense and intelligence agencies, the combination of open-weight transparency, 1M+ token context windows, and FedRAMP High compliance enables AI deployment in classified environments previously restricted to on-premises systems. The zero operator access architecture addresses data sovereignty requirements while the serverless model eliminates the GPU procurement bottleneck that has slowed government AI adoption. Agencies now have a compliance-vetted path to deploy agentic workflows for intelligence synthesis, security log analysis, and contract review without custom infrastructure.

Source: aws.amazon.com ↗

AWS Amazon Bedrock NVIDIA OpenAI GovCloud Government AI FedRAMP DoD

product updateJuly 1, 2026

AWS adds metadata filtering to AgentCore Memory, improving agent retrieval accuracy from 40% to 64%

Amazon has added metadata filtering to its AgentCore Memory service for AI agents. In AWS evaluations across 151 questions, the feature improved overall question-answering accuracy from 40% to 64%, with context-dependent questions jumping from 16% to 69% accuracy. The update allows agents to filter memory retrieval by attributes like priority, department, or time range before semantic search runs.

product updateJuly 1, 2026

AWS to Release Anthropic's Claude Fable 5 on Bedrock with Cybersecurity Guardrails

Amazon Web Services announced it will make Anthropic's Claude Fable 5 models available on Bedrock starting tomorrow, featuring guardrails designed to prevent cybersecurity misuse. When guardrails are triggered, the system automatically falls back to Claude Opus 4.8.

product updateJune 30, 2026

AWS launches managed entitlements for Bedrock to distribute third-party model access across multi-account organizations

AWS has introduced managed entitlements for Amazon Bedrock, allowing organizations to subscribe to third-party models like Anthropic Claude and Cohere from a central account and distribute access across member accounts without requiring AWS Marketplace permissions. The feature uses AWS License Manager to create grants that share model entitlements with specific accounts or entire organizational units.

product updateJune 30, 2026

AWS enables fine-tuning of Amazon Nova models for email extraction, achieving 94.77% accuracy with 50% cost reduction

AWS released guidance on fine-tuning Amazon Nova Micro and Nova Lite models for automated email data extraction using SageMaker AI. In collaboration with Parcel Perform, the fine-tuned Nova Micro achieved 94.77% extraction accuracy—a 16.6 percentage point improvement—while reducing inference costs by 50% and latency by 30% compared to previous models.

AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads

AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads

Models and specifications

Compliance and infrastructure

Deployment options

Pricing and availability

What this means

Related Articles

AWS adds metadata filtering to AgentCore Memory, improving agent retrieval accuracy from 40% to 64%

AWS to Release Anthropic's Claude Fable 5 on Bedrock with Cybersecurity Guardrails

AWS launches managed entitlements for Bedrock to distribute third-party model access across multi-account organizations

AWS enables fine-tuning of Amazon Nova models for email extraction, achieving 94.77% accuracy with 50% cost reduction

Comments