AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads
Amazon Bedrock now supports NVIDIA Nemotron and OpenAI GPT OSS models in AWS GovCloud (US) Regions. The launch includes OpenAI's GPT OSS models (120B and 20B parameters, 128K context) and NVIDIA Nemotron 3 family (9B to 120B parameters, 1M context), providing government agencies FedRAMP High and DoD SRG Level 5-compliant AI inference on U.S. soil.
AWS brings NVIDIA Nemotron and OpenAI GPT OSS models to GovCloud for secure government AI workloads
Amazon Web Services expanded Amazon Bedrock's model selection in AWS GovCloud (US) with NVIDIA Nemotron and OpenAI GPT OSS models, providing government agencies access to frontier open-weight models within FedRAMP High and DoD-compliant infrastructure.
Models and specifications
The launch includes two model families. OpenAI's GPT OSS models consist of a 120-billion parameter variant for production and high-reasoning tasks, and a 20-billion parameter model optimized for lower latency. Both provide 128K-token context windows and generate up to 16K output tokens.
NVIDIA Nemotron 3 family includes four variants:
- Nemotron 3 Super 120B: 120 billion total parameters with mixture-of-experts architecture activating 12 billion parameters per token, claims 5x higher throughput than previous generation
- Nemotron 3 Nano 30B: 30 billion parameters activating approximately 3 billion per token, claims 4x higher throughput
- Nemotron 3 Nano 12B v2 and Nano 9B v2: smaller variants for efficient deployment
All Nemotron models support 1-million-token context windows.
Compliance and infrastructure
The models run on Amazon Bedrock's next-generation inference engine with zero operator access architecture. According to AWS, no operator from AWS, customers, or model providers can access inference prompts or completions. The infrastructure operates exclusively on U.S. soil, staffed by U.S. citizens.
AWS GovCloud (US) supports compliance frameworks including:
- FedRAMP High (Provisional Authority to Operate)
- DoD Cloud Computing Security Requirements Guide Impact Levels 2, 4, and 5
- International Traffic in Arms Regulations (ITAR)
- Criminal Justice Information Services (CJIS)
Deployment options
In-Region inference is available in us-gov-west-1 (AWS GovCloud US-West), keeping all requests within a single Region. Geographic Cross-Region inference routes requests across AWS GovCloud (US) Regions for higher throughput while maintaining data residency within the GovCloud boundary.
The service provides two API endpoints: bedrock-mantle offers OpenAI-compatible API access through the Chat Completions format, while bedrock-runtime uses AWS native Converse and InvokeModel APIs with Amazon Bedrock Guardrails integration.
Pricing and availability
AWS did not disclose pricing per million tokens. The models are available now in AWS GovCloud (US) Regions through Amazon Bedrock's serverless inference, requiring no GPU provisioning or infrastructure management.
What this means
This marks the first availability of OpenAI's open-weight models and NVIDIA's latest Nemotron family in a government-dedicated cloud environment. For defense and intelligence agencies, the combination of open-weight transparency, 1M+ token context windows, and FedRAMP High compliance enables AI deployment in classified environments previously restricted to on-premises systems. The zero operator access architecture addresses data sovereignty requirements while the serverless model eliminates the GPU procurement bottleneck that has slowed government AI adoption. Agencies now have a compliance-vetted path to deploy agentic workflows for intelligence synthesis, security log analysis, and contract review without custom infrastructure.
Related Articles
AWS adds metadata filtering to AgentCore Memory, improving agent retrieval accuracy from 40% to 64%
Amazon has added metadata filtering to its AgentCore Memory service for AI agents. In AWS evaluations across 151 questions, the feature improved overall question-answering accuracy from 40% to 64%, with context-dependent questions jumping from 16% to 69% accuracy. The update allows agents to filter memory retrieval by attributes like priority, department, or time range before semantic search runs.
AWS to Release Anthropic's Claude Fable 5 on Bedrock with Cybersecurity Guardrails
Amazon Web Services announced it will make Anthropic's Claude Fable 5 models available on Bedrock starting tomorrow, featuring guardrails designed to prevent cybersecurity misuse. When guardrails are triggered, the system automatically falls back to Claude Opus 4.8.
AWS launches managed entitlements for Bedrock to distribute third-party model access across multi-account organizations
AWS has introduced managed entitlements for Amazon Bedrock, allowing organizations to subscribe to third-party models like Anthropic Claude and Cohere from a central account and distribute access across member accounts without requiring AWS Marketplace permissions. The feature uses AWS License Manager to create grants that share model entitlements with specific accounts or entire organizational units.
AWS enables fine-tuning of Amazon Nova models for email extraction, achieving 94.77% accuracy with 50% cost reduction
AWS released guidance on fine-tuning Amazon Nova Micro and Nova Lite models for automated email data extraction using SageMaker AI. In collaboration with Parcel Perform, the fine-tuned Nova Micro achieved 94.77% extraction accuracy—a 16.6 percentage point improvement—while reducing inference costs by 50% and latency by 30% compared to previous models.
Comments
Loading...