AWS launches hyperparameter optimization guide for Amazon Nova Forge custom model training

TL;DR

AWS has published a technical guide on hyperparameter optimization for Amazon Nova Forge, its platform for building custom frontier models from Amazon Nova checkpoints. The guide addresses three core challenges: catastrophic forgetting during domain specialization, learning rate calibration when mixing proprietary and curated training data, and baseline performance constraints for reinforcement fine-tuning.

June 2, 2026 · 5:51 PM · 2 min read

Key capabilities and training pipeline

Amazon Nova Forge enables organizations to customize Amazon Nova models using three complementary techniques:

Continued pre-training (CPT) expands model knowledge through self-supervised learning on unlabeled domain text. Nova Forge offers three checkpoint options for CPT: pre-trained, mid-trained, and post-trained, each suited to different data scales and downstream requirements.

Supervised fine-tuning (SFT) customizes model behavior using 1,000–10,000 input-output demonstration pairs per task. According to AWS, quality and consistency matter more than volume. SFT with data mixing uses Amazon Nova-curated datasets in reasoning and instruction-following categories to preserve general capabilities.

Reinforcement fine-tuning (RFT) optimizes model outputs using reward signals. Nova Forge supports custom verification logic through AWS Lambda integration, enabling domain-specific quality assessment for single-turn or multi-turn conversational tasks.
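As a rough illustration of custom verification logic, the sketch below shows a reward function packaged as an AWS Lambda handler. The `lambda_handler(event, context)` signature is the standard Python Lambda entry point, but the event and response shapes (`samples`, `id`, `response`, `reference`, `rewards`) are assumptions made for this example and are not taken from the Nova Forge documentation. The exact-match scoring rule is likewise only a placeholder for domain-specific checks.

```python
import json


def score_response(response: str, reference: str) -> float:
    """Return 1.0 when the response matches the reference after
    lowercasing and collapsing whitespace, otherwise 0.0."""
    def normalize(text: str) -> str:
        return " ".join(text.lower().split())

    return 1.0 if normalize(response) == normalize(reference) else 0.0


def lambda_handler(event, context):
    # Hypothetical payload shape (an assumption, not the Nova Forge schema):
    # {"samples": [{"id": "...", "response": "...", "reference": "..."}]}
    rewards = []
    for sample in event.get("samples", []):
        rewards.append({
            "id": sample["id"],
            "reward": score_response(sample["response"], sample["reference"]),
        })
    # Return one reward per sample in a JSON-encoded body.
    return {"statusCode": 200, "body": json.dumps({"rewards": rewards})}
```

In practice the scoring function would encode whatever domain check defines a good answer, such as a unit test passing or a schema validating.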

Critical hyperparameter challenges

The guide identifies learning rate as the most sensitive hyperparameter across all customization techniques. According to AWS, deviating from service defaults when mixing Nova data with proprietary data is the most common source of training instability.

Data mixing ratios require careful calibration. The technique blends proprietary training data with Amazon Nova-curated datasets to prevent catastrophic forgetting, in which a model loses general capabilities after training on narrow domain data. AWS states that models can lose instruction-following ability, reasoning capability, and broad knowledge when this balance is incorrect.
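To make the idea of a mixing ratio concrete, here is a minimal sketch of blending two example lists so that curated examples make up a chosen fraction of the result. The function name `mix_datasets`, the `curated_fraction` parameter, and the 20% figure in the usage note are hypothetical; Nova Forge applies mixing through its own configuration, and this only illustrates the arithmetic.

```python
import random


def mix_datasets(proprietary, curated, curated_fraction, seed=0):
    """Blend all proprietary examples with a random sample of curated ones
    so that curated examples are roughly `curated_fraction` of the output."""
    if not 0.0 <= curated_fraction < 1.0:
        raise ValueError("curated_fraction must be in [0, 1)")
    rng = random.Random(seed)
    # Solve n_curated / (n_proprietary + n_curated) = curated_fraction.
    n_curated = round(curated_fraction * len(proprietary) / (1.0 - curated_fraction))
    # Cap at what is available; the realized fraction is then lower.
    n_curated = min(n_curated, len(curated))
    mixed = list(proprietary) + rng.sample(list(curated), n_curated)
    rng.shuffle(mixed)  # interleave so batches see both sources
    return mixed
```

For example, 80 proprietary examples with `curated_fraction=0.2` draws 20 curated examples, giving a 100-example training set. Too small a fraction risks forgetting; too large a fraction dilutes the domain signal.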

RFT is effective only within a band of baseline accuracy. If baseline accuracy is too low, the model produces too few correct responses for reward-guided learning to reinforce. If baseline accuracy is already high, additional training yields diminishing returns. For low-baseline scenarios, AWS recommends running SFT first to establish foundational capabilities before RFT.
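The decision described above can be sketched as a simple gate on measured baseline accuracy. The thresholds `low=0.2` and `high=0.9` are placeholders chosen for illustration only; the source summary does not state the actual ranges, so they should be replaced with values from the AWS guide or from experiments.

```python
def baseline_accuracy(is_correct):
    """Fraction of evaluation prompts the base model answered correctly."""
    if not is_correct:
        raise ValueError("need at least one evaluation result")
    return sum(1 for ok in is_correct if ok) / len(is_correct)


def recommend_next_step(accuracy, low=0.2, high=0.9):
    """Map baseline accuracy to a training recommendation.

    `low` and `high` are hypothetical thresholds, not AWS-published values.
    """
    if accuracy < low:
        # Too few correct responses for rewards to guide learning.
        return "run SFT first, then RFT"
    if accuracy > high:
        # Already strong; further RFT is likely to add little.
        return "RFT likely yields diminishing returns"
    return "proceed with RFT"
```

A usage pattern would be to score a held-out prompt set with the reward function, compute `baseline_accuracy`, and check the recommendation before spending compute on an RFT run.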

Training configuration

Nova Forge provides calibrated service defaults for each training technique that account for interactions between data distribution, mixing ratio, and training method. The platform supports secure hosting of custom models on AWS infrastructure.

Checkpoint selection determines how much existing alignment to preserve during customization. Combined with data mixing, this addresses the stability-flexibility tradeoff between learning organizational domain knowledge and retaining general model capabilities.

What this means

This guide signals AWS's positioning of Nova Forge as an enterprise-focused alternative to training custom models from scratch or using general-purpose fine-tuning APIs. The technical depth on catastrophic forgetting and learning rate sensitivity addresses real production failures that occur when organizations customize foundation models without adequate guardrails. The emphasis on service defaults and calibrated parameters suggests AWS is productizing lessons from internal model development to reduce trial-and-error costs for enterprise customers. Organizations evaluating custom model development now have documented approaches for the specific hyperparameter interactions that determine whether domain specialization succeeds or wastes compute.

Source: aws.amazon.com
