Amazon Bedrock now supports fine-tuning for Nova models with three customization approaches
Amazon Bedrock now enables fine-tuning of Amazon Nova models using supervised fine-tuning (SFT), reinforcement fine-tuning (RFT), and model distillation. The service automates infrastructure provisioning and training orchestration, requiring only data upload to S3 and a single API call. Fine-tuned models run on-demand at standard inference pricing without provisioned capacity requirements.
Amazon has announced fine-tuning capabilities for Amazon Nova models through Amazon Bedrock, enabling customers to customize models for domain-specific tasks without deep machine learning expertise.
Three Customization Approaches
Bedrock supports three fine-tuning techniques:
Supervised Fine-tuning (SFT): Trains models on labeled input-output examples, embedding domain knowledge directly into model weights.
Reinforcement Fine-tuning (RFT): Uses reward functions—either custom code or an LLM acting as judge—to guide learning toward target behaviors.
Model Distillation: Transfers knowledge from larger teacher models into smaller, faster student models for resource-constrained environments.
All three approaches use parameter-efficient fine-tuning (PEFT), which needs less memory and training time than full fine-tuning while maintaining model quality.
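Of the three techniques, the "custom code" reward function used in RFT is the easiest to picture as code. The following is a minimal sketch for an intent-classification task. The function name, the partial-credit rule, and the plain-string signature are all assumptions made for illustration; the article does not describe the interface Bedrock actually expects for reward code.

```python
def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def intent_reward(model_output: str, expected_intent: str) -> float:
    """Hypothetical reward: score a model response against a labeled intent.

    1.0 -> the response is exactly the expected label
    0.5 -> the label appears somewhere inside a longer response
    0.0 -> the label is missing
    """
    output = _normalize(model_output)
    target = _normalize(expected_intent)
    if output == target:
        return 1.0
    if target and target in output:
        return 0.5
    return 0.0
```

Giving partial credit to verbose answers that still contain the right label is a design choice: it steers training toward terse, exact outputs without treating a correct but wordy response as a complete failure.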
Supported Models
Amazon Nova 2 Lite and Nova Micro support fine-tuning. Nova 2 Lite is a multimodal model with a 1-million token context window, processing text, images, and video for document processing, video understanding, and code generation. Nova Micro, the smallest in the lineup, targets low-cost inference for pipeline processing tasks like data extraction and address fixing.
Implementation and Pricing
Amazon Bedrock automates the entire training pipeline. Users upload training data to Amazon S3 and initiate the job via AWS Management Console, CLI, or API. The service manages infrastructure provisioning, compute allocation, and training orchestration—no cluster configuration required.
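As a rough sketch of the API path, the job can be submitted through the boto3 `bedrock` client's `create_model_customization_job` call. The helper names, default values, and region below are illustrative assumptions. The hyperparameter names are the ones mentioned in this article; the keys a given base model accepts, and the base model identifier itself, should be taken from the Bedrock documentation rather than from this example.

```python
from typing import Dict


def build_customization_request(
    job_name: str,
    custom_model_name: str,
    role_arn: str,
    base_model_id: str,
    training_s3_uri: str,
    output_s3_uri: str,
    epoch_count: int = 2,
    learning_rate_multiplier: float = 1.0,
) -> Dict[str, object]:
    """Assemble keyword arguments for create_model_customization_job."""
    for uri in (training_s3_uri, output_s3_uri):
        if not uri.startswith("s3://"):
            raise ValueError(f"expected an s3:// URI, got {uri!r}")
    return {
        "jobName": job_name,
        "customModelName": custom_model_name,
        "roleArn": role_arn,
        "baseModelIdentifier": base_model_id,
        "customizationType": "FINE_TUNING",
        "trainingDataConfig": {"s3Uri": training_s3_uri},
        "outputDataConfig": {"s3Uri": output_s3_uri},
        # The API takes hyperparameters as a string-to-string map.
        # Key names here follow the article and may differ per base model.
        "hyperParameters": {
            "epochCount": str(epoch_count),
            "learningRateMultiplier": str(learning_rate_multiplier),
        },
    }


def submit(request: Dict[str, object], region: str = "us-east-1") -> str:
    """Submit the job and return its ARN. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the AWS SDK

    client = boto3.client("bedrock", region_name=region)
    response = client.create_model_customization_job(**request)
    return response["jobArn"]
```

Keeping request construction separate from submission means the payload can be inspected or unit-tested without touching an AWS account.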
Fine-tuned models run on-demand at the same inference pricing as non-customized versions, with no provisioned capacity requirement. By contrast, serving a customized model has traditionally meant paying for expensive Provisioned Throughput.
Performance Gains
Amazon's internal testing demonstrated measurable improvements. Amazon Customer Service customized Nova Micro for specialized support, improving accuracy by 5.4% on domain-specific issues and 7.3% on general issues while reducing latency.
Fine-tuning avoids the token overhead of prompt engineering and Retrieval-Augmented Generation (RAG), both of which supply context at inference time and therefore pay for that context on every request. Context-based techniques can be deployed immediately and updated dynamically, but fine-tuning embeds the knowledge in the model itself, which reduces cumulative token costs and improves generalization to novel phrasings and edge cases.
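The token overhead is simple arithmetic. The sketch below compares cumulative input tokens for a context-based setup against a fine-tuned model that needs only the query. All figures are hypothetical and chosen for round numbers, not taken from AWS.

```python
def cumulative_input_tokens(requests: int, context_tokens: int, query_tokens: int) -> int:
    """Total input tokens billed across all requests."""
    return requests * (context_tokens + query_tokens)


# Hypothetical workload: one million requests, 100-token queries.
REQUESTS = 1_000_000
QUERY_TOKENS = 100

# Context-based approach: 2,000 tokens of instructions or retrieved passages per call.
with_context = cumulative_input_tokens(REQUESTS, context_tokens=2_000, query_tokens=QUERY_TOKENS)

# Fine-tuned model: the knowledge lives in the weights, so no extra context is sent.
fine_tuned = cumulative_input_tokens(REQUESTS, context_tokens=0, query_tokens=QUERY_TOKENS)

tokens_saved = with_context - fine_tuned
print(f"with context: {with_context:,}")   # 2,100,000,000
print(f"fine-tuned:   {fine_tuned:,}")     # 100,000,000
print(f"saved:        {tokens_saved:,}")   # 2,000,000,000
```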
When to Fine-tune
Amazon recommends fine-tuning for high-volume, well-defined tasks with quality labeled examples—such as intent classification, brand voice consistency, or replacing traditional ML classifiers. The upfront investment in data labeling and training pays off through reduced per-request inference costs for applications with sustained traffic.
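Whether the upfront investment pays off can be estimated with a break-even calculation. This is a minimal sketch; the function name and all dollar amounts are made-up placeholders, and real figures would come from labeling costs, training charges, and the current Bedrock price list.

```python
import math
from decimal import Decimal
from typing import Optional


def break_even_requests(
    upfront_cost: Decimal,
    cost_per_request_before: Decimal,
    cost_per_request_after: Decimal,
) -> Optional[int]:
    """Number of requests after which fine-tuning has paid for itself.

    Returns None when the fine-tuned model is not cheaper per request,
    in which case the upfront cost is never recovered.
    """
    saving = cost_per_request_before - cost_per_request_after
    if saving <= 0:
        return None
    return math.ceil(upfront_cost / saving)


# Hypothetical: $500 for labeling and training, $0.0021 per request with a
# long prompt versus $0.0001 per request with a fine-tuned model.
requests_needed = break_even_requests(
    Decimal("500"), Decimal("0.0021"), Decimal("0.0001")
)
print(requests_needed)  # 250000
```

Decimal is used instead of float so that small per-request prices divide exactly and the rounded-up result is not thrown off by binary rounding error.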
Fine-tuned small LLMs such as Nova Micro are increasingly replacing traditional classifiers on tasks where inputs vary widely in natural-language phrasing, because they handle that variation without being retrained.
Training Visibility
Bedrock provides sensible hyperparameter defaults (epochCount, learningRateMultiplier) and real-time training monitoring through loss curves. Clear documentation covers data preparation, format specifications, and schema requirements.
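Reading a loss curve usually comes down to smoothing noisy per-step values and checking whether the loss is still falling. The sketch below shows one way to do that on a plain list of loss values. The function names, window size, and improvement threshold are assumptions for illustration; it does not reflect how Bedrock exposes or formats its training metrics.

```python
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early points average over what is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def has_plateaued(losses: Sequence[float], window: int = 3, min_improvement: float = 0.01) -> bool:
    """True when the mean loss of the last `window` steps improved by less
    than `min_improvement` over the mean of the `window` steps before it."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    return (previous - recent) < min_improvement
```

A plateau on the training loss is only a hint that more epochs may not help; a validation loss that starts rising is the stronger signal of overfitting.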
What this means
Bedrock's fine-tuning removes infrastructure barriers to model customization, making it accessible to teams without MLOps expertise. On-demand pricing, with no provisioned capacity costs, changes the economics of domain-specific deployments. This positions Nova models as viable replacements for traditional classifiers in production pipelines, particularly where cost and latency matter more than raw capability. The focus on parameter-efficient approaches preserves inference speed, which is critical for high-volume applications.