SageMaker

10 articles tagged with SageMaker

July 10, 2026

product update · NVIDIA

AWS Adds NVIDIA Nemotron 3 Nano (30B) and Super (120B) to SageMaker Serverless Fine-Tuning

Amazon SageMaker AI now supports serverless fine-tuning for NVIDIA Nemotron 3 Nano (30B parameters, 3B active) and Nemotron 3 Super (120B parameters, 12B active). The integration includes supervised fine-tuning, reinforcement learning with verifiable rewards (RLVR), and reinforcement learning from AI feedback (RLAIF).

July 10, 2026 · 3:50 PM

July 9, 2026

product update · Amazon Web Services

AWS SageMaker HyperPod adds three-tier data capture, direct Hugging Face deployment, and NVMe caching for enterprise inf

Amazon SageMaker HyperPod has launched infrastructure updates for enterprise inference workloads. The platform now captures inference data at three points (endpoint, load balancer, and model pod) with configurable sampling and S3 storage. Teams can deploy models directly from Hugging Face Hub without pre-staging weights, with support for gated access across vLLM, TGI, and SGLang runtimes.

July 9, 2026 · 4:50 PM

July 7, 2026

product update · Amazon Web Services

Hugging Face and AWS launch one-click deployment to SageMaker Studio

Hugging Face and Amazon Web Services have introduced a one-click workflow that takes developers from model discovery on Hugging Face directly into Amazon SageMaker Studio. The integration removes manual setup steps by automatically provisioning domains with pre-configured IAM permissions and displaying GPU quota availability inline.

July 7, 2026 · 9:35 PM

June 4, 2026

model release · NVIDIA

NVIDIA Nemotron 3 Ultra launches on AWS SageMaker with 550B parameters, 1M token context window

NVIDIA Nemotron 3 Ultra is now available on Amazon SageMaker JumpStart with 550 billion total parameters and 55 billion active parameters. The model features a hybrid Transformer-Mamba Mixture-of-Experts architecture and supports context windows up to 1 million tokens, targeting agentic AI workloads.

June 4, 2026 · 5:06 PM

May 20, 2026

product update · Amazon Web Services

AWS SageMaker AI adds bidirectional streaming for real-time speech transcription with vLLM

Amazon SageMaker AI has launched bidirectional streaming support for real-time inference, enabling WebSocket-based voice applications through vLLM integration. The feature uses HTTP/2 on port 8443 to bridge client connections with vLLM's Realtime API, allowing audio to stream in while transcription streams back simultaneously over a single persistent connection.

May 20, 2026 · 5:20 PM

May 4, 2026

product update · Amazon Web Services

AWS launches agent-guided workflows in SageMaker AI to automate model fine-tuning

Amazon Web Services has released agent-guided workflows in SageMaker AI that use AI coding agents to automate model customization. The feature includes nine pre-built skills covering use case definition, data preparation, fine-tuning technique selection (SFT, DPO, RLVR), evaluation, and deployment to Amazon Bedrock or SageMaker endpoints.

May 4, 2026 · 5:20 PM

product update

AWS SageMaker adds automatic instance fallback to prevent GPU capacity failures

Amazon SageMaker AI now supports capacity-aware instance pools that automatically try alternative GPU instance types when primary choices lack capacity. The feature works across endpoint creation, autoscaling, and scale-in operations, eliminating the manual retry cycles that previously left endpoints stuck in failed states.

May 4, 2026 · 4:20 PM

April 28, 2026

model release · NVIDIA

NVIDIA Nemotron 3 Nano Omni: 30B-parameter multimodal model launches on AWS SageMaker with 131K token context

NVIDIA has launched Nemotron 3 Nano Omni, a multimodal model with 30 billion total parameters (3 billion active), on Amazon SageMaker JumpStart. It processes video, audio, images, and text in a single inference pass. The model features a 131K token context window and uses a Mamba2 Transformer Hybrid MoE architecture combining three specialized encoders.

April 28, 2026 · 4:51 PM

April 17, 2026

product update · Amazon Web Services

AWS releases Nova Forge SDK data mixing guide to preserve general capabilities during fine-tuning

Amazon Web Services published a practical guide for fine-tuning Amazon Nova models using the Nova Forge SDK's data mixing capabilities. According to AWS, blending customer data with Amazon-curated datasets preserved near-baseline MMLU scores while delivering a 12-point F1 improvement on a Voice of Customer classification task spanning 1,420 leaf categories.

April 17, 2026 · 5:35 PM

April 16, 2026

product update · Amazon Web Services

Amazon Nova Micro Fine-Tuned Text-to-SQL Models Now Available on Bedrock On-Demand Inference at $0.80/Month for 22,000 Q

AWS has enabled fine-tuned Amazon Nova Micro models to run on Bedrock's on-demand inference for text-to-SQL generation. According to AWS testing, a sample workload of 22,000 queries per month costs $0.80 monthly using the serverless approach, compared to higher costs with persistent model hosting. The solution uses LoRA fine-tuning on the sql-create-context dataset containing over 78,000 SQL examples.

April 16, 2026 · 5:51 PM
