AWS launches Nova Sonic voice agent framework with AgentCore Runtime and three integration patterns
AWS released Amazon Nova Sonic, a speech-to-speech foundation model for voice agents, alongside AgentCore Runtime, a serverless hosting environment with WebSocket streaming and microVM isolation. The framework supports three integration patterns: direct tool calls via AgentCore Gateway using Model Context Protocol (MCP), sub-agent delegation with Agent-to-Agent (A2A) protocol, and session segmentation for multi-step workflows.
AWS released Amazon Nova Sonic, a speech-to-speech foundation model for building voice agents, alongside Amazon Bedrock AgentCore Runtime, a new serverless hosting environment designed specifically for AI agent deployment.
Core components
Amazon Nova Sonic enables real-time voice interactions with natural conversational flow and tone understanding. The model handles speech-to-speech conversations without intermediate text transcription steps.
Amazon Bedrock AgentCore Runtime provides:
- Bidirectional WebSocket streaming with SigV4 authentication
- MicroVM-level session isolation to prevent latency spikes from concurrent sessions
- AgentCore Gateway for shared tool hosting using Model Context Protocol (MCP)
- Persistent memory across sessions
- Voice-specific telemetry including time-to-first-audio metrics
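The release does not define how time-to-first-audio is computed. A common reading is the delay between the end of the user's speech and the first audio chunk the agent emits. The sketch below measures that for a single turn under this assumption; the `StreamEvent` schema and the event names `user_speech_end` and `agent_audio_chunk` are hypothetical and are not taken from AgentCore.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class StreamEvent:
    """One timestamped event in a voice session (hypothetical schema)."""
    kind: str         # e.g. "user_speech_end" or "agent_audio_chunk"
    timestamp: float  # seconds since session start


def time_to_first_audio(events: Iterable[StreamEvent]) -> Optional[float]:
    """Seconds from the end of user speech to the first agent audio chunk.

    Returns None if the turn is missing either event.
    """
    speech_end = None
    for event in sorted(events, key=lambda e: e.timestamp):
        if event.kind == "user_speech_end" and speech_end is None:
            speech_end = event.timestamp
        elif event.kind == "agent_audio_chunk" and speech_end is not None:
            return event.timestamp - speech_end
    return None
```

Tracking this value per turn, rather than per session, makes it easier to spot individual slow responses.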
The system integrates with Strands Agents, an open-source framework. Its BidiAgent class manages bidirectional stream lifecycle, routes tool calls, and handles session management.
Three integration patterns
AWS documented three architectural approaches for voice agent design:
Pattern 1: AgentCore Gateway tool calls
Nova Sonic calls tools directly via AgentCore Gateway, which hosts MCP servers as managed endpoints. The voice model selects which tool to invoke, passes parameters, receives results, and responds. Example: A user asks "What's my account balance?" and Nova Sonic directly calls get_account_balance from available MCP tools.
Trade-off: All decision logic runs in the voice model's system prompt. Simple for basic tools but becomes brittle for multi-step workflows.
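The select, invoke, respond loop in Pattern 1 can be illustrated with a plain-Python dispatcher. This is a sketch of the general idea only, not the AgentCore Gateway or MCP wire format: the `TOOLS` registry, the `register_tool` decorator, the request shape, and the sample balance are all assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-memory registry standing in for tools hosted behind a gateway.
TOOLS: Dict[str, Callable[..., Any]] = {}


def register_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Make a function callable by name."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def get_account_balance(account_id: str) -> dict:
    """Placeholder lookup; a real tool would query a banking backend."""
    balances = {"acct-001": 1250.75}  # illustrative data only
    return {"account_id": account_id, "balance": balances.get(account_id)}


def handle_tool_request(request_json: str) -> str:
    """Run the tool the model selected and return a JSON result for it to voice."""
    request = json.loads(request_json)
    name = request["name"]
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    result = TOOLS[name](**request.get("arguments", {}))
    return json.dumps({"result": result})
```

Note that the dispatcher holds no workflow logic. Deciding which tool to call, and in what order, stays with the model's prompt, which is the source of the brittleness described above.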
Pattern 2: Sub-agent delegation
Business logic runs in autonomous agents, each with its own model, system prompt, and tools. The voice orchestrator delegates complete tasks rather than individual tool calls. Two implementation approaches:
- Local agent-as-tool: Sub-agents run in-process as @tool functions with no network hop
- Remote agent via A2A protocol: Sub-agents deployed independently on AgentCore Runtime, invoked over the network using the Agent-to-Agent (A2A) protocol
A2A enables cross-framework interoperability between agents built with Strands, OpenAI, LangGraph, and Google ADK.
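The difference between delegating a whole task and calling a single tool can be sketched without any framework. The `SubAgent` and `VoiceOrchestrator` classes below are hypothetical stand-ins, not Strands or A2A APIs, and the sub-agent logic is a stub where a real agent would call its own model and tools.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SubAgent:
    """Hypothetical stand-in for an autonomous agent with its own model and prompt."""
    name: str
    model_id: str
    system_prompt: str
    run: Callable[[str], str]  # takes a whole task, returns a final answer


class VoiceOrchestrator:
    """Delegates complete tasks to sub-agents instead of calling individual tools."""

    def __init__(self) -> None:
        self._agents: Dict[str, SubAgent] = {}

    def register(self, agent: SubAgent) -> None:
        self._agents[agent.name] = agent

    def delegate(self, agent_name: str, task: str) -> str:
        agent = self._agents.get(agent_name)
        if agent is None:
            raise KeyError(f"no sub-agent named {agent_name!r}")
        # The orchestrator only sees the final answer, not intermediate steps.
        return agent.run(task)


def banking_logic(task: str) -> str:
    """Stub; a real sub-agent would reason over the task with its own model."""
    return f"[banking] handled: {task}"


orchestrator = VoiceOrchestrator()
orchestrator.register(
    SubAgent("banking", "amazon.nova-lite-v1:0",
             "You handle banking requests.", banking_logic)
)
```

In this sketch the `run` callable could wrap either an in-process function or a client for a remote agent, so the orchestrator's interface would stay the same across the two approaches.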
Pattern 3: Session segmentation
This pattern isolates prompts, memory, and permissions across workflow stages. The release does not describe it in full detail.
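Because the release gives few specifics, the following is only one possible interpretation of session segmentation: each workflow stage carries its own system prompt, tool permissions, and memory, and nothing leaks across stages by default. The `Segment` and `SegmentedSession` names, and the example stages, are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass
class Segment:
    """One workflow stage with its own prompt, permissions, and memory."""
    name: str
    system_prompt: str
    allowed_tools: FrozenSet[str]
    memory: List[str] = field(default_factory=list)


class SegmentedSession:
    """Tracks the active stage and enforces that stage's permissions."""

    def __init__(self, segments: List[Segment]) -> None:
        self._segments: Dict[str, Segment] = {s.name: s for s in segments}
        self._current = segments[0].name

    @property
    def current(self) -> Segment:
        return self._segments[self._current]

    def advance(self, next_name: str) -> None:
        """Switch stages; the new stage sees only its own memory."""
        if next_name not in self._segments:
            raise KeyError(next_name)
        self._current = next_name

    def remember(self, note: str) -> None:
        self.current.memory.append(note)

    def can_call(self, tool_name: str) -> bool:
        return tool_name in self.current.allowed_tools
```

A caller might start in an authentication stage that can only verify identity, then advance to a banking stage whose tools become available once verification succeeds.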
Implementation details
According to AWS, teams can expose existing business logic through AgentCore Gateway by configuring MCP Gateway ARNs:
model = BidiNovaSonicModel(
    model_id="amazon.nova-2-sonic-v1:0",
    mcp_gateway_arn=[
        "arn:aws:bedrock-agentcore:us-east-1:123456789012:gateway/auth-tools",
        "arn:aws:bedrock-agentcore:us-east-1:123456789012:gateway/banking-tools",
    ],
)
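The gateway ARNs in this configuration follow the standard six-field ARN layout (arn, partition, service, region, account ID, resource). A small check before passing them to the model can catch typos early. The helper names below are hypothetical, and the expected service and resource prefix simply mirror the example ARNs in this article rather than a documented validation rule.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN into its fields; the resource part may itself contain colons."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not a valid ARN: {arn!r}")
    return Arn(*parts[1:])


def is_gateway_arn(arn: str) -> bool:
    """Check service and resource type against the article's example ARNs."""
    try:
        parsed = parse_arn(arn)
    except ValueError:
        return False
    return (parsed.service == "bedrock-agentcore"
            and parsed.resource.startswith("gateway/"))
```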
For sub-agent patterns, authentication and banking agents can be wrapped as tools using Strands' agent-as-tool pattern, with each sub-agent running on its own model, such as amazon.nova-lite-v1:0.
What this means
This release positions AWS directly against voice agent frameworks from OpenAI (Realtime API) and Anthropic (Claude with tool use). The microVM isolation addresses a real production issue, latency spikes from concurrent sessions, that serverless function approaches struggle with. The MCP and A2A protocol support indicates AWS is betting on open standards rather than proprietary integration layers, which could accelerate enterprise adoption by reducing vendor lock-in concerns. The emphasis on "composable" agents through sub-agent patterns reflects an industry movement away from monolithic LLM applications toward specialized, coordinated systems.