product update

AWS launches agentic AI movie assistant using Nova Sonic 2.0 and Bedrock AgentCore

TL;DR

Amazon Web Services unveiled an agentic AI system for streaming platforms combining Nova Sonic 2.0 (real-time speech model), Bedrock AgentCore, and the Model Context Protocol. The system delivers two core capabilities: context-aware movie recommendations based on mood and viewing history, and real-time scene analysis including actor identification and plot summaries.


AWS Builds Agentic AI Movie Assistant With Nova Sonic 2.0

Amazon Web Services detailed a production-ready agentic AI system for streaming services that combines real-time voice interaction with contextual movie recommendations and scene analysis, addressing core limitations of traditional recommendation algorithms.

The Problem With Traditional ML Recommendations

Conventional collaborative and content-based filtering systems lack contextual understanding. After a user watches The Shawshank Redemption, a traditional system suggests more prison dramas, missing that the viewer might want something lighter to unwind. Contextual signals such as time of day, mood, and social setting remain invisible to pattern-recognition-only approaches.

Architecture and Core Components

AWS deployed the system using:

Nova Sonic 2.0 — Amazon's latest speech-to-speech model handling real-time, bidirectional voice conversations with low latency. The model natively supports text and streaming speech inputs, with controllable personality via system prompts for on-brand responses.

Bedrock AgentCore — Orchestrates tool invocation, context management, and response curation. The system uses the Model Context Protocol (MCP) to expose AWS Lambda functions as MCP-compatible tools.
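As a rough sketch, a Lambda function exposed through the AgentCore Gateway as an MCP-compatible tool amounts to a dispatcher over named tools. The event shape and tool names below are illustrative assumptions, not the published contract:

```python
import json

# Hypothetical tool registry; real handlers would call DynamoDB, OpenSearch, etc.
TOOLS = {
    "get_recommendations": lambda args: {"movies": [], "mood": args.get("mood")},
    "analyze_scene": lambda args: {"summary": "", "timecode": args.get("timecode")},
}

def lambda_handler(event, context):
    """Dispatch an MCP-style tool invocation to the matching handler."""
    tool_name = event.get("toolName")
    handler = TOOLS.get(tool_name)
    if handler is None:
        return {"statusCode": 404,
                "body": json.dumps({"error": f"unknown tool {tool_name}"})}
    result = handler(event.get("arguments", {}))
    return {"statusCode": 200, "body": json.dumps(result)}
```

Keeping each tool a plain function behind one dispatcher is what makes the Lambda easy to register as a set of MCP tools rather than one opaque endpoint.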

Infrastructure Stack:

  • AWS Fargate containers managing session orchestration via WebSocket connections
  • JWT token validation for security
  • Bidirectional Smithy streaming RPC protocol for model communication
  • OpenSearch + S3 Vector for semantic search and storage
  • Amazon Bedrock Data Automation for video processing
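The JWT check on incoming WebSocket connections can be illustrated with a stdlib-only HS256 verifier. This is a sketch, not AWS's implementation; a production system would use a vetted library and also validate claims like `exp` and `aud`:

```python
import base64
import hashlib
import hmac
import json

def b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore padding before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def verify_hs256(token: str, secret: bytes) -> dict:
    """Verify an HS256-signed JWT and return its claims; raise on a bad signature."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
        raise ValueError("invalid signature")
    return json.loads(b64url_decode(payload_b64))
```

`hmac.compare_digest` is the important detail: it compares in constant time, avoiding a timing side channel on the signature check.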

Two Core Use Cases

1. Mood-Aware Recommendations: Users describe their mental state ("something fun after a long day") rather than browsing history alone. Lambda functions retrieve user affinity profiles from DynamoDB, perform hybrid semantic search across 500+ sample movies in OpenSearch, and return contextually matched recommendations.
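The "hybrid semantic search" step can be approximated as blending a lexical relevance score with embedding similarity. OpenSearch performs this blending internally, so the re-ranker below is only an illustration; the `alpha` weight and candidate shape are assumptions:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def hybrid_score(keyword_score, query_vec, doc_vec, alpha=0.5):
    """Blend a normalized keyword (BM25-style) score with embedding similarity."""
    return alpha * keyword_score + (1 - alpha) * cosine(query_vec, doc_vec)

def rank(candidates, query_vec, alpha=0.5):
    # candidates: list of (title, keyword_score, doc_embedding) tuples.
    return sorted(candidates,
                  key=lambda c: hybrid_score(c[1], query_vec, c[2], alpha),
                  reverse=True)
```

The point of the blend is exactly the mood use case: "something fun after a long day" scores well semantically against comedies even when it shares no keywords with their metadata.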

2. Real-Time Scene Analysis: While watching, users ask questions like "who is that actor?" or "summarize what just happened?" The system:

  • Uses Amazon Bedrock Data Automation to extract chapter summaries, transcriptions, timecodes, and audio segments from video
  • Applies Amazon Rekognition's celebrity recognition to identify actors
  • Matches user queries to movie scripts using semantic embeddings
  • Returns instant contextual answers via voice
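Answering "summarize what just happened" means mapping the playback position onto the chapter timecodes extracted by Bedrock Data Automation, which reduces to an interval lookup. The field names below are assumed for illustration, not BDA's literal output schema:

```python
import bisect

def find_chapter(chapters, position_s):
    """Return the chapter whose time span contains the playback position.

    chapters: list of dicts sorted by "start_s", each with
    "start_s", "end_s", and "summary" keys. Returns None if no span matches.
    """
    starts = [c["start_s"] for c in chapters]
    i = bisect.bisect_right(starts, position_s) - 1
    if i >= 0 and position_s < chapters[i]["end_s"]:
        return chapters[i]
    return None
```

Binary search keeps the lookup cheap even for long films with many chapters, which matters when every voice query triggers it.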

Technical Workflow

User voice commands flow through WebSocket → Nova Sonic 2.0 → Bedrock AgentCore Gateway → Lambda functions → OpenSearch/S3 Vector → Results back to Nova Sonic for voice response → Streamed to client. Complex background tasks execute asynchronously while maintaining conversational fluidity.
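The "asynchronous background tasks while maintaining conversational fluidity" pattern can be sketched with asyncio: acknowledge the user immediately, run the slow tool call as a task, and speak the result when it resolves. The function names here are hypothetical stand-ins for the Lambda/OpenSearch round trip:

```python
import asyncio

async def fetch_scene_details(query: str) -> str:
    # Stand-in for a slow tool invocation (Lambda + OpenSearch round trip).
    await asyncio.sleep(0.1)
    return f"details for: {query}"

async def handle_turn(query: str) -> tuple[str, str]:
    """Acknowledge at once, run the heavy lookup in the background,
    then deliver the result, keeping the conversation fluid."""
    lookup = asyncio.create_task(fetch_scene_details(query))
    ack = "One moment while I check that scene."  # streamed to the user right away
    details = await lookup                        # resolves without blocking the ack
    return ack, details
```

In the real pipeline the acknowledgment would be voiced by Nova Sonic while the task runs, so the user never hears dead air during tool execution.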

What This Means

AWS is positioning agentic AI as the next generation of streaming recommendation systems, moving beyond static filtering toward dynamic, conversational discovery. The combination of real-time speech, tool orchestration via AgentCore, and semantic search gives the approach a clear edge over traditional filtering. The architecture also demonstrates how production-grade agentic systems can integrate multiple AWS services into a single workflow.

For streaming platforms, this represents a pathway to differentiation through conversational interfaces. For AWS, it showcases Bedrock AgentCore and Nova Sonic 2.0 maturity for enterprise use cases. The public GitHub repository suggests AWS intends this as a reference architecture for customers building similar systems.

