AWS demonstrates object detection using Amazon Nova 2 Lite multimodal model with no training required

TL;DR

AWS published a technical guide showing how Amazon Nova 2 Lite performs object detection through natural language prompts without requiring model training. The multimodal model returns bounding box coordinates in JSON format at $0.0003 per thousand input tokens and $0.0025 per thousand output tokens, with typical images costing approximately $0.00057 to process.

June 2, 2026 · 5:50 PM2 min read

AWS demonstrates object detection using Amazon Nova 2 Lite multimodal model with no training required

AWS published a technical guide showing how Amazon Nova 2 Lite performs object detection through natural language prompts without requiring model training, data pipelines, or dedicated infrastructure.

Pricing and capabilities

Amazon Nova 2 Lite costs $0.0003 per thousand input tokens and $0.0025 per thousand output tokens when accessed through Amazon Bedrock. According to AWS, a typical image consumes approximately 230 input tokens ($0.000069) and generates around 200 output tokens ($0.0005), totaling roughly $0.00057 per image. Processing 10,000 images would cost approximately $5.69.

The model accepts natural language prompts specifying objects to detect (such as "vehicle", "person", or "dent") and returns bounding box coordinates in structured JSON format. Coordinates use a normalized 0-1000 scale that developers convert to pixel positions.

Technical implementation

The implementation uses Amazon Bedrock's Converse API with a prompt engineering template that specifies detection requirements. The prompt includes two dynamic variables: elements (object types to detect) and schema (expected JSON structure). The system requires no fine-tuning or training data.

AWS tested the model on a street scene, asking it to detect "vehicle" and "stop sign" objects. According to AWS, Nova 2 Lite detected small, distant, and partially occluded objects with tight bounding boxes using only basic object names.

Architecture and deployment

AWS released a reference serverless application architecture combining AWS Lambda, Amazon API Gateway, Amazon CloudFront, and Amazon S3. The Lambda function orchestrates requests to Amazon Bedrock, converts normalized coordinates to pixel positions, and renders bounding boxes on images.

The architecture supports deployment on AWS Lambda for event-driven workloads, Amazon EC2 for custom configurations, or Amazon ECS/EKS for containerized deployments. All compute options use the same Bedrock Converse API.

AWS estimates deployment takes 30-45 minutes and provides complete source code including AWS CDK infrastructure definitions in a GitHub repository.

What this means

Amazon Nova 2 Lite offers a low-cost alternative to traditional computer vision pipelines that require data collection, model training infrastructure, and ML expertise. At under $0.001 per image, the model makes object detection economically viable for small teams and prototyping scenarios. The prompt-based approach eliminates training costs but likely sacrifices accuracy compared to domain-specific models trained on custom datasets. The reference architecture demonstrates production-ready deployment patterns, though AWS has not published benchmark comparisons against established computer vision models or disclosed Nova 2 Lite's base parameter count.

Source: aws.amazon.com ↗

Amazon Nova Amazon Bedrock object detection computer vision AWS Lambda multimodal models serverless

product updateJuly 14, 2026

Amazon Nova Act Brings Vision-Based Web Navigation to UX Testing, No Hard-Coded Scripts Required

AWS has released a cloud-deployed UX testing platform built on Amazon Nova Act, a multimodal foundation model that navigates web interfaces through visual understanding rather than hard-coded selectors. The solution processes documentation with Claude 4.5 Sonnet to generate test scenarios, executes parallel testing via ECS, and analyzes results automatically, addressing the scalability limitations of manual testing and maintenance overhead of traditional automation tools.

product updateJuly 16, 2026

AWS launches Managed Knowledge Base for Bedrock with 6 enterprise connectors and automatic ACL enforcement

Amazon Web Services launched Managed Knowledge Base for Bedrock in general availability, offering a fully managed retrieval solution with six native enterprise connectors including SharePoint, Confluence, and Google Drive. The service handles document parsing up to 500 MB for PDFs, 2 GB for audio, and 10 GB for video, with real-time access control list verification at query time.

product updateJuly 16, 2026

xAI's Grok 4.3 now available on AWS Bedrock with 1M token context and configurable reasoning

xAI has made Grok 4.3 generally available on Amazon Bedrock, marking xAI's debut as a Bedrock model provider. The multimodal model offers a 1 million token context window, configurable reasoning effort (none/low/medium/high), and runs on Bedrock's Mantle inference engine using OpenAI-compatible APIs.

product updateJuly 16, 2026

AWS launches AgentCore platform for building voice AI agents with Amazon Nova 2 Sonic

AWS has released AgentCore, a new platform for hosting and running voice-based AI agents, integrated with Amazon Nova 2 Sonic for real-time speech capabilities. The platform uses the open Model Context Protocol (MCP) to connect agents to backend systems and deploys each conversation in isolated microVMs.

AWS demonstrates object detection using Amazon Nova 2 Lite multimodal model with no training required

AWS demonstrates object detection using Amazon Nova 2 Lite multimodal model with no training required

Pricing and capabilities

Technical implementation

Architecture and deployment

What this means

Related Articles

Amazon Nova Act Brings Vision-Based Web Navigation to UX Testing, No Hard-Coded Scripts Required

AWS launches Managed Knowledge Base for Bedrock with 6 enterprise connectors and automatic ACL enforcement

xAI's Grok 4.3 now available on AWS Bedrock with 1M token context and configurable reasoning

AWS launches AgentCore platform for building voice AI agents with Amazon Nova 2 Sonic

Comments