Amazon Bedrock adds programmatic tool calling to reduce latency and token usage in multi-step workflows
Amazon Bedrock now supports programmatic tool calling (PTC), a technique that allows LLMs to generate Python code for multi-step tool orchestration rather than making sequential API calls. AWS offers three implementation paths: self-hosted Docker sandboxes on ECS, managed execution via Amazon Bedrock AgentCore Code Interpreter, and Anthropic SDK-compatible proxy integration.
Amazon Bedrock adds programmatic tool calling to reduce latency and token usage in multi-step workflows
Amazon Web Services has introduced programmatic tool calling (PTC) for Amazon Bedrock, enabling language models to generate Python code that orchestrates multiple tool invocations in a single inference cycle rather than requiring sequential round trips.
How it works
Traditional tool calling requires models to invoke tools one at a time, with each call requiring a full inference round trip. For a query like "Which engineering team members exceeded their Q3 travel budget?", a model using standard tool calling would need to make 20+ separate API calls to retrieve team members and expense records, passing thousands of intermediate data points through its context window.
With PTC, the model generates Python code once that handles all tool calls, data processing, and filtering within a sandboxed execution environment. Using asyncio.gather(), the code can execute tool calls in parallel. Only the final processed result returns to the model's context.
According to AWS, this approach reduces both latency and token consumption for workflows involving multiple tool calls, data aggregation, or numerical calculations.
Three implementation options
AWS provides three ways to implement PTC on Bedrock:
Self-hosted Docker sandbox on Amazon ECS: Offers maximum control with model-agnostic support for Claude, Qwen, MiniMax, Llama, Nova, and other Bedrock models. Developers can customize the sandbox environment, install domain-specific Python packages, and keep code execution within their AWS account. The architecture uses an orchestrator (ECS task or Lambda) that calls the InvokeModel API via Boto3 and manages Docker sandbox lifecycle.
Managed solution via Amazon Bedrock AgentCore Code Interpreter: A fully managed sandbox environment that handles execution without requiring custom infrastructure.
Anthropic SDK-compatible proxy: Designed for teams already using Anthropic's SDK who want PTC functionality while maintaining their existing developer workflow.
System prompt engineering
The self-hosted implementation relies on injecting tool definitions into the system prompt rather than using the standard tool_config parameter. The prompt instructs models to write Python code with specific rules: each execute_code call runs in a fresh stateless environment, tool calls must use await, and all operations must complete in a single code block.
The orchestrator intercepts tool calls through IPC over stdin/stderr, executes them externally, and injects results back into the sandbox.
What this means
PTC addresses a legitimate bottleneck in agentic workflows where sequential tool calling creates compounding latency. The model-agnostic self-hosted option is particularly significant—it extends a pattern originally introduced by specific providers to any model available on Bedrock. This matters for enterprises already committed to AWS infrastructure who want to avoid vendor lock-in at the model level.
The technique works best for workflows involving data aggregation, filtering operations, or scenarios where intermediate data shouldn't enter the model's context for privacy reasons. However, it requires models capable of generating correct async Python code and adds complexity around sandbox security and resource management.
Related Articles
GitHub Copilot in VS Code Gains Browser Automation Tools for Web App Testing
GitHub has made browser tools for Copilot in VS Code generally available. The feature allows Copilot agents to control real browsers, navigate live web applications, and integrate findings back into the development environment.
AWS to Release Anthropic's Claude Fable 5 on Bedrock with Cybersecurity Guardrails
Amazon Web Services announced it will make Anthropic's Claude Fable 5 models available on Bedrock starting tomorrow, featuring guardrails designed to prevent cybersecurity misuse. When guardrails are triggered, the system automatically falls back to Claude Opus 4.8.
AWS launches managed entitlements for Bedrock to distribute third-party model access across multi-account organizations
AWS has introduced managed entitlements for Amazon Bedrock, allowing organizations to subscribe to third-party models like Anthropic Claude and Cohere from a central account and distribute access across member accounts without requiring AWS Marketplace permissions. The feature uses AWS License Manager to create grants that share model entitlements with specific accounts or entire organizational units.
Google AI Plus at $4.99/month and AI Pro at $19.99/month expand Gemini context windows to 128K and 1M tokens
Google has detailed pricing and features for its Gemini app subscription tiers. AI Plus costs $4.99/month and includes 128,000 token context windows, while AI Pro at $19.99/month provides 1 million token context windows. Free users are limited to 32,000 tokens.
Comments
Loading...