Anthropic Python SDK v0.104.0 adds thinking token count estimates for streaming responses
Anthropic released version 0.104.0 of its Python SDK on May 21, 2026. The update adds support for a thinking-token-count beta feature that provides estimated token counts in thinking block deltas when streaming responses from reasoning models.
What's new
The update introduces a thinking-token-count beta feature that provides estimated token counts within thinking block deltas during streaming responses. This lets developers monitor token consumption in real time as Claude's reasoning models work through extended chains of thought.
The feature targets streaming scenarios in which Claude models with extended thinking enabled generate internal reasoning before producing their final output. Extended thinking is available on Claude 3.7 Sonnet and later models, not on Claude 3.5 Sonnet.
Technical details
The implementation provides token count estimates as part of the delta stream, enabling developers to:
- Track thinking token usage during active streaming
- Estimate costs for reasoning operations in real time
- Monitor and debug extended thinking processes
- Optimize prompts based on thinking token consumption
The feature is marked as beta, indicating the API may change in future releases.
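The uses listed above can be sketched without committing to the SDK's actual event shape, which the release summary does not spell out. The minimal Python sketch below assumes each thinking delta arrives as a dictionary with a hypothetical `estimated_tokens` field carrying a per-delta increment. The field name, the event layout, and the price per million tokens are illustrative assumptions, not the SDK's API.

```python
from typing import Iterable, Iterator

# Hypothetical event shape (an assumption, not the SDK's real schema):
#   {"type": "thinking_delta", "estimated_tokens": 12}
# Each thinking delta is assumed to carry an incremental token estimate.


def running_thinking_totals(events: Iterable[dict]) -> Iterator[int]:
    """Yield the running estimated thinking-token total after each thinking delta."""
    total = 0
    for event in events:
        if event.get("type") != "thinking_delta":
            continue  # ignore text deltas and any other event types
        total += event.get("estimated_tokens", 0)
        yield total


def estimated_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Convert a token estimate to dollars at a caller-supplied (assumed) price."""
    return tokens * usd_per_million_tokens / 1_000_000


# Simulated stream standing in for a real streaming response.
sample_events = [
    {"type": "thinking_delta", "estimated_tokens": 120},
    {"type": "text_delta", "text": "Hello"},
    {"type": "thinking_delta", "estimated_tokens": 80},
]

totals = list(running_thinking_totals(sample_events))
```

In a real application the simulated list would be replaced by the events from a streaming call, and the accessor for the estimate would need to match whatever the beta actually exposes. Because the counts are estimates, any cost derived from them is approximate until the final usage figures arrive.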
Version information
- Version: 0.104.0
- Release date: May 21, 2026
- Type: Minor version update
- Full changelog: Available at github.com/anthropics/anthropic-sdk-python/compare/v0.103.1...v0.104.0
What this means
This update addresses an observability gap for developers using Claude's reasoning capabilities. Previously, tracking token usage in thinking blocks required waiting for the complete response. With streaming token counts, developers can monitor cost and performance in real time. That matters most for applications using extended thinking, where reasoning tokens can significantly outnumber output tokens. The beta designation suggests Anthropic is still refining how thinking token metrics are calculated and surfaced to developers.
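One way to act on streaming estimates, rather than waiting for the complete response, is a budget check that flags the point where thinking usage crosses a threshold. The sketch below is illustrative only: the dictionary event shape and the `estimated_tokens` field are hypothetical stand-ins for whatever the beta exposes, and each delta is assumed to carry an incremental estimate.

```python
from typing import Iterable, Optional


def first_index_over_budget(events: Iterable[dict], budget: int) -> Optional[int]:
    """Return the index of the event at which the running thinking-token
    estimate first exceeds `budget`, or None if it never does.

    Assumes a hypothetical event shape:
        {"type": "thinking_delta", "estimated_tokens": <int>}
    """
    running = 0
    for index, event in enumerate(events):
        if event.get("type") != "thinking_delta":
            continue  # only thinking deltas count toward the budget
        running += event.get("estimated_tokens", 0)
        if running > budget:
            return index
    return None


# Simulated stream standing in for a real streaming response.
budget_events = [
    {"type": "thinking_delta", "estimated_tokens": 100},
    {"type": "text_delta", "text": "partial answer"},
    {"type": "thinking_delta", "estimated_tokens": 100},
    {"type": "thinking_delta", "estimated_tokens": 100},
]

over_at = first_index_over_budget(budget_events, budget=150)
never = first_index_over_budget(budget_events, budget=1_000)
```

A caller could log a warning or stop consuming the stream at the returned index. Since the counts are estimates, a hard budget should leave some margin.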