tool-use
10 articles tagged with tool-use
Mistral Launches Agents API with Code Execution, Web Search, and MCP Tool Integration
Mistral AI has released its Agents API, a framework for building AI agents with built-in connectors for code execution, web search, image generation, and Model Context Protocol (MCP) tools. The API includes persistent conversation memory and multi-agent orchestration, so agents can maintain context across interactions and coordinate complex workflows.
Cohere Releases Command A+ Open Source Model with 25B Active Parameters, 128K Context
Cohere has released Command A+ as an open-source model under the Apache 2.0 license. The sparse mixture-of-experts architecture has 25 billion active parameters out of 218 billion total, supports a 128K-token input context, and includes vision capabilities alongside tool use and reasoning features.
Cohere Releases Command A+: 218B-Parameter MoE Model With 4-Bit Quantization Runs on Single B200 GPU
Cohere has released Command A+, an open-source sparse mixture-of-experts model with 218 billion total parameters and 25 billion active parameters. The model uses W4A4 quantization (4-bit weights and 4-bit activations), which allows deployment on a single Nvidia B200 GPU. It supports a 128K-token input context and includes built-in chain-of-thought reasoning and vision capabilities.
IBM Research launches Open Agent Leaderboard, showing same models achieve different results based on agent architecture
IBM Research has launched the Open Agent Leaderboard, the first open benchmark that evaluates complete AI agent systems rather than just underlying models. The leaderboard reveals that agents using identical models can achieve significantly different success rates and costs depending on system architecture, with failed runs costing 20-54% more than successful ones.
InclusionAI Releases Ring-2.6-1T: 1 Trillion Parameter Thinking Model with 63B Active Parameters
InclusionAI has released Ring-2.6-1T, a model at the 1-trillion-parameter scale with 63 billion active parameters and a 262,144-token context window. The model features adaptive reasoning modes and is designed for coding agents, tool use, and long-horizon task execution.
GLM-5.1 achieves 58.4% on SWE-Bench Pro with sustained agentic reasoning over hundreds of iterations
Zhipu AI has released GLM-5.1, a 754-billion-parameter model designed for agentic engineering, with significantly improved coding capabilities over its predecessor. The model achieves 58.4% on SWE-Bench Pro and keeps improving over hundreds of tool calls and iterations, unlike earlier models that plateau quickly.
Google releases official iPhone app for running Gemma 4 models locally
Google has launched AI Edge Gallery, an official iPhone app for running Gemma 4 models (E2B and E4B sizes) directly on device. The E2B model is a 2.54 GB download and delivers fast inference, with support for image analysis, transcription of audio clips up to 30 seconds long, and tool calling.
AWS adds Claude tool use to Bedrock for custom entity extraction from documents
Amazon Web Services has integrated Claude's tool use (function calling) capability into Bedrock, enabling serverless document processing for custom entity recognition. The solution uses Claude 3.5 Sonnet to extract structured data like names, dates, and addresses from driver's licenses and other documents without traditional model training.
Amazon Nova 2 Lite surpasses Nova 1 Pro with 1M token context and extended thinking at 7x lower cost
Amazon Nova 2 Lite expands its context window to 1 million tokens, introduces extended thinking with developer controls, and adds native tool use and web grounding. AWS claims Nova 2 Lite surpasses Nova 1 Pro on multi-step reasoning while costing 7x less and running up to 5x faster.
OpenAI Python SDK v2.25.0 adds GPT-5.4 support with new tool search and computer control features
OpenAI has released version 2.25.0 of its Python SDK, adding support for GPT-5.4 and introducing a new tool search feature alongside a computer control tool for agent-based automation. The update, released March 5, 2026, also includes API schema refinements and parameter changes affecting the prompt cache and message handling.