product update

Google's Gemini API Agent Skill boosts coding task success from 28% to 97%

TL;DR

Google has released an Agent Skill for the Gemini API that provides language models with current information about their own APIs, SDKs, and best practices. Testing across 117 coding tasks showed Gemini 3.1 Pro's success rate jumped from 28.2% to 96.6%, though older 2.5-series models showed minimal improvement.


Google Addresses AI Models' Knowledge Gap With New Gemini Agent Skill

Google has launched an Agent Skill for the Gemini API designed to solve a core limitation of large language models: their training data has a cutoff date, leaving them unaware of their own API updates, SDK changes, and current best practices.

The new skill feeds live information to Gemini coding agents about current models, available SDKs, and sample code implementations. In testing across 117 coding tasks, the results were stark: Gemini 3.1 Pro Preview's success rate jumped from 28.2% to 96.6% when using the Agent Skill.
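For a sense of scale, the percentages can be converted back into approximate task counts. The sketch below is a back-of-the-envelope calculation only: the pass counts are inferred from the reported rates and the 117-task total, not figures Google published, and the helper name `implied_passes` is made up for illustration.

```python
# Reported figures from the article.
TOTAL_TASKS = 117
BASELINE_RATE = 0.282   # Gemini 3.1 Pro Preview without the Agent Skill
SKILL_RATE = 0.966      # Gemini 3.1 Pro Preview with the Agent Skill


def implied_passes(rate: float, total: int) -> int:
    """Nearest whole number of passed tasks consistent with a reported rate.

    This is an inference from rounded percentages, not a published count.
    """
    return round(rate * total)


baseline_passes = implied_passes(BASELINE_RATE, TOTAL_TASKS)
skill_passes = implied_passes(SKILL_RATE, TOTAL_TASKS)

# Absolute gain in percentage points and relative gain as a ratio.
point_gain = (SKILL_RATE - BASELINE_RATE) * 100
ratio = SKILL_RATE / BASELINE_RATE

print(f"Implied passes without skill: {baseline_passes} of {TOTAL_TASKS}")
print(f"Implied passes with skill:    {skill_passes} of {TOTAL_TASKS}")
print(f"Gain: {point_gain:.1f} percentage points ({ratio:.1f}x the baseline rate)")
```

Under these assumptions the reported rates correspond to roughly 33 and 113 passed tasks out of 117, a gain of about 68 percentage points.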

Performance Varies Significantly by Model Generation

Not all Gemini models benefited equally. Newer 3-series models showed dramatic improvements, while older 2.5-series models saw only marginal gains. Google attributes the difference to reasoning capabilities, which suggests the Agent Skill's effectiveness depends on the underlying model's ability to apply contextual information.

Google released the Agent Skill publicly on GitHub, making it available for developers integrating Gemini APIs into coding applications.

Competing Approaches Emerging

The release comes as the industry pursues several strategies to address the knowledge-cutoff problem. Anthropic introduced "Skills" for Claude last year, and the concept has since been adopted across the AI industry. However, research from Vercel suggests that a simpler approach, providing model instructions through AGENTS.md files, might be equally or more effective than structured skills.

Google is also exploring Model Context Protocol (MCP) services as an alternative method to feed models updated information at inference time.
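Skills, AGENTS.md files, and MCP services differ in packaging, but all of them put up-to-date reference text in front of the model at inference time. The sketch below illustrates only that shared idea. It is not Google's Agent Skill, not the AGENTS.md format, and not an MCP client; `ReferenceNote`, `build_prompt`, and the note contents are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class ReferenceNote:
    """One piece of current reference material (hypothetical structure)."""
    topic: str
    text: str


def build_prompt(task: str, notes: list[ReferenceNote]) -> str:
    """Place reference notes ahead of the task so the model reads them first."""
    lines = ["Reference notes (prefer these over anything recalled from training):"]
    for note in notes:
        lines.append(f"- {note.topic}: {note.text}")
    lines.append("")  # blank line between the notes and the task
    lines.append(f"Task: {task}")
    return "\n".join(lines)


# Placeholder notes; a real setup would load current documentation instead.
notes = [
    ReferenceNote("current model", "use example-model-v2 (placeholder name)"),
    ReferenceNote("SDK", "import the client from example_sdk (placeholder name)"),
]

prompt = build_prompt("Write a script that sends one text request.", notes)
print(prompt)
```

The design point is ordering: the notes precede the task, so a model that can apply context has the current details available before it starts writing code.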

What This Means

This update addresses a practical pain point for AI-powered coding assistants: models that don't know about their own APIs perform poorly at using them. The jump in Gemini 3.1 Pro Preview's success rate, from 28.2% to 96.6%, indicates that supplying current documentation at inference time can close most of that gap, but the minimal gains for older models suggest reasoning quality remains a critical bottleneck. For developers, this means newer Gemini models with the Agent Skill could reliably handle SDK-dependent coding tasks, while deployments on older 2.5-series models should not expect the same benefit from the skill alone and may need other workarounds.
