Baidu Launches CoBuddy Code Generation Model with 131K Context Window, Free on OpenRouter
Baidu has released CoBuddy, a model optimized for code generation and AI agent workflows. It features a 131K token context window, up to 65K output tokens, and is served with fp8 quantization, with native support for tool calling and reasoning.
Baidu has released CoBuddy, a code generation model from its Qianfan platform that is now available for free through OpenRouter. The model was released on May 6, 2026.
Technical Specifications
CoBuddy is served with fp8 quantization and supports a 131,072-token context window with up to 65,536 output tokens. According to Baidu, the model is optimized for high inference throughput and low end-to-end latency.
The model includes native support for tool calling and reasoning capabilities, positioning it for AI agent workflows. OpenRouter's implementation supports reasoning-enabled features that can show step-by-step thinking processes through a reasoning_details array in API responses.
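When reasoning is enabled, the step-by-step thinking arrives alongside the answer in that `reasoning_details` array. A minimal sketch of pulling it out of a chat-completion response might look like this; the exact field names inside each entry (here, `text`) are assumed from OpenRouter's general response shape, not confirmed for this model:

```python
def extract_reasoning(response: dict) -> list[str]:
    """Collect reasoning text from a chat-completion response's
    reasoning_details array, if the provider returned one."""
    texts = []
    for choice in response.get("choices", []):
        details = choice.get("message", {}).get("reasoning_details") or []
        for detail in details:
            # Each entry is assumed to carry its content under "text".
            if "text" in detail:
                texts.append(detail["text"])
    return texts
```

Responses without reasoning simply yield an empty list, so the same parsing path works whether or not the feature was requested.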
Pricing and Availability
Baidu is offering CoBuddy at zero cost through OpenRouter, with $0 per million input tokens and $0 per million output tokens. The model is accessible through OpenRouter's unified API, which normalizes requests across multiple providers.
OpenRouter routes requests to available providers with automatic fallbacks for uptime optimization. The service is compatible with the OpenAI SDK and also offers its own native SDK.
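Because the API is OpenAI-compatible, calling the model reduces to POSTing a standard chat-completion payload to OpenRouter's endpoint. The sketch below only builds that payload; the model slug `baidu/cobuddy` is a placeholder guess, so check OpenRouter's model list for the real identifier before sending anything:

```python
import json

# OpenRouter's OpenAI-compatible chat completions endpoint.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt: str, model: str = "baidu/cobuddy") -> dict:
    """Assemble an OpenAI-style chat-completion request body.

    "baidu/cobuddy" is a hypothetical slug used for illustration;
    substitute the identifier shown on the model's OpenRouter page.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

# The body would be sent as JSON with an Authorization: Bearer <key> header.
payload = json.dumps(build_request("Write a binary search in Python."))
```

The same body works unchanged through the OpenAI SDK by pointing its `base_url` at `https://openrouter.ai/api/v1`.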
Target Use Cases
Baidu positions CoBuddy specifically for:
- Code generation tasks
- AI agent workflows requiring tool calling
- Applications needing reasoning transparency
- Development scenarios requiring large context windows
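For the tool-calling use case, OpenRouter's unified API accepts the OpenAI-style `tools` schema. A minimal sketch of declaring one tool for an agent loop follows; the tool name `run_tests` and its parameters are invented for illustration, not part of anything Baidu or OpenRouter ships:

```python
def make_tools() -> list[dict]:
    """Declare one function tool in the OpenAI-compatible schema
    that would accompany a chat-completion request."""
    return [
        {
            "type": "function",
            "function": {
                "name": "run_tests",  # hypothetical tool for a coding agent
                "description": "Run the project's test suite and return results.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "path": {
                            "type": "string",
                            "description": "Test file or directory to run.",
                        }
                    },
                    "required": ["path"],
                },
            },
        }
    ]
```

The model would respond with a `tool_calls` entry naming the function and its JSON arguments, which the agent executes before sending the result back as a `tool` message.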
The fp8 quantization suggests Baidu is prioritizing inference efficiency, though the company has not disclosed benchmark scores or a parameter count for the model.
What This Means
Baidu's decision to offer CoBuddy for free on OpenRouter represents a direct entry into the competitive code generation market dominated by models like Anthropic's Claude and OpenAI's GPT-4. The 131K context window and 65K output capacity are substantial, though not unprecedented—several recent models support similar or larger windows. The lack of disclosed benchmark scores makes it difficult to assess CoBuddy's actual capabilities relative to established code models. The free pricing suggests Baidu is prioritizing adoption and data collection over immediate monetization, a strategy common for new entrants seeking to establish market presence.
Related Articles
OpenRouter Launches Owl Alpha: Free Foundation Model for Agentic Workflows with 1M Context
OpenRouter has released Owl Alpha, a foundation model specifically designed for agentic workloads with native tool use support and a 1,048,576 token context window. The model is currently free for both input and output tokens and is compatible with Claude Code, OpenClaw, and other productivity tools.
NVIDIA releases Nemotron-3-Nano-Omni-30B, a 31B-parameter multimodal model with 256K context and reasoning mode
NVIDIA released Nemotron-3-Nano-Omni-30B-A3B, a multimodal large language model with 31 billion parameters that processes video, audio, images, and text with up to 256K token context. The model uses a Mamba2-Transformer hybrid Mixture of Experts architecture and supports chain-of-thought reasoning mode.
Mistral Releases Medium 3.5: 128B Dense Model With 256k Context and Configurable Reasoning
Mistral AI released Mistral Medium 3.5, a 128B parameter dense model with a 256k context window that unifies instruction-following, reasoning, and coding capabilities. The model features configurable reasoning effort per request and a vision encoder trained from scratch for variable image sizes.