Baidu Launches CoBuddy Code Generation Model with 131K Context Window, Free on OpenRouter
Baidu has released CoBuddy, a code generation model optimized for coding tasks and AI agent workflows. The model features a 131K token context window, up to 65K output tokens, and runs on fp8 quantization with native support for tool calling and reasoning.
Baidu Qianfan: CoBuddy — Quick Specs
Baidu Launches CoBuddy Code Generation Model with 131K Context Window
Baidu has released CoBuddy, a code generation model from its Qianfan platform that is now available for free through OpenRouter. The model was released on May 6, 2026.
Technical Specifications
CoBuddy runs on fp8 quantization and supports a 131,072 token context window with up to 65,536 output tokens. According to Baidu, the model is optimized for high inference throughput and low end-to-end latency.
The model includes native support for tool calling and reasoning capabilities, positioning it for AI agent workflows. OpenRouter's implementation supports reasoning-enabled features that can show step-by-step thinking processes through a reasoning_details array in API responses.
Pricing and Availability
Baidu is offering CoBuddy at zero cost through OpenRouter, with $0 per million input tokens and $0 per million output tokens. The model is accessible through OpenRouter's unified API, which normalizes requests across multiple providers.
OpenRouter routes requests to available providers with automatic fallbacks for uptime optimization. The service supports both OpenAI SDK compatibility and OpenRouter's native SDK.
Target Use Cases
Baidu positions CoBuddy specifically for:
- Code generation tasks
- AI agent workflows requiring tool calling
- Applications needing reasoning transparency
- Development scenarios requiring large context windows
The fp8 quantization suggests Baidu is prioritizing inference efficiency, though the company has not disclosed benchmark scores or parameter count for the model.
What This Means
Baidu's decision to offer CoBuddy for free on OpenRouter represents a direct entry into the competitive code generation market dominated by models like Anthropic's Claude and OpenAI's GPT-4. The 131K context window and 65K output capacity are substantial, though not unprecedented—several recent models support similar or larger windows. The lack of disclosed benchmark scores makes it difficult to assess CoBuddy's actual capabilities relative to established code models. The free pricing suggests Baidu is prioritizing adoption and data collection over immediate monetization, a strategy common for new entrants seeking to establish market presence.
Related Articles
Cohere releases North Mini Code, a 30B-parameter sparse MoE coding model with 256K context window, free on OpenRouter
Cohere has released North Mini Code, the first model in its North family and its first agentic coding model. The sparse mixture-of-experts architecture features 30B total parameters with 3B active, a 256K-token context window, and up to 64K tokens of output, available free via OpenRouter under Apache 2.0 license.
Amazon Bedrock adds Gemma 4 models with 256K context and built-in reasoning mode
Amazon Web Services today announced availability of Google DeepMind's Gemma 4 family on Amazon Bedrock. The open-weight models include three instruction-tuned variants spanning 2.3B to 30.7B parameters, with 256K context windows, multimodal input support, and built-in reasoning mode.
US government forces Anthropic to pull Fable 5 and Mythos 5 models over guardrail bypass concerns
The US government forced Anthropic to withdraw its Fable 5 and Mythos 5 models, citing national security concerns after Amazon researchers allegedly discovered a method to bypass Fable 5's safety guardrails. Cybersecurity researchers have signed an open letter opposing the ban, with Anthropic noting similar vulnerabilities exist in competing models.
Mistral Releases Voxtral TTS: 4B Parameter Text-to-Speech Model at $0.016 per 1k Characters
Mistral AI has released Voxtral TTS, a 4B parameter text-to-speech model supporting 9 languages including English, French, German, Spanish, Dutch, Portuguese, Italian, Hindi, and Arabic. The model achieves 70ms latency for typical inputs and can clone voices from as little as 3 seconds of audio, priced at $0.016 per 1,000 characters.
Comments
Loading...