Baidu Releases Qianfan-OCR-Fast Model with 66K Context at $0.68 Per 1M Input Tokens
Baidu has released Qianfan-OCR-Fast, a multimodal model specialized for optical character recognition tasks. The model offers a 66,000 token context window and is priced at $0.68 per 1M input tokens and $2.81 per 1M output tokens.
Specifications
The model is available through OpenRouter with the following specifications:
- Context window: 66,000 tokens
- Input pricing: $0.68 per 1M tokens
- Output pricing: $2.81 per 1M tokens
- Model type: Multimodal (specialized for OCR)
- Release date: Listed as April 20, 2026 (likely an error; actual release date unclear)
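Requests to OpenRouter-hosted multimodal models use the OpenAI-compatible chat-completions format, with text and image parts in a single user message. The sketch below builds such a payload for an OCR prompt; the model slug `baidu/qianfan-ocr-fast` is an assumption for illustration, not a confirmed identifier from the listing.

```python
# Sketch of an OCR request payload for OpenRouter's OpenAI-compatible
# chat-completions endpoint. The model slug is a hypothetical guess;
# check the OpenRouter listing for the actual identifier.
MODEL_SLUG = "baidu/qianfan-ocr-fast"  # assumed, not confirmed

def build_ocr_request(image_url: str,
                      prompt: str = "Extract all text from this document.") -> dict:
    """Build a chat-completions payload with one text part and one image part."""
    return {
        "model": MODEL_SLUG,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_ocr_request("https://example.com/scanned-invoice.png")
```

The same payload would be POSTed to OpenRouter's `/api/v1/chat/completions` endpoint with an API key; only the payload construction is shown here.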
Technical Details
According to Baidu, Qianfan-OCR-Fast was trained on specialized OCR data while maintaining broader multimodal capabilities. The company claims it provides improved performance over its predecessor, Qianfan-OCR, though specific benchmark comparisons were not provided.
The model is designed to handle document understanding, text extraction, and related OCR tasks while retaining general multimodal intelligence for image understanding beyond pure text recognition.
Availability
Qianfan-OCR-Fast is currently available through OpenRouter's API routing service, which automatically selects providers based on prompt requirements and maintains fallback options for uptime. Weekly token usage on the platform stands at 273,000 tokens as of the listing date.
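OpenRouter's fallback behavior can also be expressed per request as an ordered list of models, tried in sequence if the first is unavailable. A minimal sketch, assuming illustrative model slugs that are not confirmed by the announcement:

```python
# OpenRouter accepts an ordered "models" list in place of a single
# "model"; if the first entry is unavailable, the request falls back
# to the next. Both slugs below are illustrative assumptions.
def with_fallbacks(payload: dict, fallbacks: list[str]) -> dict:
    """Return a copy of the payload routed through an ordered fallback list."""
    routed = dict(payload)
    routed["models"] = [routed.pop("model")] + list(fallbacks)
    return routed

base = {"model": "baidu/qianfan-ocr-fast", "messages": []}
routed = with_fallbacks(base, ["some-provider/general-vision-model"])
```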
No information about direct API access through Baidu's own infrastructure was disclosed in the announcement.
What This Means
Baidu's OCR-specialized model enters a growing market for document-understanding AI, competing with general-purpose vision models from OpenAI (GPT-4 Vision), Anthropic (Claude 3), and Google (Gemini). The 66K context window is enough to process lengthy documents in a single request, though it falls short of competitors offering 200K+ contexts. At $0.68 per 1M input tokens, pricing is competitive for specialized OCR work, particularly in high-volume document-processing pipelines where domain-specific optimization may justify choosing it over a general-purpose vision model.
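At the listed rates, per-request cost is straightforward arithmetic; the sketch below estimates the dollar cost of a single OCR request from the input and output token counts.

```python
# Cost estimate from the listed rates: $0.68 per 1M input tokens,
# $2.81 per 1M output tokens.
INPUT_RATE = 0.68 / 1_000_000   # dollars per input token
OUTPUT_RATE = 2.81 / 1_000_000  # dollars per output token

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-token rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 60K-token document scan producing a 4K-token extraction:
cost = estimate_cost(60_000, 4_000)  # ≈ $0.052
```

At that rate, even a near-full-context request with a sizable extraction stays around a nickel, which is where the high-volume document-processing argument comes from.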