Baidu Releases Qianfan-OCR-Fast Model with 66K Context at $0.68 Per 1M Input Tokens
Baidu has released Qianfan-OCR-Fast, a multimodal model specialized for optical character recognition tasks. The model offers a 66,000 token context window and is priced at $0.68 per 1M input tokens and $2.81 per 1M output tokens.
Qianfan-OCR-Fast — Quick Specs
Baidu Releases Qianfan-OCR-Fast Model with 66K Context at $0.68 Per 1M Input Tokens
Baidu has released Qianfan-OCR-Fast, a multimodal model purpose-built for optical character recognition, with a 66,000 token context window and pricing of $0.68 per 1M input tokens.
Specifications
The model is available through OpenRouter with the following specifications:
- Context window: 66,000 tokens
- Input pricing: $0.68 per 1M tokens
- Output pricing: $2.81 per 1M tokens
- Model type: Multimodal (specialized for OCR)
- Release date: Listed as April 20, 2026 (likely an error; actual release date unclear)
Technical Details
According to Baidu, Qianfan-OCR-Fast was trained on specialized OCR data while maintaining broader multimodal capabilities. The company claims it provides improved performance over its predecessor, Qianfan-OCR, though specific benchmark comparisons were not provided.
The model is designed to handle document understanding, text extraction, and related OCR tasks while retaining general multimodal intelligence for image understanding beyond pure text recognition.
Availability
Qianfan-OCR-Fast is currently available through OpenRouter's API routing service, which automatically selects providers based on prompt requirements and maintains fallback options for uptime. Weekly token usage on the platform stands at 273,000 tokens as of the listing date.
No information about direct API access through Baidu's own infrastructure was disclosed in the announcement.
What This Means
Baidu's OCR-specialized model enters a growing market for document understanding AI, competing with models from OpenAI (GPT-4 Vision), Anthropic (Claude 3), and Google (Gemini). The 66K context window is sufficient for processing lengthy documents in a single request, though it falls short of competitors offering 200K+ contexts. At $0.68 per 1M input tokens, pricing is competitive for specialized OCR tasks, particularly for high-volume document processing workflows where domain-specific optimization may justify the cost over general-purpose vision models.
Related Articles
OpenAI previews GPT-5.6 to select partners with three variants priced from $1 to $30 per million tokens
OpenAI has begun previewing its GPT-5.6 series to a limited group of trusted partners after government review. The release includes three variants: Sol at $5 input/$30 output per million tokens, Terra at $2.50/$15, and Luna at $1/$6.
OpenAI announces GPT-5.6 series with Sol flagship, Terra at 50% cost of GPT-5.5, and Luna budget model
OpenAI has begun a limited preview of its GPT-5.6 series, introducing three models: Sol (flagship), Terra (2x cheaper than GPT-5.5 with competitive performance), and Luna (lowest cost option). The models are launching first with trusted partners before general availability in coming weeks, following U.S. government preview requirements.
OpenAI's ChatGPT 5.6 release restricted to government-approved customers initially
OpenAI will release ChatGPT 5.6 first to customers approved by the federal government, according to a staff memo from CEO Sam Altman. The company plans a broader release "a couple of weeks later," marking a significant departure from typical model rollouts.
DeepSeek-V4-Fable: Offensive Security Model Trained on 80,000 CTF Trajectories Achieves 58.7% Solve Rate
Chunjiang Intelligence has released DeepSeek-V4-Fable, an autonomous agent model designed for offensive security research and CTF challenges. The model, distilled from Claude-5-Fable and built on DeepSeek-V4-Flash, was trained on 80,000 verified CTF trajectories and achieves a 58.7% solve rate across held-out security challenges.
Comments
Loading...