OCR
5 articles tagged with OCR
AWS launches Amazon Bedrock Data Automation for financial document processing with custom blueprint system
Amazon Web Services released Amazon Bedrock Data Automation (BDA), a foundation model-powered service designed to extract and validate structured data from financial documents. The service uses custom blueprints to process bank statements, W-2 tax forms, 1099-B forms, and vendor contracts, offering what AWS claims is industry-leading accuracy at lower cost than using foundation models directly.
Baidu Releases Qianfan-OCR-Fast Model with 66K Context at $0.68 Per 1M Input Tokens
Baidu has released Qianfan-OCR-Fast, a multimodal model specialized for optical character recognition tasks. The model offers a 66,000-token context window and is priced at $0.68 per 1M input tokens and $2.81 per 1M output tokens.
Perceptron Launches Mk1 Vision-Language Model with Video Reasoning at $0.15/$1.50 per 1M Tokens
Perceptron has released Perceptron Mk1, a vision-language model designed for video understanding and embodied reasoning tasks. The model accepts image and video inputs with a 33K context window, is priced at $0.15 per 1M input tokens and $1.50 per 1M output tokens, and supports structured spatial annotations on demand.
NVIDIA releases Nemotron-3-Nano-Omni-30B, a 31B-parameter multimodal model with 256K context and reasoning mode
NVIDIA released Nemotron-3-Nano-Omni-30B-A3B, a multimodal large language model with 31 billion parameters that processes video, audio, images, and text with a context window of up to 256K tokens. The model uses a Mamba2-Transformer hybrid Mixture of Experts architecture and supports a chain-of-thought reasoning mode.
Baidu Releases Free Qianfan-OCR-Fast Model with 65K Context Window
Baidu has released Qianfan-OCR-Fast, a specialized OCR model with a 65,536-token context window, available at zero cost through OpenRouter. The model launched on April 20, 2026, and is positioned as a performance upgrade over the original Qianfan-OCR.
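For readers comparing the per-1M-token prices quoted in the listings above, the following is a minimal sketch of how such prices translate into the cost of a single request. The dictionary keys and the `request_cost` helper are hypothetical names chosen for illustration, not identifiers from any provider's API; only the dollar figures come from the summaries.

```python
from decimal import Decimal

# USD per 1M tokens (input, output), as quoted in the summaries above.
# The keys are illustrative labels, not official model identifiers.
PRICES = {
    "qianfan-ocr-fast": (Decimal("0.68"), Decimal("2.81")),
    "perceptron-mk1": (Decimal("0.15"), Decimal("1.50")),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed per-1M-token prices."""
    in_price, out_price = PRICES[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)


if __name__ == "__main__":
    # Example: a 50,000-token document producing 2,000 output tokens.
    print(request_cost("qianfan-ocr-fast", 50_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when summed over many requests.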