model release

Baidu Releases Unlimited-OCR, a 3B Parameter Document Parsing Model Based on Deepseek-OCR

TL;DR

Baidu has released Unlimited-OCR, a 3 billion parameter model for optical character recognition and document parsing. The model supports single-page and multi-page document processing with a 32,768 token context window and runs on NVIDIA GPUs using bfloat16 precision.

June 22, 2026 · 3:36 PM2 min read

Unlimited-OCR — Quick Specs

Context window33K tokens

Compare Unlimited-OCR with other models →

Baidu Releases Unlimited-OCR, a 3B Parameter Document Parsing Model Based on Deepseek-OCR

Baidu has released Unlimited-OCR, a 3 billion parameter optical character recognition model designed for document parsing. Released on June 22, 2025, the model builds on Deepseek-OCR and is available on Hugging Face.

Technical Specifications

Unlimited-OCR operates with a 32,768 token context window and uses bfloat16 precision on NVIDIA GPUs. The model requires PyTorch 2.10.0, transformers 4.57.1, and CUDA 12.9. According to Baidu, the model is positioned as "pushing Deepseek-OCR one step further" with support for "one-shot long-horizon parsing."

The model offers two processing modes:

Gundam mode: 1024 base size, 640 image size with crop mode enabled for single images
Base mode: 1024 base and image size without cropping for single images and all multi-page documents

Deployment Options

Unlimited-OCR can be deployed via Hugging Face transformers or SGLang server infrastructure. The SGLang deployment requires FlashAttention 3 backend and supports an OpenAI-compatible API with streaming responses.

For multi-page documents and PDFs, the model converts pages to images at 300 DPI before processing. The implementation includes custom logit processors with a 35-token n-gram constraint and configurable window sizes (128 tokens for single images, 1,024 tokens for multi-page documents).

Technical Implementation

The model uses PyMuPDF for PDF-to-image conversion and supports both single-image and multi-page inference. Base64-encoded images are sent to the model with text prompts like "document parsing" or "Multi page parsing." The SGLang server configuration allocates 80% of GPU memory statically and disables overlap scheduling.

Baidu acknowledges Deepseek-OCR, Deepseek-OCR-2, and PaddleOCR in the model documentation. Pricing information has not been disclosed.

What This Means

Unlimited-OCR adds another option to the OCR model landscape, though it remains unclear how performance compares to existing solutions like GPT-4V or Claude 3.5 Sonnet on document understanding tasks. The 3B parameter size suggests efficient inference, but no benchmark scores have been published. The model's value proposition depends on comparative accuracy data that Baidu has not yet provided.

Source: huggingface.co ↗

baidu ocr document-parsing computer-vision open-source deepseek

model releaseJune 21, 2026

Poolside releases Laguna M.1: 225B parameter MoE model scores 74.6% on SWE-bench Verified

Poolside has released Laguna M.1, a 225B total parameter Mixture-of-Experts model with 23B activated parameters per token, designed for agentic coding tasks. The model scores 74.6% on SWE-bench Verified and 63.1% on SWE-bench Multilingual, released under Apache 2.0 license.

model releaseJune 18, 2026

Mistral releases Leanstral, open-source 6B-parameter proof assistant for Lean 4 under Apache 2.0

Mistral AI has released Leanstral, a sparse 120B model with 6B active parameters designed specifically for the Lean 4 proof assistant. The model is available under Apache 2.0 license with free API access and achieves a 26.3 FLTEval score at pass@2, outperforming Claude Sonnet 4.6 while costing $36 versus $549.

model releaseJune 18, 2026

Mistral OCR 3 launches at $2 per 1,000 pages with 74% win rate over previous version

Mistral AI released Mistral OCR 3, a document extraction model priced at $2 per 1,000 pages ($1 with Batch API discount). The model achieves a 74% overall win rate over its predecessor on forms, scanned documents, complex tables, and handwriting according to internal benchmarks.

model releaseJune 18, 2026

Mistral Releases Mistral 3 Family: 675B-Parameter Large 3 MoE and Three Edge Models Under Apache 2.0

Mistral has released Mistral 3, including Mistral Large 3—a sparse mixture-of-experts model with 41B active and 675B total parameters—and three Ministral 3 edge models (3B, 8B, 14B). All models are released under Apache 2.0 license with multimodal capabilities and are available today on multiple platforms.

Baidu Releases Unlimited-OCR, a 3B Parameter Document Parsing Model Based on Deepseek-OCR

Unlimited-OCR — Quick Specs

Baidu Releases Unlimited-OCR, a 3B Parameter Document Parsing Model Based on Deepseek-OCR

Technical Specifications

Deployment Options

Technical Implementation

What This Means

Related Articles

Poolside releases Laguna M.1: 225B parameter MoE model scores 74.6% on SWE-bench Verified

Mistral releases Leanstral, open-source 6B-parameter proof assistant for Lean 4 under Apache 2.0

Mistral OCR 3 launches at $2 per 1,000 pages with 74% win rate over previous version

Mistral Releases Mistral 3 Family: 675B-Parameter Large 3 MoE and Three Edge Models Under Apache 2.0

Comments