LLM News

Every LLM release, update, and milestone.

research

FreeAct framework relaxes quantization constraints for multimodal and diffusion LLMs

Researchers propose FreeAct, a quantization framework that abandons static one-to-one transformation constraints to handle dynamic activation patterns in multimodal and diffusion LLMs. The method assigns token-specific transformation matrices to activations while keeping weights unified, demonstrating up to 5.3% performance improvements over existing approaches.

March 6, 2026 · 5:05 AM · 2 min read

quantization efficiency multimodal-llms

via arxiv.org

research

MLLMs can replace OCR for document extraction, large-scale study finds

A large-scale benchmarking study compares multimodal large language models (MLLMs) against traditional OCR-enhanced pipelines for document information extraction and finds that image-only inputs can match the OCR-enhanced pipelines closely. The research evaluates multiple out-of-the-box MLLMs on business documents and proposes an automated hierarchical error analysis framework that uses LLMs to diagnose failure modes.

March 5, 2026 · 1:39 AM · 2 min read

document-extraction multimodal-llms ocr

via arxiv.org

research

Study questions whether OCR is still necessary for document extraction with modern MLLMs

A large-scale benchmarking study finds that modern multimodal large language models (MLLMs) can extract information from business documents nearly as well as traditional OCR+MLLM pipelines. The research introduces an automated error analysis framework and suggests that careful schema design and prompt engineering can further close the performance gap.

March 5, 2026 · 1:22 AM · 2 min read

multimodal-llms document-extraction ocr

via arxiv.org