LLM News

Every LLM release, update, and milestone.

Filtered by: information-extraction
research

T2S-Bench benchmark reveals text-to-structure reasoning gap across 45 AI models

Researchers introduced T2S-Bench, a new benchmark with 1,800 samples across 6 scientific domains and 32 structural types, and used it to evaluate text-to-structure reasoning in 45 mainstream models. The benchmark reveals substantial capability gaps: average accuracy on multi-hop reasoning tasks is only 52.1%. Structure-of-Thought (SoT) prompting alone yields a +5.7% average improvement across eight text-processing tasks.

research

MLLMs can replace OCR for document extraction, large-scale study finds

A large-scale benchmarking study comparing multimodal large language models (MLLMs) against traditional OCR-enhanced pipelines for document information extraction finds that image-only inputs can achieve performance comparable to the OCR-enhanced approaches. The research evaluates multiple out-of-the-box MLLMs on business documents and proposes an automated, LLM-based hierarchical error analysis framework to diagnose failure modes.