LLM News

Every LLM release, update, and milestone.

research

T2S-Bench benchmark reveals text-to-structure reasoning gap across 45 AI models

Researchers introduced T2S-Bench, a benchmark of 1,800 samples spanning 6 scientific domains and 32 structural types, and used it to evaluate text-to-structure reasoning in 45 mainstream models. The results show substantial capability gaps: average accuracy on multi-hop reasoning tasks is only 52.1%. Structure-of-Thought (SoT) prompting alone yields an average improvement of +5.7% across eight text-processing tasks.