Reasoning models fail at theory of mind tasks despite math excellence
A systematic study of nine advanced language models finds that large reasoning models—those designed to excel at step-by-step math and coding—merely match, and often underperform, non-reasoning models on theory of mind tasks. The research identifies a critical weakness: longer reasoning chains actively degrade social reasoning performance, suggesting that current reasoning architectures do not transfer to socio-cognitive skills.