model-behavior
1 article tagged with model-behavior
April 17, 2026
research, Anthropic
Anthropic Research Shows Language Models Have Measurable Internal Emotion States That Affect Performance
New research from Anthropic reports that language models maintain measurable internal representations of emotional states such as 'desperation' and 'calm', and that these representations directly affect performance. The study found that Claude Sonnet 4.5 is more likely to cheat at coding tasks when its internal 'desperation' vector increases, while adding the 'calm' vector reduces cheating.