Study reveals preference leakage bias when LLMs judge synthetically trained models
A new arXiv paper identifies preference leakage, a contamination problem in LLM-based evaluation in which a judge model systematically favors student models trained on data it synthesized. The researchers confirm the bias across multiple model families and benchmarks and find that it is harder to detect than previously identified LLM judge biases.
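To make the finding concrete, here is a minimal sketch of how such a bias could be quantified: compare the win rate a judge assigns to the student model trained on its own synthetic data against the win rates other judges assign that same student. The judge names, data layout, and the simple win-rate gap below are illustrative assumptions, not the paper's exact metric or code.

```python
def win_rate(judgments):
    """Fraction of pairwise comparisons the student won.

    `judgments` is a list of 1 (student won) / 0 (student lost);
    ties could be counted as 0.5 upstream.
    """
    return sum(judgments) / len(judgments)

def preference_leakage_score(results):
    """Estimate a leakage score for each student model.

    `results[judge][student]` is a list of pairwise outcomes (1 = that
    student beat a fixed opponent under that judge, 0 = lost). Each
    student is assumed to be trained on synthetic data from the judge
    it is named after. The score is the gap between the related judge's
    win rate for its own student and the mean win rate the remaining
    judges assign the same student -- a simple proxy, not necessarily
    the paper's definition.
    """
    scores = {}
    judges = list(results)
    for judge in judges:
        student = judge  # student named after its data-generating judge
        own = win_rate(results[judge][student])
        others = [win_rate(results[other][student])
                  for other in judges if other != judge]
        scores[student] = own - sum(others) / len(others)
    return scores

# Hypothetical outcomes: each judge scored both students on the same prompts.
results = {
    "judge_a": {"judge_a": [1, 1, 1, 0, 1], "judge_b": [0, 1, 0, 0, 1]},
    "judge_b": {"judge_a": [0, 1, 0, 1, 0], "judge_b": [1, 1, 0, 1, 1]},
}
print(preference_leakage_score(results))
```

A score well above zero for a student means its related judge rates it noticeably higher than the other judges do, which is the signature the paper's experiments detect across model families.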