LLM News | TPS

research

First benchmark for personalized deep research agents reveals gaps in current AI systems

Researchers have introduced PDR-Bench, the first benchmark designed specifically to evaluate personalization in Deep Research Agents (DRAs). It pairs 50 research tasks spanning 10 domains with 25 authentic user profiles, yielding 250 realistic user-task queries rather than all 1,250 possible task-profile combinations. Evaluations on these queries expose limitations in how current AI systems adapt to individual user contexts.

March 5, 2026 · 5:37 AM · 2 min read

deep-research-agents benchmarking personalization

via arxiv.org