VC-STaR: Researchers use visual contrast to reduce hallucinations in VLM reasoning
Researchers propose the Visual Contrastive Self-Taught Reasoner (VC-STaR), a self-improving framework that targets a fundamental challenge in vision-language models (VLMs): hallucination in visual reasoning. The approach uses contrastive VQA pairs (visually similar images posed the same question) to train VLMs to identify the relevant visual cues and generate more accurate reasoning paths.
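To make the idea concrete, here is a minimal, hypothetical sketch of how contrastive VQA pairs could drive a STaR-style self-training filter. All names, the data layout, and the assumption that the two images yield different correct answers are illustrative, not taken from the paper; the toy "VLM" is a lookup table standing in for a real model:

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass
class ContrastiveVQAPair:
    """Two visually similar images sharing one question. We assume
    (for illustration) the correct answers differ, so a rationale
    must attend to the distinguishing visual cue to get both right."""
    image_a: str   # image identifier (placeholder for pixel data)
    image_b: str
    question: str
    answer_a: str  # ground-truth answer for image_a
    answer_b: str  # ground-truth answer for image_b

def filter_rationales(
    pair: ContrastiveVQAPair,
    sample: Callable[[str, str], Tuple[str, str]],
    n_samples: int = 4,
) -> List[Tuple[str, str]]:
    """STaR-style filtering: sample (rationale, answer) for each image
    and keep only rationale pairs that answer BOTH images correctly.
    Kept rationales become fine-tuning data for the next round."""
    kept = []
    for _ in range(n_samples):
        rat_a, ans_a = sample(pair.image_a, pair.question)
        rat_b, ans_b = sample(pair.image_b, pair.question)
        if ans_a == pair.answer_a and ans_b == pair.answer_b:
            kept.append((rat_a, rat_b))
    return kept

# Toy stand-in for a VLM: maps (image, question) -> (rationale, answer).
_toy_vlm = {
    ("img_red_mug",  "What color is the mug?"): ("the mug is red",  "red"),
    ("img_blue_mug", "What color is the mug?"): ("the mug is blue", "blue"),
}

pair = ContrastiveVQAPair("img_red_mug", "img_blue_mug",
                          "What color is the mug?", "red", "blue")
kept = filter_rationales(pair, lambda img, q: _toy_vlm[(img, q)], n_samples=2)
# Both sampled rationale pairs answer correctly, so both are kept.
```

The contrastive pairing is the key lever: a rationale that ignores the image (a common hallucination mode) cannot answer both near-identical images correctly, so such rationales are filtered out of the self-training data.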