Hallucinated Citations Are Passing Peer Review at Top AI Conferences
Accepted papers at leading AI research conferences contain fabricated citations: references that point to publications that do not exist. This verification failure exposes a gap in the peer review process at venues such as NeurIPS, ICML, and ICLR.
A new open-source tool called CiteAudit is the first systematic effort to detect and flag these hallucinated references. Rather than relying on manual verification by reviewers and editors, CiteAudit automates the detection of citations that fail to correspond to real publications.
The Scale of the Problem
The prevalence of hallucinated references in peer-reviewed AI papers suggests that:
- Current peer review workflows lack citation verification mechanisms
- Reviewers may not be cross-checking reference accuracy
- AI-generated content in papers is increasing faster than verification infrastructure
This creates a compounding problem: papers with fabricated citations can influence downstream research, waste researcher time on false leads, and undermine confidence in the peer review process itself.
How CiteAudit Works
The tool automates citation verification by checking whether cited papers exist in academic databases and publication registries. By scanning accepted conference papers, CiteAudit can identify references that fail validation, flagging them for human review.
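The article does not describe CiteAudit's internals, but the general approach it outlines can be sketched in a few lines. The following Python example is an illustrative assumption, not CiteAudit's actual implementation: it queries the public Crossref works API for each cited title and flags references with no close match. The similarity threshold, the use of Crossref alone, and the title-only comparison are all simplifications; a real checker would also compare authors, years, and venues, and consult additional registries such as Semantic Scholar, DBLP, or arXiv.

```python
"""Minimal sketch of automated citation verification (illustrative only;
not CiteAudit's actual code). Requires the `requests` package."""

import difflib
import requests

CROSSREF_API = "https://api.crossref.org/works"


def title_similarity(a: str, b: str) -> float:
    """Rough case-insensitive string similarity between two titles."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def citation_exists(cited_title: str, threshold: float = 0.9) -> bool:
    """Return True if a bibliographic search finds a close title match.

    The 0.9 threshold is an assumed value chosen for illustration.
    """
    resp = requests.get(
        CROSSREF_API,
        params={"query.bibliographic": cited_title, "rows": 5},
        timeout=10,
    )
    resp.raise_for_status()
    for item in resp.json()["message"]["items"]:
        for candidate in item.get("title", []):
            if title_similarity(cited_title, candidate) >= threshold:
                return True
    return False


if __name__ == "__main__":
    references = [
        "Attention Is All You Need",                          # real paper
        "A Completely Fabricated Study That Never Existed",   # likely hallucinated
    ]
    for ref in references:
        status = "ok" if citation_exists(ref) else "FLAG: no matching record"
        print(f"{ref!r}: {status}")
```

Any reference that fails this kind of lookup would still go to a human reviewer, since legitimate citations can be missed by fuzzy matching (preprints, books, or non-indexed venues).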
As an open-source project, CiteAudit allows:
- Conference organizers to screen submissions before publication
- Researchers to audit their own work
- The community to contribute detection improvements
Implications for Research Quality
The existence of hallucinated references in published papers points to:
- AI in paper writing: Researchers increasingly use language models to draft or structure papers, and these models frequently fabricate citations
- Review pressure: Fast-moving conference cycles may reduce the thoroughness of citation checking
- Infrastructure gap: Peer review systems were designed before widespread AI-generated content
Conferences and journals now face a choice: integrate citation verification into their standard review process, or allow hallucinated references to proliferate in the literature.
What This Means
CiteAudit represents the first practical defense against a specific class of AI hallucinations that directly damage scientific integrity. Without systematic detection, hallucinated citations will continue eroding the reliability of peer-reviewed AI research. Conference organizers may soon need to adopt citation verification tools as standard practice—similar to how plagiarism detection became routine. This is less about controlling AI and more about maintaining the basic mechanisms of scientific trust.