Study shows LLMs can fact-check using internal knowledge without external retrieval

A new arXiv paper challenges the dominant retrieval-based fact-checking approach by demonstrating that LLMs can verify factual claims using only their parametric knowledge. The study introduces INTRA, a method leveraging internal model representations that outperforms logit-based approaches and shows robust generalization across long-tail knowledge, multilingual claims, and long-form generation.

Researchers have published a comprehensive study on fact-checking without retrieval, establishing that LLMs can verify claims using only their internal parametric knowledge rather than relying on external data sources.

Current Approach Limitations

Today's fact-checking systems typically retrieve external knowledge and use LLMs to verify claims against that evidence. This approach has fundamental constraints: it fails when retrieval is incomplete or unavailable, depends on external data quality, and underutilizes the models' intrinsic verification capabilities.

The Study's Scope

The researchers tested their approach across:

  • 9 datasets covering diverse claim sources
  • 18 different methods for fact verification
  • 3 LLMs of varying scales

They designed their evaluation framework to stress-test generalization across four critical dimensions: long-tail knowledge (rare facts), claim source variation (human text, web content, model-generated), multilinguality, and long-form generation robustness.

Key Finding: Internal Representations Matter

The experiments revealed that logit-based approaches—which analyze output probability distributions—consistently underperformed compared to methods that directly leverage internal model representations. This insight led to the development of INTRA, a method that exploits interactions between different internal representation layers.

INTRA achieved state-of-the-art performance with strong generalization properties, suggesting that the information needed for fact verification is encoded meaningfully throughout the model's internal structure, not just in final output tokens.
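The contrast between the two families of methods can be sketched in a few lines. This is a toy illustration of the general idea, assuming access to per-layer hidden states; it is not the paper's actual INTRA implementation, whose details live in the paper itself:

```python
# Toy contrast: a logit-based signal (output probabilities only) versus a
# representation-based signal (a probe over internal states from several
# layers). Illustrative only; not the paper's INTRA method.
import math
import random

random.seed(0)

def logit_confidence(final_logits: list[float]) -> float:
    """Logit-based signal: max softmax probability of the output distribution."""
    m = max(final_logits)
    exps = [math.exp(z - m) for z in final_logits]
    return max(exps) / sum(exps)

def representation_score(hidden_states: list[list[float]],
                         probe_w: list[float]) -> float:
    """Representation-based signal: a linear probe over features drawn from
    several layers (here, one pooled vector per layer, concatenated)."""
    features = [x for layer in hidden_states for x in layer]
    s = sum(f * w for f, w in zip(features, probe_w))
    return 1 / (1 + math.exp(-s))  # sigmoid -> probability claim is supported

# Fake data: 3 layers of pooled 4-dim hidden states, a 12-dim probe, 10 logits.
hidden = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
probe = [random.gauss(0, 1) for _ in range(12)]
logits = [random.gauss(0, 1) for _ in range(10)]

print(logit_confidence(logits))             # a value in (0, 1]
print(representation_score(hidden, probe))  # a value in (0, 1)
```

The study's finding is that signals of the second kind, which read intermediate layers rather than only the final output distribution, carry more reliable verification information.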

Practical Implications

The researchers position retrieval-free fact-checking as complementary to existing retrieval-based frameworks rather than a replacement. Key advantages include:

  • Scalability: No dependency on external knowledge bases or retrieval infrastructure
  • Integration flexibility: Can serve as a reward signal during model training or as a component within generation pipelines
  • Reliability: Independent of retrieval system failures
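The integration point above can be sketched as a simple filter over candidate generations. Here `verify` is a hypothetical stand-in for a retrieval-free scorer such as INTRA; the same scores could equally feed an RL-style reward during training:

```python
# Hypothetical sketch of plugging a retrieval-free verifier into a generation
# pipeline. `verify` is a hard-coded stand-in; a real system would score each
# claim from the model's internal representations.

def verify(claim: str) -> float:
    """Stand-in confidence in [0, 1] that the claim is factually supported."""
    return 0.9 if "Paris" in claim else 0.2

def filter_generations(candidates: list[str], threshold: float = 0.5) -> list[str]:
    """Keep only candidates whose verification score clears the bar."""
    return [c for c in candidates if verify(c) >= threshold]

print(filter_generations(["Paris is in France.", "Rome is in Spain."]))
# → ['Paris is in France.']
```

Because no retriever is involved, the filter adds no external infrastructure to the pipeline; its cost is bounded by the verifier's own inference.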

This approach addresses a core requirement for trustworthy agentic AI systems, which need to verify information from diverse sources including human text, web content, and their own outputs.

What This Means

The paper shifts the narrative from "LLMs need external data to verify facts" to "LLMs have latent fact-verification capabilities we're underutilizing." For developers building agentic systems, this suggests that fact-checking can be embedded directly within model inference without external dependencies. The INTRA method provides both a conceptual framework and practical technique for accessing these capabilities. However, this doesn't eliminate the need for retrieval-based fact-checking in high-stakes domains where external verification is crucial—rather, it offers an efficient complementary tool for cases where internal knowledge suffices.