accuracy

2 articles tagged with accuracy

April 7, 2026
benchmark

Google AI Overviews reach 91% accuracy with Gemini 3, but 56% of answers lack verifiable sources

An independent study by AI startup Oumi found that Google's AI Overviews answered correctly 91% of the time with Gemini 3, up from 85% with Gemini 2, based on 4,326 searches using the SimpleQA benchmark. However, 56% of the correct Gemini 3 answers could not be verified through the linked sources, up from 37% with Gemini 2. At Google's scale, a 9% error rate still translates to millions of wrong answers per hour.
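
The arithmetic behind these figures can be sketched in a few lines of Python. This is a rough illustration, not the study's method: the function names are hypothetical, the counts are approximate because the published percentages are rounded, and the hourly query volume is an assumed placeholder that does not come from the study.

```python
def wrong_answers(total_queries: int, accuracy: float) -> float:
    """Expected number of incorrect answers at a given accuracy."""
    return total_queries * (1.0 - accuracy)


def unverifiable_correct(total_queries: int, accuracy: float,
                         unverifiable_share: float) -> float:
    """Expected number of correct answers that the linked sources do not back up."""
    return total_queries * accuracy * unverifiable_share


STUDY_SEARCHES = 4326          # sample size reported for the study
GEMINI_3_ACCURACY = 0.91       # reported accuracy with Gemini 3
UNVERIFIABLE_SHARE = 0.56      # share of correct answers lacking verifiable sources

# Approximate counts within the study sample (percentages are rounded).
sample_wrong = round(wrong_answers(STUDY_SEARCHES, GEMINI_3_ACCURACY))
sample_unverifiable = round(
    unverifiable_correct(STUDY_SEARCHES, GEMINI_3_ACCURACY, UNVERIFIABLE_SHARE)
)

# ASSUMPTION: a placeholder traffic figure chosen only to show the scaling.
# It is NOT a number from the study or from Google.
ASSUMED_QUERIES_PER_HOUR = 100_000_000
hourly_wrong = round(wrong_answers(ASSUMED_QUERIES_PER_HOUR, GEMINI_3_ACCURACY))

print(f"Wrong answers in sample: ~{sample_wrong}")
print(f"Correct but unverifiable in sample: ~{sample_unverifiable}")
print(f"Wrong answers per hour at assumed volume: ~{hourly_wrong:,}")
```

Under these inputs the sample contains roughly 389 wrong answers, and any hourly volume in the tens of millions or more puts the 9% error rate into the millions per hour.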

February 28, 2026
benchmark, OpenAI

Frontier LLMs lose up to 33% accuracy in long conversations, study finds

Frontier language models, including GPT-5.2 and Claude 4.6, lose up to 33% accuracy as conversations lengthen, according to new research. The finding suggests that extended context within a single conversation degrades performance even in state-of-the-art models.