OpenAI claims reasoning model disproved 80-year-old Erdős conjecture in geometry
OpenAI claims its new reasoning model has produced an original mathematical proof disproving a geometry conjecture first posed by Paul Erdős in 1946. The company says this is the first time AI has autonomously solved a prominent open problem central to a field of mathematics, with verification from mathematicians including Thomas Bloom and Noga Alon.
OpenAI says its new general-purpose reasoning model has produced an original mathematical proof disproving a geometry conjecture first posed by mathematician Paul Erdős in 1946.
According to OpenAI, the model discovered "an entirely new family of constructions" that outperform what mathematicians had believed, for nearly 80 years, to be the best possible solutions. The company claims this marks "the first time AI has autonomously solved a prominent open problem central to a field of mathematics."
Verification and Context
This time, the company published supporting statements from multiple mathematicians. That contrasts with its previous claim in October 2025, when former VP Kevin Weil incorrectly stated that GPT-5 had solved 10 Erdős problems; the solutions turned out to already exist in the literature.
Verification came from:
- Noga Alon
- Melanie Wood
- Thomas Bloom, who maintains the Erdős Problems website
Bloom, who previously called Weil's October post "a dramatic misrepresentation," stated: "AI is helping us to more fully explore the cathedral of mathematics we have built over the centuries."
Technical Significance
OpenAI emphasizes the proof came from a general-purpose reasoning model, not a system specifically designed for mathematical problems. The company says this demonstrates AI systems can now "hold together long, difficult chains of reasoning and connect ideas across fields in ways researchers may not have previously explored."
The specific Erdős problem, model name, benchmark performance, and technical details of the proof were not disclosed in the announcement.
What This Means
If it holds up under peer review, the result would be a significant milestone in AI-assisted mathematical research: a move from pattern-matching against existing solutions to genuinely novel discovery. The mathematicians' endorsements and OpenAI's apparent caution after last year's embarrassment strengthen the claim's credibility. However, the lack of technical details, model specifications, and a peer-reviewed publication leaves key questions unanswered. The broader implication, if the claim stands, is that general reasoning models may now be capable of autonomous discovery in physics, biology, and engineering, not just mathematics.