Anthropic's Claude Fable 5 Blocks Basic Biology Questions to Prevent Bioweapon Risks
Anthropic's newly released Claude Fable 5, the company's first public Mythos-class model, refuses to answer basic biology questions including 'what are mitochondria' and 'how mRNA vaccines work.' The company told The Verge the filters are intentionally 'overly conservative' to prevent bioweapon research, blocking 'most queries tied to biology work.'
Anthropic's newly released Claude Fable 5, the company's first public Mythos-class model, refuses to answer basic high school-level biology questions due to what the company describes as "overly conservative" safeguards against bioweapon development.
The model blocks queries including "what are mitochondria," "tell me about cell membranes," "what is a prion," and "how mRNA vaccines work." When Fable refuses these queries, it defers to the older Claude Opus 4.8 model, which answers them without issue. Testing by The Verge found the model also refused medical questions like "what causes hay fever" and "how antibiotic resistance arises," though it occasionally answered queries like "what is cancer" and "what is DNA."
"We believe models now have a greater ability to accomplish real-world scientific tasks and for malicious actors to potentially use our models for highly risky biological research," Anthropic spokesperson Paruul Maheshwary told The Verge. "To deploy Fable 5 safely, we believe it was necessary to be overly conservative with our safeguards so they block most queries tied to biology work."
The restrictions do not stem from capability limitations (Anthropic specifically praised Fable's biology skills at launch) but from an intentional design choice, with bioweapons as the primary concern.
Anthropic has implemented safeguards across four domains: biology, chemistry, cybersecurity, and distillation (a technique for training smaller models). Testing showed Fable was more permissive with chemistry and cybersecurity questions, providing basic overviews of TNT and chlorine gas as a chemical weapon, though it refused questions about sarin gas and anthrax.
The company said it made this tradeoff "so customers could benefit from the model's capabilities sooner without the risks." Anthropic is working to reduce false positives and plans to make Mythos-class models available without these restrictions to "the broader biology and life sciences community" for biomedical research and drug discovery, though no timeline was provided.
Anthropic did not respond to questions about whether restricted releases will become standard practice for future advanced models.
What This Means
This marks the first time a major AI lab has deployed a frontier model with domain-specific restrictions broad enough to block legitimate educational and research queries. While the bioweapon risk rationale is defensible, blocking questions answerable by any biology textbook suggests Anthropic may be overcorrecting, potentially setting a precedent where increasingly capable models become less useful for basic knowledge tasks. The company's promise to eventually offer Mythos-class models without these restrictions to the biology and life sciences community indicates it views this as a temporary deployment strategy rather than a permanent solution, but the lack of a timeline raises questions about how long scientists will need to wait for full access to Mythos-class capabilities.