Claude Fable 5: Anthropic releases a safe version of Claude Mythos

Anthropic has released Claude Fable 5, a publicly available version of its powerful but previously restricted Mythos model — complete with a new set of safety guardrails designed to keep its most dangerous capabilities out of the wrong hands. Along with this "safe for general use" model, Anthropic also released Claude Mythos 5, a version of Fable without the safety guardrails, to trusted testing partners.
Earlier this year, Anthropic announced a limited launch of Claude Mythos, a new model with advanced cybersecurity capabilities that Anthropic deemed too dangerous for general release.
The company says Fable 5 is the most capable model it has ever made generally available, leading nearly all tested benchmarks across software engineering, knowledge work, vision, and scientific research. The more complex the task, Anthropic says, the wider Fable 5's edge over its previous models and competitors.
Fable 5 shares its underlying architecture with Claude Mythos 5, the restricted version shared with cybersecurity partners through Project Glasswing, but ships with classifiers that intercept sensitive queries and route them to Claude Opus 4.8 instead. The restricted categories include cybersecurity, biology, and chemistry, as well as attempts to distill the model's capabilities for use in competing systems.
Anthropic says fewer than five percent of sessions trigger a fallback, though it acknowledges the system is tuned conservatively and will occasionally flag benign requests.
How to try Claude Fable 5
Fable 5 is available today across all Claude plans and via the API using the model string claude-fable-5. It is priced at $10 per million input tokens and $50 per million output tokens — less than half the cost of Claude Mythos Preview. Subscription plan users get access at no extra cost through June 22, after which usage credits will be required.
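To make the quoted pricing concrete, here is a minimal sketch of the arithmetic in Python. It assumes only the two per-token prices stated above; the helper name estimate_cost_usd and the token counts are hypothetical and chosen for illustration, and the code does not call any Anthropic API.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 10
OUTPUT_USD_PER_MILLION = 50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request.

    Hypothetical helper for illustration; not part of any Anthropic SDK.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative token counts, not measurements from a real session.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")  # prints $4.50
```

At these rates, output tokens cost five times as much as input tokens, so a request that generates a lot of text will likely cost more than one that mostly reads it.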
Benchmarks
In agentic coding evaluations, Fable 5 outpaced GPT-5.5 and Claude Opus 4.8 by significant margins, according to Anthropic. The company's data shows that Fable 5 even outperforms Claude Mythos on some key benchmarks.
In a blog post, Anthropic wrote that fintech company Stripe, which had early access to Fable 5, reported that the model completed a full migration of a 50-million-line Ruby codebase in a single day. Anthropic estimated that this work would have taken a full engineering team more than two months.
Fable 5, Mythos 5, and safety
The safety story here is genuinely complicated. Anthropic spent months warning that Mythos-class models were too dangerous for general release. As recently as May, the company publicly acknowledged that adequate safeguards didn't yet exist, per prior Mashable reporting.
Fable 5 is its answer to that problem, but the company's own disclosures suggest the solution is still a work in progress. An external bug bounty ran more than 1,000 hours of testing without producing a universal jailbreak — but the UK AI Safety Institute made early inroads toward one in a brief initial window. Anthropic frames that as acceptable risk. Others may disagree.
The Fable 5 system card states that the model has similar performance to Claude Opus 4.8 and other recent models on misaligned behaviors such as hallucination, dishonesty, and sycophancy.
from Mashable https://ift.tt/cDu7i6s