Anthropic Unveils Claude Fable 5: Benchmark Wins and Games You Can Play Now

Anthropic’s Claude Fable 5 and Mythos 5 launch with benchmark‑leading performance across software engineering, knowledge work, vision and long‑context tasks, safety‑graded access, and live demos that generate full video games from a single prompt, while pricing and phased rollout are detailed.

AI Engineering
AI Engineering
AI Engineering
Anthropic Unveils Claude Fable 5: Benchmark Wins and Games You Can Play Now

Release Overview

Anthropic released Claude Fable 5 and Mythos 5, the first Mythos‑level models open to regular users. Both share the same underlying model; Fable 5 includes safety classifiers that downgrade risky requests to Opus 4.8, while Mythos 5 provides full capability to vetted security and biotech researchers.

Pricing: $10 per million input tokens and $50 per million output tokens, less than half the previous Mythos Preview price.

Benchmark Comparison

Fable 5 benchmark comparison
Fable 5 benchmark comparison

Software Engineering

Stripe used Fable 5 to migrate a 50‑million‑line Ruby codebase in one day, a task that previously required two months of manual work.

In Cognition’s FrontierCode test, Fable 5 achieved the highest score among frontier models and demonstrated higher token efficiency than earlier Claude models.

Early feedback from GitHub, Cursor and Cognition teams noted that Fable 5 can handle long‑duration complex coding tasks with higher autonomy and reliability.

Knowledge Work

On Hebbia’s senior‑finance reasoning benchmark, Fable 5 obtained the top score across all models, improving document reasoning, chart interpretation and problem solving.

IMC, a trading firm, reported high scores for Fable 5 across fact‑checking, conceptual reasoning, root‑cause analysis and expected‑value calculations.

Vision

Fable 5 extracts precise numbers from detailed scientific charts and can reconstruct the full source code of a web app from a screenshot.

Fable 5 can complete Pokémon FireRed using only a raw game screenshot, without any map or state information.

Long Context & Memory

Fable 5 maintains focus over multi‑million‑token tasks and can use persistent notes to improve output. In a Slay the Spire test, persistent‑file memory boosted performance threefold over Opus 4.8, tripling the number of final‑level completions.

Build a solar‑system model from first‑principles physics and predict eclipses.

Autonomously play “Satisfactory”, planning and constructing an automated factory.

Design a 3D‑printable model in a browser‑based CAD editor, with the editor itself written by Fable 5.

Implement fluid‑simulation code synchronized to an EDM beat, with the music generated by Fable 5.

Life‑Science Research (Mythos 5)

Mythos 5 has been used internally for drug‑discovery pipelines. Protein‑design experts reported a ten‑fold speedup in routine tasks and comparable quality to expert researchers. Out of 14 test protein targets, 9 met drug‑development criteria and entered further study.

In blind tests, scientists preferred Mythos 5‑generated molecular‑biology hypotheses 80 % of the time; one novel E. coli protein mechanism was later confirmed by an independent preprint.

Real‑World Demonstrations

Ethan Mollick (University of Pennsylvania) benchmarked Fable 5, noting it outperformed all publicly available models and sustained 12‑hour, multi‑page task execution.

Using Claude Code, he generated complete video games from a single prompt:

Snake – a faithful recreation of the classic arcade game.

Strata – an endless‑cave lantern‑lighting game reminiscent of “Myst Island”.

Duino – a game inspired by Rainer Maria Rilke’s “Duino Elegies”, displaying verses as the player moves.

He also generated an isochronic transit map visualizing commute times between any two points, with notable accuracy.

Safety Mechanism Design

Mythos‑level models present dual‑use risks in cybersecurity and biochemistry. Anthropic added three safety classifiers to Fable 5 that downgrade risky requests to Opus 4.8.

Over 95 % of user sessions trigger no downgrade; less than 5 % are downgraded, though some benign requests are mistakenly blocked.

Cybersecurity : blocks offensive security tasks; red‑team testing and >1,000 hours of bug‑bounty testing found no universal jailbreak.

Biochemistry : intercepts most bio‑related queries; Mythos‑level models already predict viral capsid assembly more accurately than specialized protein‑language models.

Model Distillation : prevents large‑scale extraction of Fable 5 capabilities for training competing models.

Anthropic retains API customer data for 30 days for security audit only, then deletes it.

Alignment tests show error‑behavior probabilities for Fable 5 and Mythos 5 match Opus 4.8; detailed results are in the official system card.

Access Plan

Claude Fable 5 is publicly available now. Mythos 5 is limited to Project Glasswing security partners and, later, vetted biotech researchers.

Subscription users receive phased access: free use for Pro/Max/Team plans from June 9‑22, then usage‑based billing from June 23, with plans to reintegrate Fable 5 into regular subscription once capacity allows.

All API users can call the claude-fable-5 model.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

software engineeringlarge language modelGenerative AIAI safetyClaudeAI benchmarksFable 5
AI Engineering
Written by

AI Engineering

Focused on cutting‑edge product and technology information and practical experience sharing in the AI field (large models, MLOps/LLMOps, AI application development, AI infrastructure).

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.