Claude Mythos Finds Over 10,000 Critical Bugs in Weeks – Glasswing Project Shocks Security World

Anthropic's Claude Mythos preview model, deployed in the Glasswing project, uncovered more than 10,000 high‑severity vulnerabilities across core software in just weeks, validated by independent researchers, while highlighting the massive gap between rapid AI‑driven bug discovery and the slower human patching process.

SuanNi
SuanNi
SuanNi
Claude Mythos Finds Over 10,000 Critical Bugs in Weeks – Glasswing Project Shocks Security World

Finding Vulnerabilities Becomes Easy

Anthropic, together with dozens of technology partners, launched Project Glasswing to harden the world’s most critical software before powerful AI models can be misused. The unreleased Claude Mythos preview model was used to generate the first performance report.

Open‑Source Software Health Report

The model scanned over 1 000 open‑source projects that underpin the internet and enterprise infrastructure. In total it identified 23 019 vulnerabilities, of which 6 202 were estimated to be high‑severity or critical. Six independent security firms manually evaluated 1 752 of the high‑severity candidates, confirming 1 587 true positives (90.6 %) and classifying 1 094 as high‑severity (62.4 %). Extrapolating this verification rate suggests that even if scanning stopped now, nearly 3 900 additional high‑severity bugs could still be confirmed.

Partner case studies illustrate the impact:

Cloudflare’s security team discovered 2 000 bugs in critical‑path systems, including 400 high‑severity issues, and reported that the AI’s false‑positive rate was lower than that of human testers.

Mozilla, testing Firefox 150, found and fixed 271 bugs—more than ten times the number uncovered with Claude Opus 4.6 on Firefox 148.

The UK AI Security Institute called Mythos the first model to solve every problem in their complex multi‑step network‑range testbed.

Independent benchmarks ExploitBench and ExploitGym showed Mythos achieving dominant performance across exploit‑generation tasks.

Real‑world impact includes a bank that averted a $1.5 million fraudulent wire transfer thanks to Mythos‑assisted detection, and the discovery of a critical flaw in the wolfSSL library (CVE‑2026‑5194) that allowed certificate forgery.

Defender Strategies

Despite the rapid discovery of bugs, patch development lags severely. On average it takes two weeks to remediate a high‑severity vulnerability found by Mythos, and only 75 of the 530 disclosed high‑severity bugs have been fully patched, with 65 public advisories issued. Open‑source maintainers—mostly volunteers—are overwhelmed by a flood of AI‑generated low‑quality reports and the traditional 90‑day coordinated‑disclosure timeline, creating a dangerous time gap that attackers can exploit.

Recommendations emphasize shortening patch cycles, adopting NIST and UK NCSC critical controls (default‑secure configurations, mandatory MFA, comprehensive logging), and improving update mechanisms so end users can quickly apply fixes.

Looking ahead, the Glasswing project demonstrates that AI can make vulnerability discovery orders of magnitude faster, but without robust safety guardrails the disparity between finding and fixing widens. Anthropic is releasing toolkits, shared Skills, and a Threat Model Builder to qualified customers, and expects future AI models from major vendors to match Mythos‑level capabilities.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Open SourceInformation SecurityAI securityvulnerability scanningClaude MythosGlasswingsoftware patching
SuanNi
Written by

SuanNi

A community for AI developers that aggregates large-model development services, models, and compute power.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.