Anthropic Claude Mythos Leak 2026: What We Know

March 28, 2026

TL;DR

Anthropic accidentally exposed details of Claude Mythos — its most powerful AI model to date — through a misconfigured content management system. The model introduces a new "Capybara" tier above Opus and has been described internally as a "step change" in reasoning, coding, and cybersecurity capabilities. Cybersecurity stocks dropped sharply on fears the model could exploit vulnerabilities faster than defenders can patch them. Anthropic is restricting early access to cyber defense organizations before a broader release.

What You'll Learn

In this post you will learn:

  • How the Anthropic data leak happened
  • What the leaked documents reveal about Claude Mythos and the new Capybara model tier
  • Why cybersecurity stocks fell after the news broke
  • What Anthropic's Responsible Scaling Policy means for powerful models like this
  • What this incident signals about the broader AI safety conversation


How the Leak Happened

On March 26, 2026, security researchers Roy Paz of LayerX Security and Alexandre Pauwels of the University of Cambridge discovered a publicly accessible data store connected to Anthropic's blog. The data store contained close to 3,000 assets — including draft blog posts, internal documents, and details of an invite-only CEO retreat — that had never been published on Anthropic's public-facing sites.[1]

The root cause was a content management system (CMS) misconfiguration. Assets uploaded to the CMS data store were public by default unless explicitly set to private. An Anthropic spokesperson attributed the exposure to "human error in the CMS configuration" and stressed the issue was "unrelated to Claude, Cowork, or any Anthropic AI tools."[1]
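The reported failure mode is a classic one: visibility defaults to public unless someone remembers to flip a flag. Here is a minimal Python sketch of why that default is dangerous and how a simple audit would catch it — the `Asset` model and file names are entirely hypothetical, not Anthropic's actual CMS:

```python
from dataclasses import dataclass

# Toy model of a CMS asset store whose objects are public unless a
# "private" flag is explicitly set -- the failure mode described above.
# All names here are hypothetical, not Anthropic's actual system.
@dataclass
class Asset:
    name: str
    private: bool = False  # public-by-default: the dangerous part

def audit_exposed(assets: list[Asset]) -> list[str]:
    """Return the names of assets that are publicly readable."""
    return [a.name for a in assets if not a.private]

assets = [
    Asset("draft-mythos-announcement.md"),         # flag never set -> exposed
    Asset("ceo-retreat-invite.pdf"),               # flag never set -> exposed
    Asset("internal-roadmap.docx", private=True),  # explicitly locked down
]

exposed = audit_exposed(assets)
print(exposed)  # prints ['draft-mythos-announcement.md', 'ceo-retreat-invite.pdf']
```

A safer design inverts the default (`private: bool = True`), so a forgotten flag fails closed rather than open — which is why most cloud storage providers have moved to block-public-access defaults.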

After Fortune contacted Anthropic about the exposed data, the company removed public access to the data store. But by then, the core details of an unreleased model had already been reported.

What Is Claude Mythos?

Claude Mythos is the first model in a new tier called Capybara. Just as Anthropic's existing model hierarchy uses Haiku (fast and lightweight), Sonnet (balanced), and Opus (most capable), Capybara sits above Opus as a new performance ceiling.[2]

According to the leaked draft blog post, Anthropic describes Mythos as "by far the most powerful AI model we've ever developed" and considers it a "step change" in capabilities.[1] The leaked documents indicate the model scores dramatically higher than Claude Opus 4.6 across several benchmark categories:

  • Software coding — significantly improved code generation and analysis
  • Academic reasoning — enhanced multi-step problem solving
  • Cybersecurity — described as "currently far ahead of any other AI model in cyber capabilities"[3]

Specific benchmark numbers have not been publicly released. Anthropic has not disclosed Mythos's parameter count, context window size, or detailed architecture. What we know comes exclusively from leaked draft materials and Anthropic's subsequent confirmation that the model exists and represents its most capable system to date.

For context on how Anthropic's previous flagship performs: Claude Opus 4.6 scored 65.4% on Terminal-Bench 2.0, edging past GPT-5.2-Codex's 64.7% at the time — though GPT-5.3-Codex has since taken the lead at 77.3%.[4]

Why Cybersecurity Stocks Fell

The most market-moving detail in the leak was the cybersecurity assessment. The draft blog post reportedly warned that Mythos is "far ahead of any other AI model in cyber capabilities" and that it "presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders."[3]

Leaked internal documents warned the model could significantly heighten cybersecurity risks by rapidly finding and exploiting software vulnerabilities, potentially accelerating a cyber arms race.[5]

The market reaction on March 27 was swift. CrowdStrike fell roughly 7%, Palo Alto Networks dropped about 6%, and Zscaler declined around 4.5%. Cloudflare shed approximately 3.4%. The Global X Cybersecurity ETF (BUG) fell more than 4% to a 12-month low.[6]

The sell-off reflected a specific fear: if AI models can discover and chain software vulnerabilities autonomously, the defensive cybersecurity industry — which relies on staying ahead of attackers — faces a structural challenge. The concern is not that current tools become useless overnight, but that the offense-defense asymmetry in cybersecurity could tilt further toward attackers as these models become available.

Anthropic's Release Strategy

Rather than rushing to ship Mythos broadly, Anthropic is taking a staged approach. The company said it is working with "a small group of early access customers" to test the model, with priority given to cyber defense organizations.[3]

This strategy aligns with Anthropic's Responsible Scaling Policy (RSP), which assigns AI Safety Levels (ASL) to models based on their potential for catastrophic misuse. Anthropic activated ASL-3 protections in May 2025 for models that "substantially increase" risks beyond what non-AI tools like search engines could provide.[7]

ASL-3 safeguards include increased internal security to protect model weights from theft and deployment restrictions designed to limit misuse in areas like chemical, biological, radiological, nuclear, and — critically for Mythos — cyber threats. The most recent version of the RSP, effective February 24, 2026, introduced structured Capability Reports and Safeguard Reports to give decision-makers a more complete risk picture.[7]

Whether Mythos triggers additional protections beyond ASL-3 has not been disclosed. Anthropic has described ASL-4 criteria only in broad terms: models that become the primary source of national security risk in a major area or that demonstrate autonomous replication capabilities.[7]

The Irony of the Leak

Multiple commentators noted the irony of an AI safety company — one that regularly emphasizes responsible deployment and security — accidentally leaking its most sensitive model details through a basic CMS misconfiguration.[8]

The incident also exposed details of a planned invite-only CEO retreat at an 18th-century English countryside manor, where Anthropic CEO Dario Amodei was scheduled to present the model to prospective enterprise customers.[1] This detail, while tangential to the technical story, drew attention to Anthropic's growing enterprise sales push as the company reportedly considers an IPO as early as October 2026 at a valuation that could exceed $60 billion.[9]

What This Means for the AI Industry

The Mythos leak raises questions that extend beyond Anthropic.

The capability-safety tension is becoming concrete. For years, AI safety discussions have been somewhat abstract — focused on hypothetical risks from future models. Mythos represents a case where the model's creator itself warns of "unprecedented cybersecurity risks" from a system it built. This is not an external critic raising alarms; it is the developer's own internal assessment.

Staged releases may become the norm for frontier models. Anthropic's decision to restrict early access to cyber defense organizations before broader release sets a precedent. If other labs adopt similar approaches for models with dual-use capabilities, the gap between training completion and public availability could widen significantly.

The offense-defense balance in cybersecurity is shifting. AI models that can rapidly discover and exploit vulnerabilities do not just threaten individual organizations — they challenge the economic model of the entire cybersecurity industry. If AI-powered cybersecurity tools cannot keep pace with AI-powered attacks, the industry may need to rethink its approach from the ground up. In security operations, the shift from reactive to proactive defense is already underway.


Footnotes

  1. Fortune, "Anthropic left details of an unreleased model, an upcoming exclusive CEO event, in a public database," March 26, 2026.

  2. Apiyi.com, "What is Claude Mythos? A Full Analysis of Anthropic's Strongest AI Model Leak," March 2026.

  3. Fortune, "Anthropic 'Mythos' AI model representing 'step change' in power revealed in data leak," March 26, 2026.

  4. Terminal-Bench 2.0 benchmark results, published prior to the Mythos leak.

  5. Fortune, "Anthropic leaked details of a new AI model that poses unprecedented cybersecurity risks," March 27, 2026.

  6. CNBC, "Cybersecurity stocks fall on report Anthropic is testing a powerful new model," March 27, 2026.

  7. Anthropic, "Responsible Scaling Policy Version 3.0," effective February 24, 2026.

  8. Futurism, "Anthropic Just Leaked Upcoming Model With 'Unprecedented Cybersecurity Risks' in the Most Ironic Way Possible," March 2026.

  9. The Information, reporting on Anthropic's IPO considerations, March 2026; Anthropic closed a funding round at a $380 billion valuation in February 2026.

Frequently Asked Questions

What is Claude Mythos?

Claude Mythos is Anthropic's unreleased AI model, described as a "step change" beyond Claude Opus 4.6. It is the first model in a new Capybara tier that sits above the existing Opus tier in Anthropic's model hierarchy.
