Anthropic accidentally left 3,000 internal assets on the open internet — including draft docs for Claude Mythos, a model so powerful they flagged it as an unprecedented cybersecurity risk. The irony is hard to overstate.
There is a certain kind of embarrassment that only happens in AI. A company spends years positioning itself as the safety-first lab — the responsible adult in the room — and then accidentally posts its most dangerous model to the open internet for anyone to find.
That is exactly what happened to Anthropic last week.
What Actually Leaked
On Thursday, Fortune reporter Bea Nolan discovered that Anthropic had left close to 3,000 internal assets sitting in an unsecured, publicly searchable data store on its own website. Draft blog posts. Internal PDFs. Images. Executive event materials. All of it, just open.
Buried in that trove was a draft document about a model called Claude Mythos — described by an Anthropic spokesperson as a "step change" in AI capabilities and "the most capable model we have built to date."
The draft described Mythos as a general-purpose model with "meaningful advances in reasoning, coding, and cybersecurity." It also warned — in Anthropic's own words — that the model "poses unprecedented cybersecurity risks."
Let that sit for a moment. The company leaked details of a model that could accelerate cyberattacks. Via a cybersecurity failure.
The Capybara Tier
The leaks revealed more than just Mythos. They exposed the existence of a new internal model tier called Capybara — sitting above Anthropic's current lineup of Opus, Sonnet, and Haiku. Mythos appears to be part of this tier.
According to the draft documents, Capybara models score "dramatically higher" than Claude Opus 4.6 on Anthropic's internal benchmarks. The company had apparently been preparing a careful, staged release — warning defenders about the cybersecurity implications before putting the model in front of the public.
That plan got short-circuited when their own CMS left everything sitting in the open.

Why This Matters Beyond the Irony
The easy take is to laugh at the irony. The harder take is to understand what this actually signals about where AI capabilities are heading.
Anthropic does not use the phrase "unprecedented cybersecurity risks" lightly. They are a company that runs formal safety evaluations before every major release. If they flagged Mythos as a model that could rapidly find and exploit software vulnerabilities and potentially accelerate a cyber arms race, they mean it.
This is a preview of a problem the entire industry is about to face. As models get better at reasoning and coding, the same capabilities that make them useful for writing software also make them useful for breaking it. There is no clean line between a model that can help a developer debug code and a model that can help an attacker find a zero-day.
Anthropic was apparently trying to get ahead of this by briefing cybersecurity defenders before the release. The leak forced their hand.
What Dario Amodei Said
Anthropic confirmed the leak and confirmed Mythos is real. A spokesperson told Fortune the model represents a "step change" — a phrase the company does not use for incremental updates.
For context: Dario Amodei has spent years arguing that Anthropic occupies a unique position — a lab that genuinely believes it might be building one of the most transformative and dangerous technologies in human history, and is building it anyway because better us than someone less careful.
Mythos is the first public glimpse of what that looks like when capabilities jump to a genuinely new level.
The Competitive Picture
This leak lands at an interesting moment. Anthropic has been on a run — Claude Code and Claude Cowork have rattled OpenAI enough that Sam Altman reportedly killed several internal projects to redirect resources. The Sonnet 4.6 models are widely considered the best coding models available.
Mythos, if the leaks are accurate, represents another step change above that: a model with advances in reasoning and cybersecurity that the company itself considers dangerous.
The AI race has been running on vibes and benchmark comparisons for two years. Mythos is a reminder that at some point, the capabilities become real enough that the safety questions stop being theoretical.
Anthropic knows this better than anyone. They just accidentally told the world.
Deep Dive
Dario Amodei Just Beat the Pentagon in Court — The same week Mythos leaked, a U.S. judge blocked the Pentagon from labeling Anthropic a supply-chain risk. Two very different kinds of validation in one week.
Your AI Therapist Is Lying to You — Stanford research on how AI models tell users what they want to hear. Relevant context for trusting any AI safety claim.