Image by RyanDonegan, CC BY 2.0

Anthropic is holding back its latest AI model since it’s too dangerous for public release, but the reason is not what you think

The future looks dicey.

Anthropic is officially restricting the release of its latest artificial intelligence model, Claude Mythos Preview, because the company believes its capabilities are simply too dangerous for the general public to access right now, The Hill reported. This marks a significant shift in how we think about AI safety: the model isn't being held back because it's broken or ineffective. Instead, it's being gated because it is terrifyingly good at finding and exploiting software vulnerabilities that have existed for decades.


You might think that finding bugs is a standard part of software development, but Mythos Preview is on a completely different level. It has already autonomously identified thousands of high-severity vulnerabilities in major web browsers and operating systems that human developers and automated testing tools have missed for years.
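For context on what "automated testing tools" typically means here: the workhorse technique is fuzzing, which hammers a program with randomized inputs and watches for misbehavior. The toy sketch below (entirely hypothetical — the parser, the bug, and every name in it are invented for illustration, not drawn from the article) shows the basic idea, and also hints at its limitation: a random fuzzer only surfaces bugs that luck happens to reach, which is part of why subtle flaws can survive for decades.

```python
import random

def parse_header(data: bytes) -> int:
    """Toy length-prefixed parser with a deliberate off-by-one bug."""
    if not data:
        raise ValueError("empty input")
    declared = data[0]          # first byte declares the payload length
    payload = data[1:]
    # Bug: the check allows declared + 1, so a payload exactly one byte
    # longer than declared slips through unrejected.
    if len(payload) > declared + 1:
        raise ValueError("payload too long")
    return declared

def fuzz(iterations: int = 20000) -> list[bytes]:
    """Throw random inputs at the parser; record accepted inputs whose
    payload is longer than the declared length (i.e., the bug firing)."""
    rng = random.Random(0)      # seeded for reproducibility
    findings = []
    for _ in range(iterations):
        n = rng.randrange(1, 8)
        data = bytes(rng.randrange(256) for _ in range(n))
        try:
            declared = parse_header(data)
        except ValueError:
            continue            # correctly rejected, not interesting
        if len(data) - 1 > declared:
            findings.append(data)
    return findings

if __name__ == "__main__":
    print(f"{len(fuzz())} bug-triggering inputs found")
```

Even in this tiny example, only a fraction of a percent of random inputs trip the off-by-one, which is why purely random testing scales poorly against large, old codebases — and why a model that can reason about the code directly is such a step change.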

In some cases, the model uncovered flaws that were nearly 30 years old. Because these capabilities are so powerful, Anthropic is worried that if the model were released publicly, it wouldn’t just be used by security researchers. It would inevitably fall into the hands of malicious actors who could use it to launch devastating cyberattacks against critical infrastructure.

To manage this, Anthropic is launching an initiative called Project Glasswing.

This project brings together a massive consortium of tech giants, including Microsoft, Apple, Amazon Web Services, CrowdStrike, and Google, alongside more than 40 other organizations that maintain critical software. The goal is to use the raw power of Mythos Preview to play defense rather than offense. By allowing these companies to use the model to scan their own systems, they can patch these vulnerabilities before anyone else has the chance to exploit them.

It’s a fascinating and somewhat sobering reality that we’ve reached a point where AI models can surpass the skills of even the most expert human security researchers. Anthropic noted that the window between a vulnerability being discovered and being exploited is collapsing. What used to take months for a dedicated team of hackers can now happen in a matter of minutes with the right AI tools. This is why the company is taking such a cautious approach. If these capabilities proliferate too quickly, the fallout for national security, public safety, and the global economy could be severe.

Anthropic is putting its money where its mouth is by committing up to $100 million in usage credits for the model, which will help the participating organizations integrate it into their defensive workflows. They are also donating $4 million to open-source security organizations to ensure that those maintaining the foundational code of the internet aren’t left behind. This is a smart move, because open-source software underpins so much of our modern infrastructure, and those maintainers often lack the massive security budgets of companies like Microsoft or Apple.

If you’re wondering what kind of things Mythos Preview can actually do, the examples are quite eye-opening. The model was able to find a 27-year-old vulnerability in OpenBSD that allowed for remote crashes, and it even chained together several flaws in the Linux kernel to gain complete control of a machine. It did all of this autonomously, without any human steering. When you see results like that, it’s easy to understand why the developers decided to keep this under wraps.

The company has been in active discussions with the United States government regarding these capabilities, which makes sense given that securing critical infrastructure is a major priority for President Trump and his administration. The goal here is to ensure that the United States and its allies maintain a lead in AI technology while simultaneously mitigating the very real risks that these models introduce.

Anthropic doesn’t plan to make the Mythos Preview model generally available, but the long-term goal is to figure out how to safely deploy these types of models at scale. They are currently working on developing robust safeguards that can block the most dangerous outputs, which they plan to test in future versions of their Claude Opus models.

In the meantime, Project Glasswing will serve as a controlled environment for testing these powerful tools. It is a necessary step, especially since we are entering an era where AI-driven cyberattacks will likely become the new normal. If we want to stay ahead of the curve, we need to use the very technology that creates these risks to build the defenses that will keep our digital world secure.


Author
Manodeep Mukherjee
Manodeep writes about US and global politics with five years of experience under his belt. When he's not keeping up with the latest happenings on Capitol Hill, you can find him grinding ranked in one of the Valve MOBAs.