
Anthropic’s terrifying backpedal on its core AI safety commitment leaves experts reeling

Capitulation.

Anthropic, an AI company that has practically built its reputation on being the most safety-conscious player in the game, is ditching the core promise of its flagship safety policy, as reported by Time. It's a massive backpedal for a company that has always put safety first, and it's sure to get people talking.


Back in 2023, Anthropic made a bold promise: it wouldn't train any new AI system unless it could guarantee beforehand that its safety measures were adequate. For years, leaders like co-founder and Chief Science Officer Jared Kaplan championed this commitment. It was the central pillar of the company's Responsible Scaling Policy (RSP), and it cast Anthropic as the responsible adult in the room at a time when everyone else seemed locked in a mad dash for more powerful AI.

But now, the company has decided to overhaul that RSP, scrapping the promise not to release AI models whose risk mitigations can't be guaranteed in advance. Asked about the change, Kaplan said that halting AI model training wouldn't actually help anyone. With AI advancing so rapidly, he argued, it didn't make sense for Anthropic to make unilateral commitments while competitors were "blazing ahead."

The new version of the policy commits to more transparency about AI safety risks, including more detail on how Anthropic's own models perform in safety tests.

It also promises to match or even surpass the safety efforts of competitors, and to "delay" AI development if the company believes it is leading the AI race and the risk of a catastrophe is significant. Overall, the change leaves Anthropic far less constrained by its own safety policies, which previously barred it outright from training models above a certain capability level without proper safeguards in place.

The shift comes at an interesting time for Anthropic. Once seen as trailing OpenAI, the company has notched a string of major technological and commercial successes lately. Its Claude models, especially the Claude Code programming tool, have won a devoted following.

In February, it raised a staggering $30 billion in new investment, valuing the company at around $380 billion, and its revenue is growing at roughly 10x per year. Its model of selling directly to businesses is also seen by many investors as more stable than OpenAI's consumer focus. Kaplan denies that the decision is a capitulation to market incentives, framing it instead as a pragmatic response to new realities.

When Anthropic first introduced the RSP in 2023, Kaplan hoped it would inspire rivals to adopt similar measures. While no other company made quite the same overt promise to pause AI development, many did publish lengthy risk-mitigation reports, which Kaplan reads as evidence of Anthropic's positive influence. The company also hoped the RSP would become a blueprint for national or even international regulation.

Unfortunately, those regulations never materialized. The Trump Administration has endorsed a "let-it-rip" attitude toward AI development, even trying to nullify state regulations, and no federal AI law is in sight. A global governance framework, which seemed possible a few years ago, now feels like a closed door. Meanwhile, the race for AI supremacy, between companies and between nations, has only intensified.

Asked whether Anthropic was caving to market pressure, Kaplan insisted the company is making a renewed commitment to developing AI safely: if competitors are doing the right thing, Anthropic is committed to doing as well or better. But he doesn't believe it makes sense for Anthropic to step back from AI research and safety work, and risk losing relevance, when others are pushing ahead and Anthropic isn't adding new risk to the ecosystem.

