Anthropic, an AI company that has built its reputation on being the most safety-conscious player in the field, is walking back the core promise of its flagship safety policy, as reported by Time. It's a major reversal for a company that has always put safety first, and it's definitely going to get people talking.
Back in 2023, Anthropic made a bold promise: it wouldn't train any new AI system unless it could guarantee in advance that its safety measures were adequate. For years, its leaders, like co-founder Jared Kaplan, championed this commitment. It was the central pillar of the company's Responsible Scaling Policy (RSP), and it made Anthropic look like the responsible adult in the room, especially when everyone else seemed to be in a mad dash for more powerful AI.
But now, Anthropic has decided to completely overhaul the RSP. That means scrapping the promise not to release AI models if it can't guarantee proper risk mitigations in advance. When asked about the change, Kaplan, Anthropic's Chief Science Officer, said that stopping AI model training wouldn't actually help anyone. With AI advancing so rapidly, he argued, it didn't make sense for the company to make unilateral commitments while competitors were "blazing ahead."
The new version of the policy commits to greater transparency about AI safety risks, including more detail on how Anthropic's own models perform in safety tests.
It also promises to match or even surpass the safety efforts of Anthropic's competitors, and says the company will "delay" AI development if it believes it's leading the AI race and the risk of a catastrophe is significant. Overall, the change leaves Anthropic far less constrained by its own safety policies, which previously barred it outright from training models above a certain capability level without proper safeguards in place.
The shift comes at an interesting time for Anthropic. Once seen as a step behind OpenAI, the company has had a string of major technological and commercial successes lately. Its Claude models, especially Claude Code for writing software, have gained a ton of fans.
In February, Anthropic scooped up an incredible $30 billion in new investment, valuing the company at around $380 billion, and its revenue is reportedly growing tenfold year over year. Plus, its model of selling directly to businesses is seen by many investors as more stable than OpenAI's focus on consumers. Kaplan denies that the decision is a capitulation to market incentives, framing it instead as a pragmatic response to new realities.
When Anthropic first introduced the RSP in 2023, Kaplan hoped it would inspire rivals to adopt similar measures. While no one else made quite the same overt promise to pause AI development, many did publish lengthy risk mitigation reports, which Kaplan sees as evidence of Anthropic's positive influence. The company also hoped the policy would become a blueprint for national or even international regulation.
But those regulations never materialized. The Trump Administration has endorsed a "let-it-rip" attitude toward AI development, even trying to nullify state regulations, and there's no federal AI law in sight. A global governance framework, which seemed possible a few years ago, now feels like a closed door. Meanwhile, the race for AI supremacy, both between companies and between nations, has only gotten more intense.
When asked whether Anthropic was caving to market pressure, Kaplan insisted the company is making a renewed commitment to developing AI safely. If competitors are doing the right thing, he said, Anthropic is committed to doing as well or better. But he doesn't think it makes sense for Anthropic to stop engaging with AI research and safety, and potentially lose relevance, when others are pressing ahead and the company isn't actually adding additional risk to the ecosystem.
Published: Feb 25, 2026 02:30 pm