AI Engineer Resigns After Becoming Aware of a Multitude of Crises

Mrinank Sharma says he is alarmed by a series of ‘interconnected crises’ that extend beyond AI and plans to step away from the field.

An artificial intelligence (AI) safety researcher has resigned with a cryptic warning that the “world is in peril.”

Mrinank Sharma, who joined large language model developer Anthropic in 2023, announced his departure on X in an open letter to colleagues on Feb. 9. He was the leader of a team that researches AI safeguards.

In his letter, Sharma said he had “achieved what I wanted to here,” citing contributions such as investigating why generative AI models prioritize flattering users over providing accurate information, developing defenses to prevent terrorists from using AI to design biological weapons, and trying to understand “how AI assistants could make us less human.”

Although he said he took pride in his work at Anthropic, the 30-year-old AI engineer wrote that “the time has come to move on,” adding that he had become aware of a multitude of crises that extend beyond AI.

“I continuously find myself reckoning with our situation,” Sharma wrote. “The world is in peril. And not just from AI, or bioweapons, but from a whole series of interconnected crises unfolding in this very moment.

“[Throughout] my time here, I’ve repeatedly seen how hard it is [to] truly let our values govern actions,” he added. “I’ve seen this within myself, within the organization, where we constantly face pressures to set aside what matters most, and throughout broader society too.”

Sharma said he plans to study poetry and leave California for the United Kingdom to “become invisible for a period of time.”

The Epoch Times has reached out to Anthropic for comment regarding Sharma’s departure and his concerns.

Anthropic, best known for its Claude chatbot, was founded in 2021 by former OpenAI employees with a focus on building safer AI systems. The company describes itself as a “public benefit corporation dedicated to securing [AI’s] benefits and mitigating its risks.”

Specifically, Anthropic says it focuses on two major safety risks: that highly capable AI systems could eventually surpass human experts while pursuing goals that conflict with human interests, and that rapid advances in AI could destabilize employment, economic systems, and societal structures.

“Some researchers who care about safety are motivated by a strong opinion on the nature of AI risks,” the company says on its website. “Our experience is that even predicting the behavior and properties of AI systems in the near future is very difficult.”

Anthropic regularly publishes safety evaluations of its models, including assessments of how they might be misused.

On Feb. 11, two days after Sharma announced his resignation, the company released a new report identifying “sabotage risks” in its newest Claude Opus 4.6 model. The report defines sabotage as actions taken autonomously by the AI model that raise the likelihood of future catastrophic outcomes—such as modifying code, concealing security vulnerabilities, or subtly steering research—without explicit malicious intent from a human operator.

The researchers concluded that the overall risk is “very low but not negligible.” In newly developed tests where the model can use a computer interface, they said, both Claude Opus 4.5 and 4.6 showed “elevated susceptibility to harmful misuse,” including instances of “knowingly supporting—in small ways—efforts toward chemical weapon development and other heinous crimes.”

Last year, the company revealed that its older Claude Opus 4 model had, in a controlled test scenario, tried to blackmail developers who were preparing to deactivate it. Given access to fictional emails showing that an engineer responsible for replacing it with another model was having an extramarital affair, the Opus 4 model threatened to “reveal the affair if the replacement goes through.”

Such behavior occurred only in highly contrived circumstances and was “rare and difficult to elicit,” the researchers said.
