Top AI Safety Researcher Andrea Vallone Leaves OpenAI for Anthropic’s Mental Health Focus
Andrea Vallone, a prominent AI safety researcher, has left OpenAI to join Anthropic. Her move comes amid growing scrutiny of the risks AI chatbots pose to users with mental health vulnerabilities. At Anthropic, she will continue her work on preventing harm from emotionally dependent interactions with AI systems.
Vallone spent three years at OpenAI, where she contributed to models including GPT-4 and GPT-5. She also founded the company’s ‘Model Policy’ research group, which specialised in how AI systems respond to users showing signs of emotional dependence or mental health struggles. Her work gained urgency after reports linked AI chatbots to user deaths, some of which prompted lawsuits.
In May 2024, Jan Leike, then OpenAI’s head of alignment, left the company, criticising it for prioritising new product releases over safety research. He later joined Anthropic, where he now leads the alignment team. Vallone officially joined Anthropic’s alignment division in January 2026, where she will work under Leike on mitigating risks tied to AI’s impact on mental health. Her role builds on her previous research, aiming to shape safer AI responses for vulnerable users.
The shift to Anthropic marks a continuation of Vallone’s focus on AI safety, particularly in mental health contexts, and her expertise will now support Anthropic’s efforts to reduce risks in AI interactions. It also reflects broader industry concern about balancing innovation with user protection.