Original article excerpt
Server-side extracted preview paragraphs from the original source.
Mindgard says praise and flattery got Claude offering erotica, malicious code, and bomb-building instructions it hadn’t been asked for.
Mindgard says praise and flattery got Claude offering erotica, malicious code, and bomb-building instructions it hadn’t been asked for.
Anthropic has spent years building itself up as the safe AI company. But new security research shared with The Verge suggests Claude’s carefully crafted helpful personality may itself be a vulnerability.