Original article excerpt
Server-side extracted preview paragraphs from the original source.
How ChatGPT defends against prompt injection and social engineering by constraining risky actions and protecting sensitive data in agent workflows.
AI agents are increasingly able to browse the web, retrieve information, and take actions on a user’s behalf. Those capabilities are useful, but they also create new ways for attackers to try to manipulate the system.
These attacks are often described as prompt injection: instructions placed in external content in an attempt to make the model do something the user did not ask for. In our experience, the most effective real-world versions of these attacks increasingly resemble social engineering more than simple prompt overrides.