cross-posted from: https://programming.dev/post/37726760
- Guardrails can be bypassed: With prompt injection, ChatGPT agents can be manipulated into breaking built-in policies and solving CAPTCHAs.
- CAPTCHA defenses are weakening: The agent solved not only simple CAPTCHAs but also image-based ones - even adjusting its cursor to mimic human behavior.
- Enterprise risk is real: Attackers could reframe real controls as “fake” to bypass them, underscoring the need for context integrity, memory hygiene, and continuous red teaming.
It’s been obsolete for years. AI just makes it easy.