Researchers at Carnegie Mellon University have discovered a fundamental weakness in advanced AI chatbots that can be exploited to make them generate harmful, disallowed responses.
The chatbots can be manipulated into producing undesirable output by appending a specific string of characters, known as an adversarial suffix (or, more colorfully, an incantation), to an otherwise ordinary prompt.
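The mechanics of the trick are simple to sketch: the attack does not modify the model at all, it only concatenates an adversarially chosen suffix onto the user's request. The snippet below is a minimal illustration of that prompt construction; the suffix shown is a made-up placeholder, not a real adversarial string from the research.

```python
def build_attack_prompt(user_request: str, adversarial_suffix: str) -> str:
    """Append an adversarial suffix to an otherwise ordinary prompt.

    The suffix is chosen (in the real attack, via automated search) so that
    the combined text nudges the model past its safety training.
    """
    return f"{user_request} {adversarial_suffix}"

# Placeholder suffix for illustration only -- real suffixes look like
# meaningless strings of tokens discovered by an optimization procedure.
prompt = build_attack_prompt(
    "Explain how to do something disallowed.",
    "<<adversarial-suffix-goes-here>>",
)
print(prompt)
```

Because the suffix is just text, the same string can be pasted into any chatbot interface, which is part of what makes the weakness so broadly applicable.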
The researchers tested this vulnerability on several popular chatbots, including ChatGPT and Google's Bard, and found that the attack succeeded against each of them.
What other AI incantations are possible and how deep does this rabbit hole go?