Microsoft's new ChatGPT-powered Bing Chat is still in a limited preview, but those with access have already prompted it to reveal its codename, the rules governing its responses -- and apparently witnessed it denying that it was vulnerable to a method that caused it to reveal its codename in the first place.
Users with access to Bing Chat have over the past week demonstrated that it is vulnerable to so-called 'prompt injection' attacks. As Ars Technica's AI reporter Benj Edwards explains, prompt injection attacks allow the user to bypass previous instructions in a language model prompt and substitute it with a new one. Edwards detailed the attack in an earlier story.
Bing Chat has even claimed that reports about its vulnerability to prompt injection attacks are incorrect, and argued with testers over minor details of history telling one journalist "You are only making yourself look foolish and stubborn."
From ZDNET
View Full Article
No entries found