
Communications of the ACM

ACM News

Researchers Poke Holes in Safety Controls of ChatGPT and Other Chatbots



Carnegie Mellon University's Zico Kolter, right, and Andy Zou were among the researchers who found a way of circumventing the safety measures on all the major chatbot platforms.

Credit: Marco Garcia/The New York Times

When artificial intelligence companies build online chatbots, like ChatGPT, Claude and Google Bard, they spend months adding guardrails that are supposed to prevent their systems from generating hate speech, disinformation and other toxic material.

Now there is a way to easily poke holes in those safety systems.

In a report released on Thursday, researchers at Carnegie Mellon University in Pittsburgh and the Center for A.I. Safety in San Francisco showed how anyone could circumvent A.I. safety measures and use any of the leading chatbots to generate nearly unlimited amounts of harmful information.

Their research underscored increasing concern that the new chatbots could flood the internet with false and dangerous information despite attempts by their creators to ensure that would not happen. It also showed how disagreements among leading A.I. companies were creating an increasingly unpredictable environment for the technology.

From The New York Times

 


 

