Hacks known as adversarial attacks, in which subtly altered inputs fool machine-learning models, could become commonplace unless artificial intelligence (AI) finds a way to outsmart them. Now, researchers at MIT have found a new way to give AI a defensive edge. They describe their work in "Adversarial Examples Are Not Bugs, They Are Features," presented at ICLR 2019, the Seventh International Conference on Learning Representations.
The work could not only help protect the public; it also reveals why AI falls victim to such attacks in the first place, says Zico Kolter, a computer scientist at Carnegie Mellon University who was not involved in the research. The research suggests that because some AIs can spot patterns in images that humans cannot, attackers can exploit those very patterns, and the systems need to be trained with that in mind.
Andrew Ilyas, a computer scientist at MIT and one of the paper's authors, says engineers could change the way they train AI. Current methods of securing an algorithm against attacks are slow and difficult. But if the training data is modified to contain only features that are obvious to humans, any algorithm trained on it will not pick up on, and so cannot be fooled by, additional, perhaps subtler, features.
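The gist of that idea can be sketched in code. The snippet below is a minimal illustration, not the authors' released implementation: it assumes you already have a robustly trained encoder (for example, one obtained through adversarial training) and re-synthesizes a training image so that only the features that encoder responds to survive; a fresh classifier could then be trained on the re-synthesized images. The function names, the toy encoder, and the hyperparameters are illustrative assumptions.

```python
# Minimal sketch (PyTorch): re-synthesize an image so it keeps only the
# features a robustly trained encoder relies on. The toy encoder below is a
# random-weight stand-in used purely to make the example runnable.

import torch
import torch.nn as nn


def robustify_image(robust_encoder: nn.Module,
                    original: torch.Tensor,
                    steps: int = 200,
                    lr: float = 0.05) -> torch.Tensor:
    """Start from random noise and optimize it until its feature representation
    under the robust encoder matches that of the original image."""
    for p in robust_encoder.parameters():
        p.requires_grad_(False)                          # only the image is optimized
    with torch.no_grad():
        target_feats = robust_encoder(original)          # features of the real image
    x = torch.rand_like(original, requires_grad=True)    # random starting image
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((robust_encoder(x) - target_feats) ** 2).mean()
        loss.backward()
        opt.step()
        with torch.no_grad():
            x.clamp_(0.0, 1.0)                           # keep pixel values valid
    return x.detach()


if __name__ == "__main__":
    # Toy stand-in encoder, just to demonstrate the function end to end.
    encoder = nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )
    image = torch.rand(1, 3, 32, 32)                     # one fake 32x32 RGB image
    robust_image = robustify_image(encoder, image)
    print(robust_image.shape)                            # same shape as the input
```

Training a standard classifier on images rebuilt this way roughly corresponds to the paper's idea of a "robust" training set containing only human-meaningful features, though the authors' exact construction and parameters differ from this sketch.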
Indeed, when the team trained an algorithm on images without the subtle features, their image recognition software was fooled by adversarial attacks only 50% of the time, the researchers report. That compares with a 95% rate of vulnerability when the AI was trained on images with both obvious and subtle patterns.
From Science