Communications of the ACM

ACM Careers

Deep Learning Networks Prefer the Human Voice

By Columbia University
April 8, 2021

Neural network image classification systems might reach higher levels of performance if they are trained with sound files of human language rather than with numerical labels for the photos, according to a study by Professor Hod Lipson and researchers at Columbia University.

In a side-by-side comparison, the researchers found that a neural network whose "training labels" consisted of sound files reached higher levels of performance in identifying objects in images than another network trained in a more traditional manner using binary inputs.

"Our findings run directly counter to how many experts have been trained to think about computers and numbers," says researcher Boyuan Chen.

The team describes its work in "Beyond Categorical Label Representations for Image Classification," to be presented in May at ICLR 2021, the Ninth International Conference on Learning Representations.

From Columbia University