
Communications of the ACM

ACM TechNews

Learning Spoken Language


A new system learns to distinguish words' phonetic components without human annotation of training data.

Researchers at the Massachusetts Institute of Technology have developed a machine-learning system that can learn to distinguish spoken words.

Credit: Jose-Luis Olivares/MIT

Massachusetts Institute of Technology (MIT) researchers have developed a machine-learning system that can learn to distinguish spoken words, as well as lower-level phonetic units, such as syllables and phonemes.  

The researchers say the technology could aid in the development of speech-processing systems for languages that are not widely spoken and lack the benefit of decades of linguistic research on their phonetic systems. The technology also could make speech-processing systems more portable, because information about lower-level phonetic units could help resolve differences between speakers' pronunciations.

In addition, the MIT system acts directly on raw speech files, which could prove to be much easier to extend to new sets of training data and new languages.  Finally, the researchers say the system could offer some insights into human speech acquisition.  

The key to the system's performance is a "noisy-channel" model of phonetic variability.  The researchers modeled this phenomenon by borrowing an idea from communication theory, treating an audio signal as if it were a sequence of perfectly regular phonemes that had been sent through a noisy channel.  

The goal of the machine-learning system is to learn the statistical correlations between the "received" sound and the associated phoneme.
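As a rough illustration of the noisy-channel idea (not the MIT system itself), the sketch below treats each received acoustic symbol as a noisy rendering of an intended phoneme and recovers the phoneme with Bayes' rule. The phoneme inventory, channel probabilities, and uniform prior are all invented for this example.

```python
# A toy "noisy channel" over phonetic symbols. Everything here
# (inventory, probabilities) is invented for illustration only.
PHONEMES = ["a", "t", "s"]

# Channel model: P(received symbol | intended phoneme).
# Each phoneme is usually heard as itself, sometimes confused with another.
CHANNEL = {
    "a": {"a": 0.80, "t": 0.10, "s": 0.10},
    "t": {"a": 0.10, "t": 0.70, "s": 0.20},
    "s": {"a": 0.10, "t": 0.15, "s": 0.75},
}

# A uniform prior keeps the sketch simple; a real system would also learn
# a prior over whole phoneme sequences (i.e., words).
PRIOR = {p: 1.0 / len(PHONEMES) for p in PHONEMES}

def decode(received):
    """For each received symbol, pick the phoneme that maximizes
    P(phoneme) * P(received | phoneme) -- Bayes' rule up to a constant."""
    return [
        max(PHONEMES, key=lambda p: PRIOR[p] * CHANNEL[p].get(r, 0.0))
        for r in received
    ]

print(decode(["s", "t", "a"]))  # -> ['s', 't', 'a']
```

In this toy setup the channel's diagonal dominates, so each symbol decodes back to itself; the MIT system's harder task is to learn those channel statistics directly from raw speech, with no annotated (phoneme, sound) pairs.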

From MIT News

Abstracts Copyright © 2015 Information Inc., Bethesda, Maryland, USA


 
