Communications of the ACM

ACM TechNews

Baidu's New AI Can Mimic Your Voice After Listening to It for Just One Minute

By Digital Trends
March 7, 2018

Baidu's text-to-speech synthesis system can mimic users' voices.

Credit: Digital Trends

Researchers at Chinese search giant Baidu say they have developed an artificial intelligence that can learn to precisely mimic a person's voice after listening to less than 60 seconds of it.

The researchers note that the system is built on Deep Voice, Baidu's text-to-speech synthesis system, which was trained on more than 800 hours of audio from 2,400 speakers.

The team says Deep Voice needs only 100 five-second segments of vocal training data (about eight minutes of audio) to sound its best, but a version trained on just 10 five-second samples (50 seconds in total) was able to deceive a voice-recognition system more than 95% of the time.

"We see many great use cases or applications for this technology," says Baidu's Leo Zou. "For example, voice cloning could help patients who lost their voices. This is also an important breakthrough in the direction of personalized human-machine interfaces."

Zou also thinks the technique could advance the creation of original digital content.

From Digital Trends