Microsoft's Artificial Intelligence and Research Group on Sunday said it achieved a 5.1% word error rate for its speech-recognition technology on the Switchboard conversational speech task, an improvement over its own 2016 record of 5.9% and IBM's 2017 milestone of 5.5%.
Microsoft's Xuedong Huang credits the achievement to "a series of improvements to our neural net-based acoustic and language models."
Huang says the team introduced an additional convolutional neural network combined with a bidirectional long short-term memory (CNN-BLSTM) model for improved acoustic modeling.
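The announcement does not include implementation details, but a minimal sketch of the general CNN-BLSTM idea, assuming PyTorch and made-up layer sizes and senone counts, might look like the following: convolutions capture local spectral patterns in the input features, and a bidirectional LSTM models temporal context across the utterance before per-frame senone scores are produced.

```python
import torch
import torch.nn as nn

class CNNBLSTMAcousticModel(nn.Module):
    """Illustrative CNN-BLSTM acoustic model (not Microsoft's actual system):
    convolutions over (time, frequency), a bidirectional LSTM over time,
    and a linear layer producing per-frame senone scores."""

    def __init__(self, n_mel=40, n_senones=9000, hidden=512):
        super().__init__()
        # 2-D convolutions over log-mel features (hypothetical sizes)
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Bidirectional LSTM over the time axis
        self.blstm = nn.LSTM(input_size=32 * n_mel, hidden_size=hidden,
                             num_layers=2, bidirectional=True, batch_first=True)
        # Frame-level senone classifier
        self.out = nn.Linear(2 * hidden, n_senones)

    def forward(self, feats):                     # feats: (batch, time, n_mel)
        x = self.conv(feats.unsqueeze(1))         # (batch, 32, time, n_mel)
        x = x.permute(0, 2, 1, 3).flatten(2)      # (batch, time, 32 * n_mel)
        x, _ = self.blstm(x)
        return self.out(x)                        # per-frame senone logits

# Example: 8 utterances of 200 frames with 40 log-mel features each
logits = CNNBLSTMAcousticModel()(torch.randn(8, 200, 40))
print(logits.shape)   # torch.Size([8, 200, 9000])
```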
He also notes that the researchers' approach to combining predictions from multiple acoustic models now does so at both the frame/senone and word levels.
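Microsoft has not published the combination scheme itself; under that caveat, the rough sketch below averages per-frame senone posteriors from several hypothetical models and uses a toy word-level vote over already-aligned hypotheses (real word-level combination also requires aligning the competing word sequences first, as in ROVER-style systems).

```python
import torch
from collections import Counter

def combine_frame_level(senone_log_probs):
    """Frame/senone-level fusion: average per-frame senone posteriors
    from several models (a list of (time, n_senones) log-prob tensors)."""
    probs = torch.stack([lp.exp() for lp in senone_log_probs]).mean(dim=0)
    return probs.log()

def combine_word_level(hypotheses):
    """Word-level fusion: a toy majority vote over pre-aligned,
    equal-length word sequences from different systems."""
    combined = []
    for words_at_position in zip(*hypotheses):
        combined.append(Counter(words_at_position).most_common(1)[0][0])
    return combined

# Example: three systems agree on most words but disagree on one
hyps = [["i", "want", "to", "fly", "home"],
        ["i", "want", "to", "fly", "rome"],
        ["i", "want", "to", "fly", "home"]]
print(combine_word_level(hyps))   # ['i', 'want', 'to', 'fly', 'home']
```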
In addition, Huang says the team strengthened the recognizer's language model by using the entire history of a dialog session to predict what is likely to come next, enabling the model to adapt to the topic and local context of a conversation.
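Again, the exact model is not described in the announcement; a minimal sketch of the underlying idea, assuming a simple recurrent language model whose hidden state is carried across utterances so that earlier turns in the session shape later next-word predictions:

```python
import torch
import torch.nn as nn

class SessionLM(nn.Module):
    """Illustrative session-aware language model (hypothetical sizes):
    the recurrent state is carried from one utterance to the next,
    so the whole dialog history conditions what comes next."""

    def __init__(self, vocab_size=10000, emb=256, hidden=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, tokens, state=None):        # tokens: (batch, seq_len)
        x = self.embed(tokens)
        x, state = self.rnn(x, state)
        return self.out(x), state                 # next-word logits, carried state

lm = SessionLM()
state = None                                      # empty session history
for utterance in [torch.randint(0, 10000, (1, 12)),
                  torch.randint(0, 10000, (1, 9))]:
    logits, state = lm(utterance, state)          # state accumulates the session
print(logits.shape)                               # torch.Size([1, 9, 10000])
```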
From GeekWire