Thanks to the work of three computer science students, Siri could one day sound less like a robot and more like a human.
In an effort to make virtual assistants more relatable, Stanford seniors Kate Park, Annie Hu, and Natalie Muenster developed technology that can detect humor in spoken language and respond with laughter. They describe their research in "Laughbot: Detecting Humor in Spoken Language with Language and Audio Cues," which garnered the Best Student Paper award at FICC 2018, the Future of Information and Communication Conference.
Conversational agents like Siri and Alexa are growing in popularity thanks to their ability to carry out basic tasks like playing music and scheduling calendar invites. But the students say that the agents are not living up to their full potential.
"We thought we could push them to the next level—humanize them and equip them with humor detection," says Park. "Imagine if Siri laughed at your jokes!"
The program they built is called Laughbot. When launched, it prompts a user to speak into the computer's microphone. The bot transcribes the recording using Google's speech application programming interface (API). It then runs the transcription and the original audio through a model called a recurrent neural network. This step is what Muenster calls "the meat of the project."
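To give a sense of the transcription step, here is a minimal sketch of how spoken audio can be turned into text with Google's Cloud Speech-to-Text Python client; the file name, encoding, and sample rate below are illustrative assumptions, not details from the students' code.

```python
# Sketch of the transcription step, assuming the google-cloud-speech
# Python client. File name, encoding, and sample rate are placeholders.
from google.cloud import speech

client = speech.SpeechClient()

# Read a short recorded utterance from disk.
with open("utterance.wav", "rb") as f:
    audio = speech.RecognitionAudio(content=f.read())

config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="en-US",
)

# Synchronous recognition returns one or more transcript alternatives;
# the top alternative would serve as the text input to the classifier.
response = client.recognize(config=config, audio=audio)
transcript = " ".join(
    result.alternatives[0].transcript for result in response.results
)
print(transcript)
```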
Neural networks are algorithms inspired by the way the brain functions; they enable a computer to learn a task by analyzing training examples. The network learns by finding patterns in the data that consistently correlate with a label, in this case funny or not funny. Typically, a neural network treats its input features as independent of one another. But to understand speech, such as a funny statement, a model needs to process words as a sequence, which is where recurrent neural networks come in.
"A neural network is powerful because it has this concept of weights that we multiply into each input to tell the network how important the input is," says Hu. "Recurrent neural networks differ because they have 'memory,' which lets them take into account feedback loops."
The students' research began last spring as a project for CS224S: Spoken Language Processing. After demonstrating Laughbot for their professors, Dan Jurafsky and Andrew Maas, the trio submitted their paper to FICC 2018 in Singapore, where they presented it in April and took home the Best Student Paper award.