Researchers at the University of Texas at Austin and Facebook AI Research have taught an artificial intelligence (AI) system to convert ordinary monaural (single-channel) sounds into nearly three-dimensional sound.
The researchers trained the AI to pick up visual cues that indicate a sound's originating direction; given a video of a scene and a mono recording, the machine learning system infers this direction and manipulates the interaural time and level differences to generate a spatialized effect for the listener.
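The researchers' system is a learned neural network, but the two cues the summary names are classic psychoacoustics: a sound arrives at the nearer ear slightly earlier (interaural time difference, ITD) and slightly louder (interaural level difference, ILD) than at the farther ear. Below is a minimal, hand-tuned sketch of those cues only, not the paper's model; the Woodworth ITD approximation, the sine panning law, and the head-radius value are illustrative assumptions.

```python
# Sketch: spatialize a mono signal into stereo using simple ITD/ILD cues.
# This is NOT the researchers' learned model; it only illustrates the
# interaural time and level differences the abstract refers to.
import numpy as np

def spatialize(mono, sample_rate, azimuth_deg):
    """Pan a mono signal using ITD/ILD; negative azimuth = left, positive = right."""
    az = np.radians(azimuth_deg)
    head_radius = 0.0875      # metres, assumed average head size
    speed_of_sound = 343.0    # m/s

    # Interaural time difference (Woodworth approximation):
    # the far ear hears the sound later by roughly r*(az + sin az)/c.
    itd = head_radius * (abs(az) + np.sin(abs(az))) / speed_of_sound
    delay = int(round(itd * sample_rate))  # delay in whole samples

    # Interaural level difference via a constant-power sine panning law;
    # real ILD is frequency dependent, which this sketch ignores.
    near_gain = np.sqrt(0.5 * (1.0 + abs(np.sin(az))))
    far_gain = np.sqrt(0.5 * (1.0 - abs(np.sin(az))))

    # Delay and attenuate the far-ear signal.
    delayed = np.concatenate([np.zeros(delay), mono])[: len(mono)]
    if azimuth_deg >= 0:   # source on the right: left ear is the far ear
        left, right = far_gain * delayed, near_gain * mono
    else:                  # source on the left: right ear is the far ear
        left, right = near_gain * mono, far_gain * delayed
    return np.stack([left, right], axis=1)  # shape (n_samples, 2)

# Example: a one-second 440 Hz tone placed 60 degrees to the right.
sr = 44100
t = np.arange(sr) / sr
stereo = spatialize(np.sin(2 * np.pi * 440 * t), sr, 60)
```

In the paper's approach, the direction is not supplied as an angle; the network infers it from the video frames and predicts the left/right difference signal directly.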
The researchers said, "We call the resulting output 2.5D visual sound—the visual stream helps 'lift' the flat single channel audio into spatialized sound."
The AI was trained on a database of about 2,000 video clips of musical performances recorded with binaural audio.
The researchers said their next step is "to explore ways to incorporate object localization and motion, and explicitly model scene sounds."
From Technology Review