
Communications of the ACM

ACM TechNews

AI Can Edit Video in Real Time to Sync Audio to People's Lips


Daniel Radcliffe as Harry Potter in the blockbuster movie series.

Dubbed films could look more realistic by matching actors' mouths to the sounds purportedly coming out of them.

Credit: Everett Collection Inc./Alamy

Researchers at India's International Institute of Information Technology (IIIT) have developed artificial intelligence (AI) that tweaks video footage in real time to sync audio to people's lips.

The team trained a generative adversarial network (GAN) on short video clips, with the algorithm marking out the shapes people's lips made as they spoke.

Given an audio track and video footage of someone talking, a generator AI adjusted the lip imagery to match the spoken words, while two discriminator AIs tried to distinguish real footage from fake.

One discriminator focused on the realism of the mouth shapes, penalizing the generator for mismatched sound and lip movements; the second checked visual quality, flagging glitches or unnatural artifacts around the mouth.
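As a rough illustration of this generator-plus-two-discriminators setup, the sketch below shows how such a training loop could be wired up in PyTorch. All module shapes, layer sizes, and names here are illustrative assumptions, not the IIIT team's actual architecture or code.

```python
# Illustrative sketch: one generator, two discriminators (lip sync + visual quality).
# Shapes and layer sizes are assumptions for demonstration only.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Predicts a mouth-region image from an audio feature and a face frame."""
    def __init__(self, audio_dim=128, img_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + img_ch * 32 * 32, 512), nn.ReLU(),
            nn.Linear(512, img_ch * 32 * 32), nn.Tanh())
    def forward(self, audio, face):
        x = torch.cat([audio, face.flatten(1)], dim=1)
        return self.net(x).view(face.shape)

class SyncDiscriminator(nn.Module):
    """Scores whether a mouth crop matches the audio (penalizes bad lip sync)."""
    def __init__(self, audio_dim=128, img_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(audio_dim + img_ch * 32 * 32, 256), nn.ReLU(),
            nn.Linear(256, 1))
    def forward(self, audio, mouth):
        return self.net(torch.cat([audio, mouth.flatten(1)], dim=1))

class QualityDiscriminator(nn.Module):
    """Scores visual realism of the mouth region (flags artifacts)."""
    def __init__(self, img_ch=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_ch * 32 * 32, 256), nn.ReLU(),
            nn.Linear(256, 1))
    def forward(self, mouth):
        return self.net(mouth.flatten(1))

gen, d_sync, d_qual = Generator(), SyncDiscriminator(), QualityDiscriminator()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(list(d_sync.parameters()) + list(d_qual.parameters()), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

# Dummy batch standing in for (audio feature, reference frame, real mouth crop).
audio = torch.randn(8, 128)
face = torch.randn(8, 3, 32, 32)
real_mouth = torch.randn(8, 3, 32, 32)
ones, zeros = torch.ones(8, 1), torch.zeros(8, 1)

# Discriminator step: real audio/mouth pairs labelled 1, generated pairs 0.
fake_mouth = gen(audio, face).detach()
loss_d = (bce(d_sync(audio, real_mouth), ones) + bce(d_sync(audio, fake_mouth), zeros)
          + bce(d_qual(real_mouth), ones) + bce(d_qual(fake_mouth), zeros))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: try to fool both discriminators at once, so the output must
# both match the audio and look visually plausible.
fake_mouth = gen(audio, face)
loss_g = bce(d_sync(audio, fake_mouth), ones) + bce(d_qual(fake_mouth), ones)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

The key design point is that the generator receives gradient signal from both discriminators, so it cannot trade lip-sync accuracy for image quality or vice versa.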

Because it was trained on human faces, the algorithm is more accurate on video footage of people than on computer-generated characters; the IIIT team hopes to use the AI for multilingual video dubbing.

From New Scientist

 

Abstracts Copyright © 2020 SmithBucklin, Washington, DC, USA


 
