Scientists at the Massachusetts Institute of Technology (MIT), the University of California, San Diego, and IBM developed VALHALLA, a machine learning method that hallucinates an image of what a source sentence describes, then uses both the text and the image to translate the sentence into a target language.
The researchers trained a neural network to focus on the key words and semantics of a sentence. One transformer generates a visual hallucination from the sentence, and a second transformer performs multimodal translation using the first transformer's outputs.
Training uses two text-image pairs for each source sentence: the sentence paired with a ground-truth image, and the same sentence paired with its hallucinated image.
The team compared VALHALLA to other state-of-the-art multimodal and text-only translation techniques, and quantified its performance across 13 tasks. VALHALLA improved text-only translation, and outperformed other methods as sentences became longer.
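The two-stage flow described above can be illustrated with a minimal Python sketch. It is not the researchers' implementation: the class `TwoStageTranslator`, the helper `training_pairs`, and the toy functions are hypothetical names chosen for illustration, and simple integer arithmetic stands in for the two transformers. It shows only the data flow, in which stage one produces a visual representation from the text and stage two translates from the text plus that representation.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Tokens = List[int]


@dataclass
class TwoStageTranslator:
    """Hypothetical wrapper: stage 1 hallucinates, stage 2 translates."""

    hallucinate: Callable[[Sequence[int]], Tokens]
    translate: Callable[[Sequence[int], Sequence[int]], Tokens]

    def __call__(self, source: Sequence[int]) -> Tokens:
        # Stage 1: build a visual representation from the text alone.
        visual = self.hallucinate(source)
        # Stage 2: translate using both the text and the stage-1 output.
        return self.translate(source, visual)


def training_pairs(
    source: Sequence[int],
    ground_truth_visual: Sequence[int],
    hallucinate: Callable[[Sequence[int]], Tokens],
) -> List[Tuple[Tokens, Tokens]]:
    """Return the two text-image pairs for one training sentence:
    (source, ground-truth image) and (source, hallucinated image)."""
    return [
        (list(source), list(ground_truth_visual)),
        (list(source), hallucinate(source)),
    ]


# Toy stand-ins for the two transformers (assumptions, not real models).
def toy_hallucinate(source: Sequence[int]) -> Tokens:
    return [token % 7 for token in source]


def toy_translate(source: Sequence[int], visual: Sequence[int]) -> Tokens:
    return [s + v for s, v in zip(source, visual)]


model = TwoStageTranslator(toy_hallucinate, toy_translate)
output = model([10, 11, 12])
pairs = training_pairs([10, 11, 12], [1, 2, 3], toy_hallucinate)
```

At inference time only the source sentence is needed, since the visual input is generated rather than supplied; the ground-truth image appears only in the training pairs.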
From MIT News
Abstracts Copyright © 2022 SmithBucklin, Washington, DC, USA