A team led by researchers at the Massachusetts Institute of Technology (MIT) demonstrated that machine learning models trained on synthetic images can outperform models trained on real images.
The approach, called StableRep, generates synthetic images with text-to-image models like Stable Diffusion and trains on them using a strategy known as "multi-positive contrastive learning," which treats multiple images generated from the same text prompt as positive examples of one another.
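In essence, every batch of images rendered from one caption reinforces the others. The paper's exact implementation is not reproduced here; the following is a minimal PyTorch sketch of the general idea, in which the function name, the temperature value, and the assumption that each prompt yields at least two images per batch are all illustrative:

```python
import torch
import torch.nn.functional as F

def multi_positive_contrastive_loss(embeddings, prompt_ids, temperature=0.1):
    """Contrastive loss where every pair of images generated from the same
    text prompt counts as a positive pair (a sketch, not the authors' code).

    embeddings: (N, D) image features; prompt_ids: (N,) prompt ID per image.
    Assumes each prompt in the batch produced at least two images.
    """
    z = F.normalize(embeddings, dim=1)
    logits = z @ z.t() / temperature                      # pairwise similarities
    self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(self_mask, -1e9)          # an image is not its own positive
    # Target distribution: uniform over the other images from the same prompt.
    positives = (prompt_ids.unsqueeze(0) == prompt_ids.unsqueeze(1)) & ~self_mask
    targets = positives.float() / positives.sum(dim=1, keepdim=True)
    # Cross-entropy between the similarity softmax and the multi-positive target.
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()
```

With a single positive per sample this reduces to the standard InfoNCE loss; the multi-positive target is what lets several synthetic renderings of one caption reinforce each other.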
StableRep lets researchers tune the generative model's "guidance scale," trading off the diversity of the synthetic images against their fidelity to the prompt.
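For illustration only, this is how one might sweep the guidance scale while generating several synthetic views of a single caption with the open-source diffusers library; the checkpoint name, prompt, and scale values are assumptions, not settings from the paper:

```python
import torch
from diffusers import StableDiffusionPipeline

# Illustrative checkpoint; the paper's exact model and version may differ.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompt = "a golden retriever playing in the snow"

# Lower guidance scales yield more diverse images; higher ones adhere
# more closely to the prompt (higher fidelity, less variety).
for guidance_scale in (2.0, 8.0):
    images = pipe(
        prompt,
        guidance_scale=guidance_scale,
        num_images_per_prompt=4,  # several candidate "positives" per caption
    ).images
    for i, img in enumerate(images):
        img.save(f"snow_dog_gs{guidance_scale}_{i}.png")
```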
The researchers also created StableRep+ by adding language supervision. Trained on 20 million synthetic images, StableRep+ proved more efficient than CLIP models trained on 50 million real images. However, the researchers acknowledged that the choice of text prompts is not completely bias-free.
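The "language supervision" in StableRep+ pairs each synthetic image with the caption that generated it, in the spirit of CLIP. Below is a generic sketch of such an image-text term, which could be combined with the multi-positive loss above; the symmetric InfoNCE form and the temperature are assumptions, not the paper's recipe:

```python
import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE between images and their captions: a generic
    CLIP-style objective, not the authors' implementation.

    image_emb, text_emb: (N, D) features; row i of each describes pair i.
    """
    zi = F.normalize(image_emb, dim=1)
    zt = F.normalize(text_emb, dim=1)
    logits = zi @ zt.t() / temperature           # image-to-text similarities
    labels = torch.arange(zi.size(0), device=zi.device)
    # Match each image to its caption and each caption to its image.
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))
```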
Said MIT's Lijie Fan, "Using the latest text-to-image models, we've gained unprecedented control over image generation, allowing for a diverse range of visuals from a single text input. This surpasses real-world image collection in efficiency and versatility."
From MIT News
Abstracts Copyright © 2023 SmithBucklin, Washington, D.C., USA