A neural network model developed by researchers at Canada's University of Waterloo enables computer-based analysis of text in 11 African languages.
The AfriBERTa model achieved output quality comparable to that of existing models while requiring significantly less training data: just one gigabyte of text.
The African languages covered by the model are considered low-resource, meaning little text data is available for training neural networks.
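As a minimal sketch of what working with such a model looks like, the snippet below loads a pretrained checkpoint with the Hugging Face Transformers library and extracts contextual token embeddings. The checkpoint name "castorini/afriberta_base" and the Swahili example sentence are assumptions for illustration, not details from the article.

```python
# Minimal sketch: loading AfriBERTa for feature extraction via Hugging Face
# Transformers. The checkpoint name below is an assumption for illustration.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL_NAME = "castorini/afriberta_base"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

# Example sentence in Swahili, one of the low-resource languages
# the article says the model covers (the sentence itself is illustrative).
text = "Habari ya leo?"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state holds one contextual vector per input token;
# these embeddings can feed downstream tasks such as text classification.
print(outputs.last_hidden_state.shape)
```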
University of Waterloo's Jimmy Lin explained that requiring less training data results in “lower carbon emissions associated with operating massive data centers.”
Lin added that using smaller datasets also makes data curation more practical, “which is one approach to reduce the biases present in the models.”
From Waterloo News (Canada)
Abstracts Copyright © 2021 SmithBucklin, Washington, DC, USA