Michigan State University (MSU) researchers have developed a computational technique that eases the bottlenecks that commonly occur when analyzing very large data sets.
The researchers note that microbial communities' genomic data is easy to collect, but the data sets are so large that they can overwhelm conventional computers. "To thoroughly examine a gram of soil, we need to generate about 50 terabases of genomic sequence--about 1,000 times more data than generated for the initial human genome project," says MSU professor C. Titus Brown.
He notes the strategy is unusual in that it was developed on small computers rather than the supercomputers typically used in bioinformatics research. The method uses a filter that compresses the data set into a special data structure, enabling computers to analyze small portions of the data at a time. The technique achieves a roughly 40-fold decrease in memory requirements, letting scientists sift through large volumes of sequence data without a supercomputer. A sketch of this style of filtering is shown below.
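The summary does not spell out the data structure, but the general approach is consistent with a Bloom-filter-style probabilistic filter that records which short subsequences (k-mers) have already been seen, so that redundant reads can be discarded as the data streams past. The sketch below is illustrative only and assumes that style of filtering; the names (KmerBloomFilter, keep_novel_reads), the novelty threshold, and the parameters are hypothetical and are not the researchers' actual code.

```python
import hashlib


class KmerBloomFilter:
    """Fixed-size bit array recording which k-mers have been seen.

    Memory use is set up front (num_bits), independent of how many reads
    are streamed through, at the cost of a small false-positive rate.
    """

    def __init__(self, num_bits=1_000_000, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, kmer):
        # Derive several bit positions from independent hashes of the k-mer.
        for i in range(self.num_hashes):
            digest = hashlib.sha1(f"{i}:{kmer}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(kmer))


def keep_novel_reads(reads, k=20, filt=None):
    """Stream reads, keeping only those whose k-mers are mostly unseen.

    Reads whose k-mers are already in the filter add little new information
    and are discarded, shrinking the data set that downstream analysis
    has to hold in memory.
    """
    filt = filt or KmerBloomFilter()
    for read in reads:
        kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
        novel = sum(1 for km in kmers if km not in filt)
        if kmers and novel / len(kmers) > 0.5:  # illustrative threshold
            for km in kmers:
                filt.add(km)
            yield read


if __name__ == "__main__":
    sample = ["ACGTACGTACGTACGTACGTACGT"] * 3 + ["TTTTGGGGCCCCAAAATTTTGGGG"]
    # The duplicate reads collapse to one; the distinct read is kept.
    print(list(keep_novel_reads(sample, k=8)))
```

Running the example keeps only one copy of the duplicated read while retaining the distinct one; it is this kind of reduction, applied to billions of reads, that makes the memory footprint small enough for ordinary computers.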
The researchers made the technique's source code publicly available to encourage others to extend it.
From MSU News
Abstracts Copyright © 2012 Information Inc., Bethesda, Maryland, USA