
Communications of the ACM

ACM TechNews

Massive Data For Miniscule Communities


C. Titus Brown, Michigan State University assistant professor in bioinformatics, has found a way to sift and analyze massive amounts of data on microbial communities.

Credit: MSU

Michigan State University (MSU) researchers have developed a computational technique that relieves the data-processing logjams that commonly occur with very large data sets.

The researchers note that genomic data from microbial communities is easy to collect, but the resulting data sets are so large that they can overwhelm conventional computers. "To thoroughly examine a gram of soil, we need to generate about 50 terabases of genomic sequence--about 1,000 times more data than generated for the initial human genome project," says MSU professor C. Titus Brown.

He notes the strategy is unique in that it was developed on small computers rather than the supercomputers typically used in bioinformatics research. The method applies a filter that compacts the data set into a special data structure, enabling computers to analyze small portions of the data at a time. The approach yields a 40-fold reduction in memory requirements, allowing scientists to sift through large volumes of data without a supercomputer.
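The article does not name the data structure, but compact filters of this kind are commonly built as Bloom filters over the k-mers (short, fixed-length subsequences) of the sequencing reads. The Python sketch below is a minimal, hypothetical illustration of that idea under that assumption; the class name, parameters, and the novelty threshold are illustrative choices, not the researchers' released code.

import hashlib


class KmerBloomFilter:
    """Minimal Bloom filter for DNA k-mers (illustrative parameters only).

    Membership is stored in a fixed-size bit array, so memory use stays
    constant no matter how many reads are processed, at the cost of a
    tunable false-positive rate.
    """

    def __init__(self, num_bits=10_000_000, num_hashes=4, k=20):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.k = k
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, kmer):
        # Derive several bit positions from salted hashes of the k-mer.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{kmer}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, kmer):
        for pos in self._positions(kmer):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, kmer):
        return all((self.bits[pos // 8] >> (pos % 8)) & 1
                   for pos in self._positions(kmer))


def kmers(read, k):
    # All overlapping k-length subsequences of a read.
    return [read[i:i + k] for i in range(len(read) - k + 1)]


# Toy usage: keep only reads whose k-mers are mostly new to the filter,
# discarding redundant reads before any heavyweight analysis.
bf = KmerBloomFilter()
reads = [
    "ACGTACGTACGTACGTACGTACGT",
    "ACGTACGTACGTACGTACGTACGT",  # exact duplicate, mostly seen already
    "TTTTGGGGCCCCAAAATTTTGGGG",
]
kept = []
for read in reads:
    read_kmers = kmers(read, bf.k)
    already_seen = sum(km in bf for km in read_kmers)
    if already_seen < 0.8 * len(read_kmers):  # hypothetical novelty threshold
        kept.append(read)
    for km in read_kmers:
        bf.add(km)
print(kept)  # the duplicate read is dropped; the two distinct reads remain

A Bloom filter can occasionally report a k-mer as present when it is not, but it never misses one it has stored, which is what lets memory stay fixed as the data grows; the actual memory savings and error trade-offs depend on parameters the article does not give.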

The researchers made the technique's source code publicly available to encourage others to extend it.

From MSU News 

Abstracts Copyright © 2012 Information Inc., Bethesda, Maryland, USA 


 
