Pacific Northwest National Laboratory (PNNL) researchers have used Graph Engine for Multithread Systems (GEMS), a multilayer software framework for querying graph databases, to customize commodity, distributed-memory high-performance computing (HPC) clusters and apply graph algorithms to large-scale data sets on those clusters.
The researchers say incorporating GEMS takes advantage of HPC query solutions and makes results more predictable.
In a comparison with alternative approaches, GEMS provided noticeable speedups, especially with larger data sets, according to the researchers. Data mining through graph methods using the GEMS framework resulted in more efficient use of space and added performance by exploiting graph parallelism.
GEMS' optimization process identifies the best execution plan among several candidates, according to a cost model and a cardinality estimator. It then performs data-flow and call-graph analysis to better exploit task-level parallelism, reduce data movement, and limit the memory footprint of data structures. This process also identifies specific sequences of basic operations that can be combined into more efficient complex operations.
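The plan-selection step described above can be illustrated with a toy sketch. This is not GEMS code: the triple patterns, the statistics, the fixed join selectivity, and the function names are all hypothetical, chosen only to show how a cost model and a cardinality estimator together can rank candidate execution plans.

```python
from itertools import permutations

# Hypothetical statistics: estimated number of triples matching each pattern.
PATTERN_CARDINALITY = {
    "?p type Person": 1_000_000,
    "?p worksAt ?org": 200_000,
    "?org locatedIn Richland": 50,
}

# Assumed fixed selectivity for every join (a stand-in for a real estimator).
JOIN_SELECTIVITY = 0.001


def estimate_cardinality(left_rows, pattern):
    """Estimate rows produced by joining an intermediate result with a pattern."""
    return left_rows * PATTERN_CARDINALITY[pattern] * JOIN_SELECTIVITY


def plan_cost(order):
    """Toy cost model: total size of all intermediate results for a join order."""
    rows = PATTERN_CARDINALITY[order[0]]
    cost = rows
    for pattern in order[1:]:
        rows = estimate_cardinality(rows, pattern)
        cost += rows
    return cost


def best_plan(patterns):
    """Enumerate candidate join orders and keep the cheapest one."""
    return min(permutations(patterns), key=plan_cost)


if __name__ == "__main__":
    for step, pattern in enumerate(best_plan(PATTERN_CARDINALITY), start=1):
        print(step, pattern)
```

Under these made-up numbers the cheapest plan starts with the most selective pattern, which keeps intermediate results small. A real optimizer would use measured statistics and avoid enumerating every ordering.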
"GEMS clearly represents a promising solution to tackle the 'too big' challenge as it already is able to process data in the scale of 10 billion triples, which is prohibitive for most available systems," says PNNL researcher Vito Giovanni Castellana.
From Pacific Northwest National Laboratory
Abstracts Copyright © 2015 Information Inc., Bethesda, Maryland, USA