According to researchers working on the Global Address Space Programming Interface (GASPI), several critical programming elements must be addressed to ensure system reliability as developers construct programs that scale to hundreds of thousands of cores and beyond.
GASPI is based on remote completion and targets highly scalable dataflow implementations on distributed-memory architectures. Failure tolerance and robust execution are achieved through timeouts in all non-local procedures of the GASPI application programming interface (API). GASPI also supports passive communication and mechanisms for global atomic operations: low-level primitives such as compare-and-swap or fetch-and-add can be applied to any data in GASPI's RDMA memory segments.
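As an illustration of how timeouts and global atomics combine, the following sketch uses the C bindings of the GPI-2 reference implementation of GASPI; the segment id, offset, and 1000-millisecond timeout are illustrative choices, not values prescribed by the specification.

#include <GASPI.h>
#include <stdio.h>

int main(void)
{
  /* Every non-local GASPI call takes a timeout; GASPI_BLOCK waits
     indefinitely, while a millisecond value bounds the operation. */
  if (gaspi_proc_init(GASPI_BLOCK) != GASPI_SUCCESS)
    return 1;

  /* One pre-pinned RDMA segment (id 0, 1 MiB), visible to all ranks
     and zero-initialized. */
  gaspi_segment_create(0, 1 << 20, GASPI_GROUP_ALL,
                       GASPI_BLOCK, GASPI_MEM_INITIALIZED);

  /* Atomically add 1 to the counter at offset 0 on rank 0 and fetch
     the previous value, giving up after 1000 ms instead of blocking. */
  gaspi_atomic_value_t old_value;
  gaspi_return_t ret =
    gaspi_atomic_fetch_add(0 /* segment */, 0 /* offset */,
                           0 /* target rank */, 1, &old_value,
                           1000 /* timeout in ms */);

  if (ret == GASPI_TIMEOUT)
    fprintf(stderr, "fetch-add timed out; the caller may retry or recover\n");
  else if (ret == GASPI_SUCCESS)
    printf("previous counter value: %lu\n", (unsigned long)old_value);

  gaspi_proc_term(GASPI_BLOCK);
  return 0;
}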
The GASPI collectives rely on time-based blocking with flexible timeout parameters. The researchers also note that the GASPI API is very flexible and offers full control over the underlying network resources and the pre-pinned GASPI memory segments.
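The time-based blocking of collectives might look like the sketch below, again using the GPI-2 C bindings; retrying a collective that returns GASPI_TIMEOUT by re-invoking it with the same arguments is one plausible recovery pattern, not mandated behavior.

#include <GASPI.h>

/* Sum one double across all ranks, bounding each attempt to 500 ms.
   A GASPI collective that times out can be called again with the
   same arguments to continue until it completes. */
gaspi_return_t bounded_sum(double local, double *global)
{
  gaspi_return_t ret;
  do {
    ret = gaspi_allreduce(&local, global, 1, GASPI_OP_SUM,
                          GASPI_TYPE_DOUBLE, GASPI_GROUP_ALL,
                          500 /* timeout in ms */);
  } while (ret == GASPI_TIMEOUT);
  return ret;
}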
GASPI enables the heterogeneous memory of modern supercomputers to be mapped onto dedicated memory segments, and also allows multiple memory management systems and/or multiple applications to coexist in the same Partitioned Global Address Space.
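One way to read that design is sketched below: each memory kind gets its own segment id, and gaspi_segment_ptr yields the local base address on which an application can run its own allocator. The id-to-memory-kind mapping here is purely illustrative; binding a segment to a specific memory type would rely on an implementation's segment-allocation extensions, which this sketch does not show.

#include <GASPI.h>

/* Illustrative ids: one segment per memory kind a node might expose. */
enum { SEG_DDR = 0, SEG_HBM = 1 };

void setup_segments(void)
{
  /* Each segment is a separately managed, pre-pinned region of the
     partitioned global address space. */
  gaspi_segment_create(SEG_DDR, 1 << 26, GASPI_GROUP_ALL,
                       GASPI_BLOCK, GASPI_ALLOC_DEFAULT);
  gaspi_segment_create(SEG_HBM, 1 << 24, GASPI_GROUP_ALL,
                       GASPI_BLOCK, GASPI_ALLOC_DEFAULT);

  /* Local pointers into each segment; separate memory managers (or
     separate coexisting applications) can each own one region. */
  gaspi_pointer_t ddr_base, hbm_base;
  gaspi_segment_ptr(SEG_DDR, &ddr_base);
  gaspi_segment_ptr(SEG_HBM, &hbm_base);
  (void)ddr_base; (void)hbm_base;
}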
From HPC Wire
Abstracts Copyright © 2013 Information Inc., Bethesda, Maryland, USA