Supercomputing is evolving toward hybrid and accelerator-based architectures with millions of cores. The Hardware/Hybrid Accelerated Cosmology Code (HACC) framework exploits this diverse landscape at the largest scales of problem size, obtaining high scalability and sustained performance. Developed to satisfy the science requirements of cosmological surveys, HACC melds particle and grid methods using a novel algorithmic structure that flexibly maps across architectures, including CPU/GPU, multi/many-core, and Blue Gene systems. In this Research Highlight, we demonstrate the success of HACC on two very different machines, the CPU/GPU system Titan and the BG/Q systems Sequoia and Mira, attaining very high levels of scalable performance. We demonstrate strong and weak scaling on Titan, obtaining up to 99.2% parallel efficiency, evolving 1.1 trillion particles. On Sequoia, we reach 13.94 PFlops (69.2% of peak) and 90% parallel efficiency on 1,572,864 cores, with 3.6 trillion particles, the largest cosmological benchmark yet performed. HACC design concepts are applicable to several other supercomputer applications.
Cosmological surveys are our windows to the grandest of all dynamical systems, the Universe itself. Scanning the sky over large areas and to great depths, modern surveys have brought us a remarkably simple, yet mysterious, model of the Universe, whose central pillars, dark matter and dark energy, point to new and even more fundamental discoveries. The pace of progress continues unabated: the next generation of sky surveys demands tools for scientific inference that far exceed current capabilities to extract information from observations.
The already important role of cosmological simulations is expected to undergo a sea change as the analysis of surveys moves to an approach based entirely on forward models of the underlying physics, models that also encompass the complex details of survey measurements. Such an end-to-end paradigm, based on the ability to produce realistic "universes" on demand, will stress the available supercomputing power to its limits.