
Communications of the ACM

ACM Careers

Deep Learning Stretches Up to Scientific Supercomputers


[Image: the Cori supercomputer]

The team achieved a peak rate between 11.73 and 15.07 petaflops (single-precision) when running its data set on the Cori supercomputer.

Credit: NERSC

Machine learning, a form of artificial intelligence, enjoys unprecedented success in commercial applications. However, the use of machine learning in high performance computing for science has been limited. Why? Advanced machine learning tools weren't designed for big data sets, like those used to study stars and planets. A team from Intel, the National Energy Research Scientific Computing Center (NERSC), and Stanford changed that. They developed a 15-petaflop deep-learning system and demonstrated its ability to handle large data sets via test runs on the Cori supercomputer.

Using machine learning techniques on supercomputers, scientists could extract insights from large, complex data sets. Powerful instruments, such as accelerators, produce massive data sets. The new software could enable the world's largest supercomputers to apply deep learning to data at that scale. The resulting insights could benefit Earth systems modeling, fusion energy, and astrophysics.

The system is described in "Deep Learning at 15PF: Supervised and Semi-Supervised Classification for Scientific Data," published in the Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis.

Machine learning techniques hold potential for enabling scientists to extract valuable insights from large, complex data sets being produced by accelerators, light sources, telescopes, and computer simulations. While these techniques have had great success in a variety of commercial applications, their use in high performance computing for science has been limited because existing tools were not designed to work with the terabyte- to petabyte-sized data sets found in many science domains.

To address this problem, a collaboration among Intel, NERSC, and Stanford University has been working to solve problems that arise when using deep learning techniques, a form of machine learning, on terabyte and petabyte data sets. The team developed the first 15-petaflop deep-learning software. They demonstrated its scalability for data-intensive applications by executing a number of training runs using large scientific data sets. The runs used physics- and climate-based data sets on Cori, a supercomputer located at NERSC. They achieved a peak rate between 11.73 and 15.07 petaflops (single-precision) and an average sustained performance of 11.41 to 13.47 petaflops.
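The general technique behind scaling training runs across many supercomputer nodes is data parallelism: each node computes gradients on its own shard of the data, and the gradients are averaged across nodes at every step. The sketch below is a minimal illustration of that idea only; it is not the team's implementation (the paper describes its own update schemes and optimized software stack), and the toy least-squares model, variable names, and hyperparameters here are all stand-ins chosen for brevity.

```python
# Minimal sketch of synchronous data-parallel training with MPI.
# NOT the system from the paper; just the core scaling idea:
# each rank trains on a local data shard, and an allreduce
# averages gradients so every rank applies the same update.

import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, nranks = comm.Get_rank(), comm.Get_size()

# Each rank generates a different shard (stand-in for real science data).
rng = np.random.default_rng(seed=rank)
X = rng.normal(size=(1024, 64)).astype(np.float32)
true_w = np.ones(64, dtype=np.float32)
y = X @ true_w + 0.1 * rng.normal(size=1024).astype(np.float32)

w = np.zeros(64, dtype=np.float32)  # weights replicated on every rank
lr = 0.01

for step in range(100):
    # Local gradient of a least-squares loss on this rank's shard.
    grad_local = 2.0 * X.T @ (X @ w - y) / len(X)

    # Sum gradients across all ranks, then divide to get the average.
    grad_global = np.empty_like(grad_local)
    comm.Allreduce(grad_local, grad_global, op=MPI.SUM)
    grad_global /= nranks

    # Identical update on every rank keeps the replicas in sync.
    w -= lr * grad_global

if rank == 0:
    print("final loss on rank 0:", float(np.mean((X @ w - y) ** 2)))
```

Launched with, for example, `mpiexec -n 4 python train_sketch.py`, the same script runs on any number of ranks; scaling it to thousands of nodes is where the engineering challenges the paper addresses (communication cost, convergence at large batch sizes) come in.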

This research used resources at NERSC, a U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research user facility.


