Communications of the ACM

ACM TechNews

World's Fastest Supercomputer Can't Run a Day without Failure


Frontier supercomputer at Oak Ridge National Laboratory

Initially promised to come online in 2022, the Frontier supercomputer is still not officially deployed.

Credit: U.S. Department of Energy

Oak Ridge National Laboratory's Frontier supercomputer experiences multiple hardware failures every day.

Frontier, which has not yet been officially deployed, is designed to deliver up to 1.685 FP64 exaFLOPS of peak performance using AMD's 64-core EPYC Trento processors, Instinct MI250X compute graphics processing units, and HPE's Slingshot interconnect, while drawing 21 MW of power.

"We are working through issues in hardware and making sure that we understand [what they are]," says Oak Ridge Leadership Computing Facility's Justin Whitt. "You are going to have failures at this scale. Mean time between failure on a system this size is hours; it's not days."

From Tom's Hardware

 

Abstracts Copyright © 2022 SmithBucklin, Washington, DC, USA


 
