Northeastern University's Greg Kerr has developed a process that lets supercomputers built on InfiniBand networking save their data partway through a computation, so that a crash or bug that forces a restart no longer wipes out the progress already made.
Kerr says the system is scalable and can be used on small computer clusters as well as the most advanced supercomputers. "This is the networking technology behind some of the world's largest computers, and yet the number of people who understand the internals of the InfiniBand technology is very small, largely because it is relatively new," says Northeastern professor Gene Cooperman.
Kerr's work could allow other researchers to complete large calculations on sophisticated computers more efficiently. This summer, Cooperman plans to apply Kerr's process to computations run on Oak Ridge National Laboratory's supercomputers. "I think we're close," Kerr says. "We've got the main points proven and now we need the summer to iron everything out and work out the bugs."
From Northeastern University News
Abstracts Copyright © 2011 Information Inc., Bethesda, Maryland, USA