
Communications of the ACM

ACM News

Illinois Students Build 33-Teraflop Cluster From GPUs



For Tyler Takeshita, helping to construct a supercomputer was like meeting a familiar friend in person for the first time. Takeshita, a graduate student in chemistry at the University of Illinois, has been interested in computers from a young age and belongs to a computational research group led by chemistry professor Thom Dunning, director of the National Center for Supercomputing Applications (NCSA). He felt that to take his knowledge to the next level, he needed to delve right inside the system.

"It was very helpful to get hands-on experience," Takeshita says. "It was almost like putting a face to the name when you have an idea of what things look like and how they fit together."


Takeshita, along with around 15 other students, participated in building a supercomputer to compete for a spot on the Green500 list—a ranking of the most energy-efficient supercomputers in the world. The project is part of an independent study course led by Bill Gropp, the Bill and Cynthia Saylor Professor of Computer Science, and Wen-mei Hwu, the AMD Jerry Sanders Chair of Electrical and Computer Engineering. NCSA's Mike Showerman provided cluster-building expertise and assistance.

"The idea is kind of original because it's not trying to use a big machine that consumes a lot of energy to be really fast," says computer science major Chengyin Liu. "We were interested in efficient power consuming. I found the idea very interesting."

According to Showerman, the first step was selecting the perfect equipment. In this case, that was Nvidia's C2050 graphics-processing unit (GPU).

While GPUs were originally developed to render graphics, today they are being adopted as computational accelerators; with the proper software adaptations, some codes can run substantially faster on the GPUs' many-core architecture. Even as the students were building their cluster, China announced that its GPU-CPU hybrid machine, Tianhe-1A, was the fastest supercomputer in the world.

Nvidia donated 128 C2050 units to the Illinois CUDA Center of Excellence, led by Hwu, and Nvidia research scientist Sean Treichler spent time on campus helping to plan and build the cluster. QLogic also donated a portion of the interconnect.

Donating the technology seemed like a great opportunity for Nvidia, Treichler says.

"We are interested in researching power-efficient computing," he says. "And working with students was great—they have a lot of good ideas that aren't mainstream. They often have new ideas for old problems."

Treichler says the students had some good ideas when it came to connecting everything together in a creative way. To save money and reduce the cluster's footprint, the team used nontraditional materials, like wood and Plexiglass, to mount the motherboards.

The students worked through three long sessions in the new National Petascale Computing Facility to build the cluster, which is currently called ECOG but might be rechristened "Green Street" in honor of one of the campus's main thoroughfares. They also needed to configure all of the settings and check each memory card, processor, and cable for problems. Only then could they run benchmark code to test the cluster's performance, plus another test to gather power-usage data, by the deadline for the Top500 list, a ranking of the fastest supercomputers in the world. In the end, the team recorded performance of 33.6 teraflops (33.6 trillion floating-point operations per second) and 938 megaflops per watt.
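For readers who want to relate the two reported figures, the short Python sketch below divides the performance by the efficiency to estimate the power draw they imply. It assumes both numbers describe the same measurement, which the article does not confirm; the function name is purely illustrative, and the result is a rough estimate rather than a figure reported by the team.

```python
def implied_power_watts(teraflops: float, megaflops_per_watt: float) -> float:
    """Estimate power draw (watts) from performance and efficiency.

    Assumption: both inputs come from the same measurement. The article
    reports the two figures but does not say they were taken together.
    """
    flops = teraflops * 1e12                   # teraflops -> flops
    flops_per_watt = megaflops_per_watt * 1e6  # megaflops/W -> flops/W
    return flops / flops_per_watt


# Figures reported in the article: 33.6 teraflops, 938 megaflops per watt.
power_w = implied_power_watts(33.6, 938.0)
print(f"Implied power draw: {power_w / 1000:.1f} kW")
```

Under that assumption the two figures correspond to a draw on the order of 36 kilowatts for the whole cluster.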

The students' system was ranked 403 on the latest Top500 list announced Sunday (Nov. 14). The latest Green500 charts will be released Thursday (Nov. 18).

Apart from the competitions, Showerman feels the project was a success. He says the University will likely continue to have cluster-building classes in coming years.

"There is a value in learning—that's what higher education is all about," he says. "But is it valid to leave the University knowing things but not having the ability to do anything? This project was very unique in that it gave students access to something few people ever have the opportunity to do."

Now that it's built, the cluster will continue to be a teaching and research tool.

"We will use the cluster to conduct studies on how real applications may need to be adapted to run well on such power-constrained systems," says Hwu. "This will likely be the norm in exascale computers in the next decade."

Showerman says there are also people interested in using the cluster for physics and chemistry applications. "We are already running some physics codes on the system with good performance results," he says.


 
