The most significant recent development in gaming and graphics-intensive computing has been the availability of supercomputer-level graphics on commodity-priced graphics processing units (GPUs). Leapfrogging earlier limits, Nvidia's Quadro2 Pro GPU delivers 31 million polygons per second at a fill rate of over one billion texture-mapped pixels per second, which makes it the world's fastest GPU by some measures. This speed comes despite a raft of new hardware operations including multiple shadings per pixel, multitexturing, transform and lighting, bump mapping and other compute-intensive operations not found until recently in animated PC graphics. The new Nvidia chips are so effective they have found outside applications as the graphical heart of Microsoft's new Xbox game player and at the core of the multifunction avionics display for the F-22 fighter aircraft.
The really big graphics applications like Caves, Power Walls (high-resolution wall-size displays), and open-canopy flight simulators, however, rely not on a single GPU but on systems built from many graphics pipelines working together in parallel. The top-of-the-line Silicon Graphics Onyx 3800, for example, manages up to 16 simultaneous graphics pipelines for an upper limit of 210 million polygons per second and a fill rate of 12 billion pixels per second. Achieving coherent graphics on this scale requires elaborate coordination and management to rebalance loads among pipelines at the crucial transition in processing between object parallelism and image parallelism. The SGI InfiniteReality architecture designs graphics pipelines to accept broadcast primitives at just this point. General-purpose GPUs designed for PC graphics and embedded applications do not have the luxury of this "sort-middle architecture" because they are designed to work as self-contained standalone units.
Making the transition from competing with desktop workstations to taking on refrigerator-sized, rack-mounted capital investments requires solving the problem of parallel processing using commodity GPUs. Because the graphics pipelines themselves are inaccessible and closely tied to other parts of the PC architecture, the most convenient approach lies in clustering PCs to achieve "sort-first" machine-wise parallelization. While this architecture is limited by network bandwidth and is less efficient than the "sort-middle" approach of high-end graphics vendors, the huge cost savings and the speed advantage of commodity GPUs make it hard to resist.
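The "sort-first" idea can be made concrete with a small sketch. It assumes a tiled display in which each render PC owns one rectangular region of the screen, and it assigns each triangle to every tile that its screen-space bounding box overlaps. The names (Tile, make_tiles, sort_first) and the bounding-box test are illustrative assumptions, not the method of any system named in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Screen region owned by one render node (hypothetical layout)."""
    node: int
    x0: int
    y0: int
    x1: int  # exclusive right edge
    y1: int  # exclusive bottom edge


def make_tiles(width, height, cols, rows):
    """Split a width x height display into cols x rows tiles, one per node."""
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append(Tile(
                node=r * cols + c,
                x0=c * width // cols,
                y0=r * height // rows,
                x1=(c + 1) * width // cols,
                y1=(r + 1) * height // rows,
            ))
    return tiles


def screen_bbox(triangle):
    """Axis-aligned bounding box of a triangle given as three (x, y) points."""
    xs = [p[0] for p in triangle]
    ys = [p[1] for p in triangle]
    return min(xs), min(ys), max(xs), max(ys)


def sort_first(triangles, tiles):
    """Bucket each triangle by the tiles its bounding box overlaps.

    The test is conservative: a triangle may be sent to a tile that its
    bounding box touches even if the triangle itself does not.
    """
    buckets = {t.node: [] for t in tiles}
    for tri in triangles:
        bx0, by0, bx1, by1 = screen_bbox(tri)
        for t in tiles:
            if bx0 < t.x1 and bx1 >= t.x0 and by0 < t.y1 and by1 >= t.y0:
                buckets[t.node].append(tri)
    return buckets


# Two side-by-side projectors, each 1024 x 768 (assumed sizes).
tiles = make_tiles(2048, 768, cols=2, rows=1)
left = ((10, 10), (100, 10), (10, 100))
right = ((1500, 10), (1600, 10), (1500, 100))
straddling = ((1000, 300), (1100, 300), (1050, 400))
buckets = sort_first([left, right, straddling], tiles)
```

A triangle that straddles a tile boundary is sent to both nodes, which is one reason network bandwidth becomes the limit as the number of tiles grows.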
We are aware of two such solutions. WireGL (now Chromium) [1], developed at Stanford University, is a general-purpose system for scalable interactive rendering on a cluster of workstations. WireGL replaces the OpenGL libraries, which allows OpenGL calls to be parallelized and rendered through the cluster. Reported performance of 70 million triangles per second for a 32-PC configuration (16 compute and 16 rendering nodes) compares well with the achievable performance estimate of 29 million polygons per second for a 16-pipeline SGI InfiniteReality configuration. Using PCs and data projectors found in most labs, WireGL can allow researchers to experiment with previously rare and expensive large-scale high-resolution displays.
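The comparison above can be broken down per rendering unit with simple arithmetic. The sketch below uses only the figures quoted in the text; the variable names are illustrative. The result is a rough guide at best, since one figure is a reported triangle rate and the other an estimated polygon rate, and the two are not strictly the same unit.

```python
# Figures quoted in the text; the names are illustrative only.
WIREGL_TRIANGLES_PER_SEC = 70e6  # reported, 32-PC cluster
WIREGL_RENDER_NODES = 16
WIREGL_TOTAL_PCS = 32
SGI_POLYGONS_PER_SEC = 29e6      # achievable estimate, 16 pipelines
SGI_PIPELINES = 16


def per_unit(total_per_sec, units):
    """Average throughput contributed by each unit."""
    return total_per_sec / units


per_render_node = per_unit(WIREGL_TRIANGLES_PER_SEC, WIREGL_RENDER_NODES)
per_pc = per_unit(WIREGL_TRIANGLES_PER_SEC, WIREGL_TOTAL_PCS)
per_pipeline = per_unit(SGI_POLYGONS_PER_SEC, SGI_PIPELINES)
cluster_advantage = WIREGL_TRIANGLES_PER_SEC / SGI_POLYGONS_PER_SEC

print(f"WireGL per rendering node: {per_render_node / 1e6:.2f} M triangles/s")
print(f"WireGL per PC (all 32):    {per_pc / 1e6:.2f} M triangles/s")
print(f"SGI per pipeline:          {per_pipeline / 1e6:.2f} M polygons/s")
print(f"Cluster vs. SGI overall:   {cluster_advantage:.2f}x")
```

Counting all 32 PCs rather than only the 16 rendering nodes halves the per-machine figure, so the choice of denominator matters when cost per unit of throughput is the question.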
The same hardware can be used to run CaveUT (described in this issue by Jacobson and Hwang) to create an immersive panoramic Cave-like display. CaveUT follows the same cluster rendering approach as WireGL but takes advantage of the Unreal engine's networking and graphics synchronization capabilities to do the processing. Because game players' views must be computed independently by clients, the opportunity for load balancing is lost. Nevertheless, given lightweight communication protocols, the power of the new commodity GPUs and gigahertz-plus CPUs, display refresh rates have become the limiting factor.
1. Humphreys, G., Eldridge, M., Buck, I., Stoll, G., Everett, M., and Hanrahan, P. WireGL: A scalable graphics system for clusters. In Proceedings of the 2001 Conference on Computer Graphics (Los Angeles, CA, 2001), 129–140.
©2002 ACM 0002-0782/02/0100 $5.00