One Giant Leap (Continued)


'40 Days and 40 Nights'
Back in Chippewa Falls, SGI's Dick Harkness and his team of 200 were ready. While NASA briefed Congress and the Office of Management and Budget on the Columbia concept, SGI's manufacturing facility prepared workers to adapt to new processes. Manufacturing flows were completely transformed to accommodate faster, more efficient builds. SGI's factory personnel worked "40 days and 40 nights" to meet production demands, says Harkness. Assembly and QA of 512-processor Altix systems - until then a rare and involved event - quickly became a streamlined and easily repeatable manufacturing process.

Another challenge for SGI: Squeezing more than 10,000 processors into NASA's supercomputing room in Mountain View, Calif., meant Columbia had to incorporate eight 512-processor nodes made from a new high-density, high-bandwidth version of the SGI Altix 3000 system, the Bx2.

"There simply wasn’t room on the floor for 20 traditional Altix nodes," says Bill Thigpen, NASA's Columbia project manager. "We needed eight nodes to be half the size of the original Altix 3000 systems for us to get all the hardware in the room."

For SGI, that spelled a challenge, since the Bx2 hadn’t even achieved engineering release when Columbia plans were cemented. Indeed, based on typical parts delivery schedules, SGI would have received only raw Bx2 parts by the time the finished systems were due at NASA. But SGI's engineering and manufacturing groups joined forces to deliver eight Altix Bx2 systems weeks ahead of schedule.

The team met an even greater challenge: accelerating by four months the delivery of optional water-cooled doors, the first ever offered on anything other than a Cray product. The doors keep the denser Bx2 nodes from overheating as they operate amid 12 air-cooled Altix nodes. "The water-cooled doors were crucial to this installation," recalls Thigpen. "This wouldn’t have worked without them."

'Some Kind of Record'

Side view of Columbia supercomputer
When installation was completed Oct. 12, Columbia became the world’s most advanced Linux supercomputer.

The 19 new Altix nodes joined the Kalpana system at NASA Ames beginning in late June, and with them came a 440-terabyte SGI® InfiniteStorage solution to help NASA store and manage terabytes of new data generated every day.

For those on site, the rally continued, says Bill Thigpen, NASA's Columbia project manager. "Here we were, pulling out old systems and installing new ones, replumbing our water cooling system, and literally reconfiguring the floor on the fly, and meanwhile we had a large community of users who needed access to our systems every day."

According to NASA, the installation of the Altix nodes themselves was surprisingly easy. "It's phenomenal how quickly this combined team was bringing the nodes up and providing them to users to do real science," says Thigpen. "We had people from throughout NASA and several universities using the first installations within a week of having them hit the floor."

Jim Taft agrees. "In some cases, a new Altix was in production in as little as 48 hours. This is starkly different from implementations of architectures not based on the SGI architecture, which can take many months to bring to a reliable state and ready for science."

Japan's 5,120-processor Earth Simulator, for instance, wasn't fully usable for more than four years after inception of the project. "Imagine what you've lost in that time," says Taft, "not only in productivity, but in processor obsolescence as well. You're generations behind the curve before you even get started."

For those who drove the Columbia effort, however, the achievement symbolizes much more than mere teraflops, or even supercomputing superiority. "This effort created a powerful national resource," notes SGI CEO Bob Bishop. "This is a story about opportunity and drive and a willingness to stand up to the seemingly impossible - and make it happen. With the building of this great system completed, the work that will be performed will literally make our world and our universe safer for mankind. What could be more important than that?"



Image courtesy of NASA