Headlines at Hopkins
News Release

Office of News and Information
Johns Hopkins University
901 South Bond Street, Suite 540
Baltimore, Maryland 21231
Phone: 443-287-9960 | Fax: 443-287-9920

December 8, 2008
CONTACT: Lisa De Nike
(443) 287-9960

JHU-led Team Wins Supercomputing Storage Challenge

A computer facility that could eventually handle enough data to fill 1 billion diskettes has won the Storage Challenge at SC08, the annual International Conference for High Performance Computing, Networking, Storage and Analysis. The competition was recently held in Austin, Texas.

Designed by a team led by computer scientist and theoretical astrophysicist Alexander Szalay of the Johns Hopkins University, the GrayWulf system (named in honor of Szalay's friend and collaborator, the late Jim Gray of Microsoft Research) combines inexpensive hardware and software into a single innovative platform that can analyze and process petabyte-scale data sets. (A petabyte is equal to 1,000 terabytes, or one quadrillion bytes. Facebook users, for instance, have stored one petabyte, or about 10 billion photos, on the social networking site.)

According to Szalay, GrayWulf will enable scientists to quickly and efficiently search through massive amounts of data to locate and identify patterns that will lead to new discoveries.

"GrayWulf, built from simple, inexpensive components, was consciously designed to sift discoveries at a rate much higher — and a cost much lower — than anyone ever thought possible," said Szalay, Alumni Centennial Professor in the Henry A. Rowland Department of Physics and Astronomy at JHU. "It will help researchers do science directly in the database, teasing out relationships within areas such as astrophysics, hydrodynamic turbulence, environmental sensor networking and even, potentially, global climate change."

The winning team included several other scientists and staff members from Johns Hopkins, as well as experts from Microsoft Corp., the University of Illinois at Chicago, the University of Hawaii and Dell Inc.

In the competition, GrayWulf sifted through information gathered as part of the Sloan Digital Sky Survey to locate quasars (distant astronomical objects characterized by changing brightness) in 12 minutes, a search that took other computing systems 13 days.

"GrayWulf is significant because the archetypal scalable design supports the new paradigm of data-intensive computing," said Tony Hey, corporate vice president for Microsoft External Research. "Built on the pioneering database work of Microsoft researcher Jim Gray, GrayWulf is a tool that will drive scientific discovery and innovation by giving scientists the power to efficiently process and analyze massive amounts of data."

Szalay said that events such as the SC08 competition do more than simply bestow bragging rights on the winners; they also influence how future science and engineering research, both of which are generating tremendous data sets, will be done. He added that tools successfully developed for data-intensive science in one field, such as astrophysics, can be generalized and applied to other fields, resulting in cross-disciplinary pollination.

According to team member and JHU information technology system administrator Alainna Wonders, the prize also serves as recognition of the enormous amount of work that the team has done in designing GrayWulf.

"We've proven that databases are an effective way to manage large amounts of data for scientific research," said Wonders.

Funding for GrayWulf was provided by the Gordon and Betty Moore Foundation, Microsoft Research and the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS).

Related Web sites
> Alex Szalay
> International Conference for High Performance Computing
> Access to the Sloan Digital Sky Survey database