NSF announces plan for universities to create global data grid

September 25, 2001

GAINESVILLE, Fla. — Scientists at 40 universities and research institutions on four continents will get access to more computing power than currently available at the world’s top research centers under an ambitious initiative led by the University of Florida.

The National Science Foundation announced today (9/25) that UF will lead a consortium of 15 universities and four national laboratories in a $13.65 million bid to build the International Virtual Data Grid Laboratory, or iVDGL. By seamlessly connecting an international network of powerful computers at 40 locations in the United States, Europe and Asia, the grid will allow scientists worldwide to view and analyze the huge amounts of data flowing from experiments in high-energy and nuclear physics, gravitational waves, astronomy, biology and other areas.

The announcement comes a year after the NSF provided $11.9 million for the Grid Physics Network, or GriPhyN, which launched the basic computing research that will underpin the construction and operation of the far-reaching grid. UF also leads that effort, with the University of Chicago acting as co-leader. The Particle Physics Data Grid, or PPDG, a Department of Energy-funded grid, will also provide needed resources.

“This grant gives us the wherewithal to build a truly global facility,” said Paul Avery, the project’s principal investigator and a UF physics professor. “To operate the grid, we’ll use the software developed by GriPhyN, PPDG and other projects and take advantage of new supercomputing resources and ultra high-speed networks linking the United States and Europe.”

The grid — an early version of which is expected to be online next year — might be imagined as an electric utility grid for next-generation computer users.

As with the utility grid, it will tap into computing power at multiple locations to bring ultra-powerful computing “service” to widely distributed consumers. While utility grids often end at national borders, the computer grid will have no such limitation. It will reach into Europe and Asia through partners in England, Italy, Japan and other countries. The grid will link to other current and planned grids, including the European Union DataGrid, and will be able to draw on resources from U.S. supercomputing facilities during times of peak demand.

“Our laboratory will be the largest grid ever built, in terms of number of sites, geographical distribution and data capacity,” Avery said. “The iVDGL will link the resources of dozens of universities throughout the world, plus several national laboratories in the U.S., Europe and Asia, into a single computational engine.”

Avery said the grid will have at its core thousands of computers equipped with Intel Pentium processing units. Each university or research center will install about 100 of these computers on-site, each using the public, open-source Linux operating system and communicating with other grid members over ultra high-speed national and international networks.

The computing power expected to be generated through the grid is staggering, Avery said. The grid will be capable of handling quantities of data measured in petabytes, where one petabyte is 1 million gigabytes, or roughly the amount of data contained on 100,000 personal computer hard drives. Eventually, the computational speed could be measured in petaflops, where one petaflop equals one thousand trillion calculations per second. The grid will be powerful enough for hundreds of users worldwide to run jobs simultaneously, although truly huge processing jobs may use the entire grid.
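
The figures quoted here can be checked with simple arithmetic. The short Python sketch below is only an illustration, not part of the project: it works out the hard-drive size implied by the petabyte comparison and spells out the petaflop figure. The machine count at the end rests on an assumption the article does not state, namely that every one of the 40 sites installs about 100 computers.

```python
# Back-of-the-envelope check of the figures quoted in the article.
GIGABYTES_PER_PETABYTE = 1_000_000    # "one petabyte is 1 million gigabytes"
DRIVES_PER_PETABYTE = 100_000         # "100,000 personal computer hard drives"
FLOPS_PER_PETAFLOP = 1_000 * 10**12   # "one thousand trillion calculations per second"

# Capacity of a single hard drive implied by the comparison, in gigabytes.
implied_drive_gb = GIGABYTES_PER_PETABYTE / DRIVES_PER_PETABYTE

# Assumption (not stated in the article): all 40 sites install about 100 machines each.
SITES = 40
MACHINES_PER_SITE = 100
total_machines = SITES * MACHINES_PER_SITE

print(f"Implied drive size: {implied_drive_gb:.0f} GB")
print(f"One petaflop: {FLOPS_PER_PETAFLOP:.0e} calculations per second")
print(f"Approximate machines on the grid: {total_machines}")
```

Under these assumptions the comparison implies drives of about 10 gigabytes each, and the site count is consistent with the "thousands of computers" Avery describes.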

Such a powerful information distribution network is needed because 21st-century physics, biology, astronomy and engineering increasingly depend on the ability to manage and access huge, or hugely complex, quantities of data, said Harvey Newman, a professor of physics at the California Institute of Technology and one of the leaders of the iVDGL effort.

For example, scientists using high-energy colliders to probe the origins of matter must record the effects of billions of proton collisions per year. Biologists or chemists studying proteins, meanwhile, work with exceedingly complex data gathered from many types of experiments. Among other large-scale experiments, the grid will be a computing resource for the Laser Interferometer Gravitational-Wave Observatory and for CERN, the world’s largest particle physics center, located near Geneva.

“The scale of the iVDGL, and the close connections that we have established with international scientific endeavors, also makes it a unique laboratory for computer science research,” said Ian Foster, co-director of the project, a professor of computer science at the University of Chicago and associate director of the Mathematics and Computer Science Division of Argonne National Laboratory.

The unique ability of the grid to tap resources regardless of location has benefits beyond scientific data analysis. “Faculty and students at small colleges and universities will for the first time be equal partners in international research,” said Manuela Campanelli, a professor of physics at the University of Texas at Brownsville, and leader of the iVDGL outreach effort. “Students will gain a sense of excitement from being able to participate in forefront research.”

The participants in the U.S.-funded grid are diverse, including three predominantly minority universities: Hampton University, Salish Kootenai College and the University of Texas at Brownsville.

The others are the University of Florida, the University of Chicago, the California Institute of Technology, the University of California San Diego, Indiana University, Boston University, the University of Wisconsin at Milwaukee and at Madison, Pennsylvania State University, Johns Hopkins University, Northwestern University and the University of Southern California.

Participating national laboratories are Fermi National Accelerator Laboratory, Brookhaven National Laboratory, Argonne National Laboratory and Stanford Linear Accelerator Center. Many other universities and laboratories in Europe, Asia and Australia also will take part.