The vision of scientific computing in the future relies on computational grids–powerful processors, research instruments, and huge data archives linked by fast networks and advanced software. These grids will be as easy to use as the Web and as convenient as getting water from a faucet. In a tour de force of massively parallel computation, SDSC, the National Center for Supercomputing Applications (NCSA) at the University of Illinois, Argonne National Laboratory, and the Max Planck Institute for Gravitational Physics in Potsdam, Germany, collaborated in a grid computing demonstration that brings that vision one step closer to reality.
In April, researchers in Germany and the United States ran three enormous relativistic astrophysics simulations at SDSC and NCSA. "We used the Cactus Computational Toolkit to compute the evolution of gravitational waves according to Einstein’s theory of General Relativity," said Thomas Dramlitsch, the researcher at the Max Planck Institute who coordinated the run. "Since the experimental proof of the existence of gravitational waves is a major challenge in theoretical and experimental physics and a truly exact computation of these waves is still not possible, due to insufficient computational power, making such large scale simulation runs routine is very important for us."
The runs were the largest simulations involving Einstein’s General Relativity equations to date, according to Ed Seidel, an astrophysicist at the Max Planck Institute and NCSA and head of the research team based in Germany. Each four-hour run of the Cactus code package was set up to use 1,500 processors spread across three supercomputers at NCSA and one at SDSC and linked across the continent by an OC-12 network running data at 622 megabits per second. NCSA used 480 processors of three SGI Origin2000 computers. SDSC used 1,020 processors of Blue Horizon.
In addition to the Cactus Toolkit, two other pieces of advanced software made the distributed-simulation run feasible. One was Globus, a toolkit for programming grid computing systems and the basic software infrastructure for systems that integrate geographically distributed computational and information resources. (See story on page 2.)
The second software tool that enabled the runs was MPICH-G2, a grid-enabled implementation of the Message Passing Interface (MPI) version 1.1. MPI is a standard for coordinating the processes of applications run on parallel supercomputers, and MPICH-G2 allows MPI applications to run on multiple computer systems at the same time, including machines of different architectures with different scheduling systems.
"We ran each of these very large simulations as a single Globus job, and they performed very well. Best of all, even though the code had been scaled up to run on 1,500 processors and utilized a long-distance high-performance network connection, it executed at better than 70 percent efficiency," said John Towns, director of NCSA’s Scientific Computing division.
The simulation calculated by the Cactus researchers involved the propagation of gravitational waves. According to General Relativity, violent events such as colliding black holes emit large amounts of gravitational radiation, which, although predicted for a century, has not yet been seen. With the advent of new detection technology, scientists hope within the next several years to detect gravitational waves resulting from the collision and merger of two black holes. Such collisions are rare events, and it is important for scientists to recognize their "signatures" when they do occur.
The relativity simulation run across SDSC and NCSA was done without modifying the physics code; other codes inserted into the Cactus framework could be run in this manner as well. The Cactus code originated in the academic research community and is an open-source computational science toolkit that can tackle complex 3-D simulations, from the effects of General Relativity to chemical reactor flows.
"The Cactus code should be viewed as a framework for all kinds of numerical simulations," Dramlitsch said. "It is useful not only in the theoretical physics of gravitational waves or in astrophysical simulations of cosmology, neutron stars, black holes, and so on, but also in hydrodynamics, quantum mechanics, and other fields. All the capabilities built into Cactus that allow it to do our General Relativity runs can be used by other codes almost immediately."
The modular structure of Cactus encourages both parallel computation across different machine architectures and collaborative code development among different groups. Cactus provides easy access to many cutting-edge software technologies, including Globus, HDF5 parallel file I/O, the PETSc scientific library, adaptive mesh refinement, Web interfaces, and advanced visualization tools.
"Although we didn’t model collisions of black holes on this particular run, we proved what we could do with such distributed simulations–if we had regular access to such a machine," said Seidel. "We could run scenarios at least five times larger than we’ve ever done before! All of our proven, tested routines would actually run quite well in such an environment."
Max Planck Institute for Gravitational Physics
Carlos O. Lousto
Max Planck Institute for Gravitational Physics