Simulations
on high-performance computers allow molecular biologists to discover
how proteins work and predict how drugs might be designed to target
diseases. Until recently, the most advanced simulations were limited
to proteins of, at most, 50,000 atoms. Today, thanks to advances
in algorithms and NPACI's Blue Horizon, biochemists and mathematicians
at UCSD have performed calculations on cellular structures with
more than 1 million atoms. In the near future, these researchers
may be able to examine structures consisting of tens of millions
of atoms and begin to understand the fundamental forces that drive
cellular functions. Such revolutionary advances across the physical
and life sciences will be possible with the installation of the
TeraGrid.

Figure 1. Layout of the Distributed Terascale Facility. While
the nodes of the DTF are geographically distributed, the
integrated hardware will make the system behave as one colossal
computing machine.

In early August, the
National Science Foundation (NSF) awarded $53 million to four
U.S. research institutions to build and deploy a distributed terascale
facility (DTF). The DTF will be the largest, most comprehensive
infrastructure ever deployed for scientific research. It will
compute at more than 13.6 teraflops (trillions of calculations
per second) while simultaneously managing more than 450 terabytes
(trillions of bytes) of data. This awesome, data-intensive computing
power will be connected by a cross-country fiber-optic backbone
16 times faster than today's fastest research networks. All
of these components will be tightly integrated into an information
infrastructure dubbed the TeraGrid.

"Breakthrough
discoveries in fields from genomics to astronomy depend critically
on computational and data management infrastructure as a first-class
scientific tool," said Fran Berman, director of NPACI and
SDSC and one of the two principal investigators of the TeraGrid
award (see story, p. 1). "The TeraGrid recognizes the increasing
importance of data-oriented computing and connection of data archives,
remote instruments, computational sites, and visualization over
high-speed networks. The TeraGrid will be a far more powerful
and flexible scientific tool than any single supercomputing system."

NATIONWIDE COMPUTING FABRIC

The four research institutions
in the DTF project are SDSC, the National Center for Supercomputing
Applications (NCSA) at the University of Illinois at Urbana-Champaign,
Caltech, and Argonne National Laboratory. Each institution has
played and will continue to play a key role in the NSF's
Partnerships for Advanced Computational Infrastructure (PACI)
program. This program is charged with meeting the expanding needs
of the U.S. academic community for high-end information technologies.
SDSC is the leading-edge site for NPACI, and Caltech is a key
NPACI partner. NCSA leads the National Computational Science Alliance
(Alliance), and Argonne is a major Alliance partner.

The partnership expects
to work primarily with IBM, Intel Corporation, and Qwest Communications
to build the facility, along with Sun Microsystems, Myricom, and
Oracle Corporation.

"Nothing like the DTF has ever been attempted
before. This will be the largest, most comprehensive infrastructure
ever deployed for open scientific research," said Dan Reed,
director of NCSA and the Alliance and a principal investigator
of the TeraGrid award. "Unprecedented amounts of data are
being generated by new observatories and sensors, and groups of
scientists are conducting new simulations of increasingly complex
phenomena. This new age of science requires a sustainable national
infrastructure that can bring together new tools, powerful computers,
and the best minds in the country. This is the national infrastructure
that will allow us to solve the most pressing scientific problems
of our time."

The DTF will consist
primarily of clustered IBM servers based on Intel Itanium-family
processors connected with Myricom's Myrinet. Linux clusters
purchased through the DTF award and distributed across the four
DTF sites will total 11.6 teraflops of computing power. In addition,
two 1-teraflops Linux cluster systems already in use at NCSA will
be integrated into the DTF system, creating the 13.6-teraflops
system, the most powerful distributed computing system ever.
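
The headline figure follows from simple addition of the numbers quoted above. As a minimal sketch (the constant and function names are illustrative labels, not anything from the DTF project), the aggregate peak can be checked like this:

```python
# Peak-performance figures quoted in the article, in teraflops.
# All names here are hypothetical labels chosen for this sketch.
PURCHASED_CLUSTERS_TFLOPS = 11.6             # Linux clusters bought through the DTF award
EXISTING_NCSA_CLUSTERS_TFLOPS = [1.0, 1.0]   # two 1-teraflops systems already at NCSA


def aggregate_peak_tflops(purchased: float, existing: list[float]) -> float:
    """Sum new and existing cluster peaks, rounded to one decimal place."""
    return round(purchased + sum(existing), 1)


dtf_peak = aggregate_peak_tflops(PURCHASED_CLUSTERS_TFLOPS, EXISTING_NCSA_CLUSTERS_TFLOPS)
print(f"Aggregate DTF peak: {dtf_peak} teraflops")
```

These are peak ratings; the performance an application actually sustains is typically lower and depends on the workload.
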
Besides the world's fastest unclassified supercomputers,
the DTF's hardware and software will include ultra-high-speed
networks, high-resolution visualization environments, and toolkits
for grid computing. Scientists and industry researchers across
the country will be able to tap into this infrastructure to solve
scientific problems.

"The distributed
terascale facility will be a tremendous national resource,"
said NSF Director Rita Colwell. "With this innovative facility,
NSF will demonstrate a whole new range of capabilities for computer
science and fundamental scientific and engineering research, setting
high standards for 21st Century deployment of information technology."

The clusters will operate
as a single distributed facility, linked via a dedicated optical
network that will initially operate at 40 gigabits per second
and later be upgraded to 50-80 gigabits per second. The DTF network,
developed in partnership with Qwest, will transport data 16 times
faster than the fastest research networks now in operation. It
will connect to Abilene, the high-performance network that links
more than 180 research institutions across the country; STAR TAP,
an interconnect point in Chicago that provides access to and from
international research networks; and CENIC's CalREN-2, an
advanced high-speed network that connects institutions in California.
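
To put the quoted network rates in perspective, here is a back-of-the-envelope sketch of ideal transfer times. It assumes a decimal terabyte (10^12 bytes), a fully utilized link, and no protocol overhead, so real transfers would take longer. The 2.5-gigabit baseline is not stated in the article; it is only what results from dividing the initial 40 gigabits per second by the quoted factor of 16. All names are illustrative.

```python
TERABYTE_BYTES = 1e12          # assumption: decimal terabyte
DTF_INITIAL_GBPS = 40.0        # initial DTF backbone rate quoted in the article
QUOTED_SPEEDUP = 16.0          # "16 times faster" than existing research networks

# Implied rate of the fastest existing research networks (derived, not quoted).
implied_baseline_gbps = DTF_INITIAL_GBPS / QUOTED_SPEEDUP


def transfer_seconds(num_bytes: float, gigabits_per_second: float) -> float:
    """Ideal time to move num_bytes over a link, ignoring all overhead."""
    bits = num_bytes * 8
    return bits / (gigabits_per_second * 1e9)


for label, rate in [("DTF backbone", DTF_INITIAL_GBPS),
                    ("implied baseline", implied_baseline_gbps)]:
    minutes = transfer_seconds(TERABYTE_BYTES, rate) / 60
    print(f"{label}: {rate} Gb/s, about {minutes:.1f} minutes per terabyte")
```

Under these idealized assumptions, a terabyte moves in a few minutes on the DTF backbone versus most of an hour at the implied baseline rate, which illustrates why the network matters as much as raw computing speed for data-intensive work.
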
In Illinois, the I-WIRE optical network will provide the DTF with
network capacity and will give Argonne and NCSA additional bandwidth
for related network-research initiatives.

INFORMATION TECHNOLOGY TRIATHLON

The DTF architecture
demonstrates that the TeraGrid has been designed with much more
than sheer computing performance in mind. High-performance systems
have traditionally been designed for the computing equivalent
of a 100-meter sprint: the more flops (floating-point operations
per second), the better. But the TeraGrid is targeting an information
technology triathlon: huge amounts of online data storage
and network bandwidth as well as speedy computing performance.

To ensure that the
DTF achieves its full potential, each of the four sites will play
a unique role in the project. SDSC will lead the TeraGrid data
and knowledge management effort by deploying a data-intensive
IBM Linux cluster based on Intel Itanium-family processors. This
system will have a peak performance of just over 4 teraflops and
225 terabytes of network disk storage. In addition, a next-generation
Sun Microsystems high-end server will provide a gateway to grid-distributed
data for data-oriented applications.

NCSA will lead the
TeraGrid project's computational aspects with an IBM Linux
cluster powered by the next generation of Intel Itanium processors,
code-named McKinley. The cluster's peak performance will
be 8 teraflops, combining the DTF-funded systems and other NCSA
clusters, with 240 terabytes of secondary storage.

Caltech will focus
on providing online access to very large scientific data collections
and will facilitate access to those data by connecting data-intensive
applications to components of the TeraGrid. Caltech will deploy
a 0.4-teraflops IBM Itanium-family processor cluster and an IA-32
cluster that will manage 86 terabytes of online storage.

"An exciting prospect
for the TeraGrid is that, by integrating simulation and modeling
capabilities with collection and analysis of huge scientific databases,
it will create a computing environment that unifies the research
methodologies of theory, experiment, and simulation," said
Paul Messina, director of Caltech's Center for Advanced Computing
Research and a TeraGrid co-principal investigator.

Argonne will lead the
effort to deploy advanced distributed computing software, high-resolution
rendering and remote visualization capabilities, and networks.
This effort will require a 1-teraflops IBM Linux cluster with
parallel visualization hardware.

ENABLING NEXT-GENERATION SCIENCE

"Supercomputing
traditionally has been associated with weather and aircraft design,"
said Rick Stevens, director of the Mathematics and Computer Science
Division at Argonne National Laboratory and TeraGrid co-principal
investigator. "Recent breakthroughs in chemistry and the
life sciences, however, have presented an even greater demand
for advanced computation. If we are to achieve the performance
necessary to support these new applications, we must develop capabilities
to harness the collective power of not only dozens of supercomputers,
but also thousands of individual PCs. The DTF will provide critical
insight into building such systems, while immediately enabling
new classes of science."

The TeraGrid will enable
scientists to answer the next generation of science questions.
Researchers will be able to swiftly look for meaningful relationships
across scientific disciplines to gain powerful new insights into
everything from human diseases and climate change to earthquake
prediction and the evolution of the universe. DH

Principal Investigators
Fran Berman, SDSC
Dan Reed, NCSA

Co-Investigators
Rick Stevens and Ian Foster, ANL
Paul Messina, Caltech