By Mike Gannis
PROJECT LEADER
Philip Papadopoulos
SDSC
PARTICIPANTS
Greg Bruno,
Mason Katz,
Bill Link,
David McIntosh, Federico Sacerdoti
SDSC
Philip Buonadonna, Brent Chun,
David Culler,
Eric Fraser,
Albert Goto,
Matt Massie
Millennium Group,
UC Berkeley
Laurence Liew,
Najib Ninaba
Linux Competency
Centre in SCS Enterprise Systems Pte. Ltd.,
Singapore
Putchong Uthayopas
The Open Scalable Cluster Environment, Thailand
Researchers with the National Partnership for Advanced Computational
Infrastructure (NPACI) at the San Diego Supercomputer Center
(SDSC) marked significant milestones this spring for the NPACI
Rocks toolkit, which enables colleges, universities, research
institutes, and other organizations to easily set up and manage
powerful yet inexpensive cluster computers. As of June 26,
the peak processing speed of 110 cluster computers administered
by users of the NPACI Rocks software suite exceeded 17.6 teraflops
(trillion floating point operations per second). In June 2003,
four clusters administered with NPACI Rocks (at Dell Computer Corp., Hong Kong Baptist University, Stanford University, and Scripps Institution of Oceanography) were included in the Top500 list of the world's most powerful supercomputers.
Researchers around the world are using that computing power
to develop new nanomaterials, model population dynamics, predict
urban water supplies, and perform many other research and
education functions.
"The aggregate power of the clusters running NPACI Rocks
is in the same league as the largest supercomputer systems
in the world," said Philip Papadopoulos, program director
for SDSC's Grid and Cluster Computing group. "The
number of systems and processors demonstrates the acceptance
of NPACI Rocks in the user community."
Commodity clusters based on PC-type processors (Beowulf clusters)
provide impressive power, considering their low cost. However,
managing the clusters (ensuring that all of the nodes have a consistent set of software when patches and new versions of the operating system, utilities, and tools are released) can burden system administrators. Unfortunately, the cost of not managing a cluster can be even higher if security holes and known bugs in the system software are not patched.
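To make that housekeeping concrete, the short Python sketch below shows the kind of consistency audit an administrator would otherwise run by hand: it compares the packages installed on each compute node against a reference node. The node names, the use of passwordless ssh, and the reliance on rpm -qa on Red Hat-based nodes are assumptions made purely for illustration; this is not part of the Rocks toolkit itself.

```python
#!/usr/bin/env python
"""Minimal sketch (not part of NPACI Rocks) of the audit a cluster
administrator would otherwise perform by hand: compare the packages
installed on every compute node against a reference node. Node names
and passwordless ssh are assumptions for illustration."""

import subprocess

def installed_packages(host):
    """Return the set of RPM packages installed on a Red Hat-based node."""
    result = subprocess.run(["ssh", host, "rpm", "-qa"],
                            capture_output=True, text=True, check=True)
    return set(result.stdout.split())

reference = "compute-0-0"                              # hypothetical reference node
others = ["compute-0-1", "compute-0-2", "compute-0-3"]  # hypothetical compute nodes

baseline = installed_packages(reference)
for node in others:
    drift = baseline.symmetric_difference(installed_packages(node))
    status = "consistent" if not drift else "%d packages differ" % len(drift)
    print("%s: %s" % (node, status))
```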
"SDSCs Cluster
Group and UC Berkeleys Millennium
group began to work together on the NPACI clusters project
three years ago," Papadopoulos said. "Our constant
goal has been to make clusters easy to deploy, manage, upgrade,
and scale."
The following summaries illustrate the variety of ways in
which clusters built with NPACI Rocks are advancing scientific
research:
Stanford University
The Bio-X
Project is Stanford's set of university-wide collaborations
in basic, applied, and clinical sciences. The program brings
together engineering, physics, chemistry, and the information
sciences with biology and medicine to foster discoveries and
inventions. The Schools of Engineering, Medicine, Humanities
and Sciences, and Earth Sciences teamed up to form the new
program.
The project has deployed a Pentium 4 cluster called Iceberg
with 302 dual-processor nodes; it runs at more than 3.3 teraflops.
"When I evaluated NPACI Rocks, it seemed to answer all
my questions and delivered the solution that met everyone's
needs," said Bio-X system administrator Steve Jones.
"I can attest to the ease of management by using the
Rocks distribution. We now are considering our next cluster;
at this point, we think we will go with a 600-node dual-processor
system, twice the size of our existing one. Given the successes
that we've had so far, NPACI Rocks will of course be
what we use to manage the next evolution of Iceberg."
Singapore Computer Systems
The Linux Competency Centre at Singapore
Computer Systems (SCS-LCC) has set up a new 60-processor
Itanium 2 cluster for the Singapore-MIT Alliance (SMA) at
the National University of Singapore, its third cluster running
NPACI Rocks. The new "Hydra III" cluster supports
projects ranging from computational fluid dynamics to bio-engineering.
About 50 SMA researchers and post-graduate students use the
system. The cluster consists of 15 HP rx5670 nodes, each with
four Itanium 2 processors, and is interconnected with a high-bandwidth, low-latency Myrinet switching system.
Hydra III achieves 240 gigaflops, which is about 70 percent
of theoretical peak processing power.
"The team took less than a day to install the cluster
with Rocks and get the cluster operational," said Laurence
Liew, a principal investigator at SCS. "This is a testimony
to the amount of work that has gone into making Rocks one
of the best and easiest to use cluster toolkits in the world."
"SCS Linux Competency Centre collaborates closely with
SDSC on NPACI Rocks and provides critical support in the areas
of file systems and queuing systems," said SDSC's
Papadopoulos. "The Rocks user community benefits greatly
from SCS's expertise and its significant contributions
to this community toolkit."
"We are very pleased with the performance and ease of
management of the Rocks-based Itanium 2 cluster," said
Khoo Boo Cheong, a professor and program co-chair of High
Performance Computation for Engineered Systems at SMA. "We
intend to encourage more researchers to migrate to Hydra III
over the next few months."
Scripps Institution of Oceanography,
UC San Diego
The Digital Image Analysis
Lab's PIPE (Parallel Image Processing Environment)
cluster supports a wide variety of Earth systems science studies
and applications, including those involving Earth-observing
satellite data analysis, NASA's Direct Broadcast program,
snow hydrology and accurate water supply prediction, climate
studies (especially related to the cryosphere), sea ice, airborne
volcanic ash detection, agricultural applications, and basic
and applied atmospheric and remote sensing sciences. The cluster
supports a variety of disciplines in addition to the Earth sciences: mathematics, computer science, spatial statistics, signal analysis, electrical engineering, neural network analysis, and other classification methods.
PIPE will be upgraded from 74 Pentium 4 Xeon processors, with
a speed of 355 gigaflops (billion floating point operations
per second), to 98 processors in 2003. The current system
has 98 gigabytes of memory, 10 terabytes of disk storage,
and a 12-terabyte SDLT tape library. "Rocks has been
exceptionally helpful in our cluster implementation,"
said Jim Simpson, the group's principal investigator.
"The price-performance ratio we are getting out of our
Rocks cluster is staggering," said Tim McIntire, lead
programmer and system administrator. "We are able to
process, in several hours, massive satellite data sets that used to require over a month."
University of Texas at Austin
The Center for Subsurface Modeling (CSM) in the Institute
for Computational Engineering and Science (ICES) at the University
of Texas at Austin uses the 90-processor "Bevo"
cluster to model the behavior of fluids such as petroleum
and water in permeable geologic formations and in shallow
bodies of water.
"NPACI rocks is a fantastic way to administer a cluster,"
said ICES networking systems analyst Ari Berman. "The
tools included in this package are full-featured and make
life for a system administrator easier. The most valuable
aspect of Rocks is that the node installs are disposable,
and reinstallation is extremely fast and easy. Additionally,
the NPACI Rocks community has been extremely supportive and
can usually answer any question via a mailing list."
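Berman's point about disposable node installs can be sketched briefly. The hypothetical Python wrapper below, assumed to run on the cluster frontend, simply triggers a reinstall of each problem node through a reinstall utility such as Rocks' shoot-node; treat the exact command name and the node names as assumptions of the sketch rather than a documented interface.

```python
#!/usr/bin/env python
"""Hypothetical sketch of treating compute nodes as disposable: rather
than repairing a misbehaving node by hand, trigger a full reinstall and
let it come back in a known-good state. Assumes it runs on the Rocks
frontend and that a reinstall utility (shoot-node) is on the PATH."""

import subprocess

def reinstall(node):
    # Ask the frontend to reboot the node and rebuild it from the cluster's
    # installation description, so nothing on the node needs to be preserved.
    subprocess.run(["shoot-node", node], check=True)

for node in ["compute-0-2", "compute-1-5"]:   # hypothetical problem nodes
    print("reinstalling %s ..." % node)
    reinstall(node)
```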
The 44-processor "Jupiter" cluster at UTs
Institute for Advanced
Technology is used to model and assess systems which are
characterized by mechanically, thermally coupled electromagnetic
diffusive processes with moving conductors such as pulsed
rotating power supplies and electromagnetic launchers.
The 16-processor Pluto cluster at UT's Institute for
Advanced Technology is used to model ballistic and shock physics
problems, and to analyze hypervelocity impact physics. "The
Pluto cluster has had hardware and software errors of every
kind," said system administrator Jared Hodge. "Finally,
using Rocks and good systems administration procedures, we
were able to track down the last of the hardware and software
problems."
Hong Kong Baptist
University
In early 2003, a PC cluster computing facility was established at Hong Kong Baptist University (HKBU) with
the support of Teaching Development Grants from the Hong Kong
University Grants Committee, Dell Corporation (Hong Kong),
Intel Corporation (Hong Kong), and the university’s
Faculty of Science. The purpose of the facility is to provide
teaching and research opportunities in parallel and distributed
computing for students from various academic institutions
in Hong Kong. Students learn how to compute in a parallel
environment in laboratory demonstrations and exercises. Undergraduate
students are using the PC cluster to work on senior-year projects.
Research groups of professors, visiting scholars, and Ph.D.
students are using the cluster to carry out projects in molecular
modeling, Monte Carlo methods, statistical physics, and other
areas.
“We had used Rocks successfully in a
16-node fast Ethernet Pentium III cluster,” said system
administrator Morris M.M. Law. “When setting up our
new 64-node gigabit Ethernet P4-Xeon cluster we were very
pleased to be able to install Rocks on the master and slave
nodes within two hours. We hope that Rocks can help us bring
PC-cluster computing into the mainstream of our teaching and
research here at HKBU.”
University of
Macedonia, Greece
Students and staff in the Parallel
Distributed Processing Laboratory within the Department
of Applied Informatics at the University
of Macedonia use the 65-processor “Electra”
cluster for performance modeling of scientific computations
on heterogeneous systems, fault tolerance in scientific computations,
parallel evolutionary computations, approximate string matching,
and non-linear optimization. The server is also connected
to the university’s backbone network, providing remote
job submission and monitoring
through the Internet from the lab’s Web site.
The nodes are slow, outdated PCs (100 to 233
MHz Pentium chips) that had been withdrawn from regular office
and lab use. The interconnection network is a two-level tree
structure of 100 Mbps Ethernet switches. The operating system
software is based on Red Hat Linux, NPACI Rocks, and the Ganglia
Toolkit.
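Because monitoring on Electra comes from the Ganglia Toolkit, the sketch below shows one way such data can be read programmatically: Ganglia's gmond daemon publishes an XML snapshot of all hosts and metrics over a plain TCP socket (port 8649 by default). The frontend host name and the choice of the load_one metric are assumptions of this example, not details given in the article.

```python
#!/usr/bin/env python
"""Sketch of reading cluster state from Ganglia's gmond daemon, which
emits an XML report of all hosts and metrics when a client connects.
Host name, port, and metric name follow Ganglia defaults but should be
treated as assumptions of this example."""

import socket
import xml.etree.ElementTree as ET

def ganglia_snapshot(host="frontend.example.edu", port=8649):
    """Read the complete XML report that gmond sends on connection."""
    chunks = []
    with socket.create_connection((host, port)) as conn:
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
    return ET.fromstring(b"".join(chunks))

root = ganglia_snapshot()
for node in root.iter("HOST"):
    metrics = {m.get("NAME"): m.get("VAL") for m in node.iter("METRIC")}
    print("%-20s load_one=%s" % (node.get("NAME"), metrics.get("load_one", "n/a")))
```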
“Before using NPACI Rocks we developed
two smaller clusters with 8 to 16 nodes,” said system
administrator Bill Stefanidis. “In retrospect, we think
it would be impossible to implement a 65-node cluster without
the help of Rocks, which proved quite easy to install, configure,
use, and extend.”
University of South Carolina
The laboratory
of Kevin Higgins, an assistant professor in the Department
of Biological Sciences at the University of South Carolina,
uses a cluster for computational research in population biology,
to simulate the genetic and demographic mechanisms controlling
the extinction of natural populations. A primary research
goal is to identify the types of wildlife that may be particularly
vulnerable to declines in population fitness due to the accumulation
of deleterious mutations.
South Carolina's "Extinction Machine" cluster
consists of 97 dual-processor AMD Athlon nodes with a peak
speed of 671 gigaflops connected by a Hewlett-Packard ProCurve
5372xl switch. "The Rocks parallel computing environment
is extremely stable and provides all the tools needed for
true high performance computing," said Higgins.
Texas Tech University
The High Performance Computing
Center at Texas Tech currently manages and maintains four
Beowulf clusters that use NPACI Rocks clustering software.
Some researchers at the university use the clusters for serial and parallel computational chemistry calculations, while other
researchers calculate orbital forces between atoms.
Texas Tech has incorporated the clusters running NPACI Rocks
into TechGrid, a campus-wide grid of Windows PCs, Linux, and Unix systems with a single access point. Adding the Rocks systems to this grid was relatively uncomplicated, said David
Chaffin, administrator of the four Rocks clusters.
University of Tromsø, Norway
The "Snowstorm"
cluster at the University of Tromsø's High Performance
Computing Project at the Computer Center is a general-purpose
resource for scientists and students in Norway. It is part
of the computing resources available to academic users from
NOTUR, the national high-performance computing effort in Norway.
"We see a high demand for the clusters performance
in fields like pharmacology and bioinformatics, both as a
parallel computing engine and as a throughput machine running
single-CPU jobs," said system administrator Roy Dragseth.
"NPACI Rocks gave us a turnkey solution that worked out
of the box. It was a real time-saver when we started with
clusters and it has proven its stability and usability in
our production environment with scientists and students running
very demanding tasks. The fact that Rocks is based on standard
Red Hat Linux makes it fit extremely well into our department's
other IT infrastructure."