by Francine Berman, Director,
San Diego Supercomputer Center
This special issue of EnVision is devoted to the NPACI
Grid. The NPACI Grid is the culmination of efforts by
NPACI partners for more than a year and a half to coordinate
NPACI resources and software to provide a capable, high-capacity
virtual environment for science and engineering applications.
Starting from a collection of heterogeneous resources
(SDSC's Blue Horizon, TACC's Longhorn, U. of Michigan's Opteron cluster), national commodity
networks (Internet2's Abilene, CalREN-2 and ESnet),
and a ready-for-prime-time collection of software (the
NPACKage), we set out to build a grid from
the bottom up. The accomplishment, highlighted in this
month's special issue of EnVision, has provided
NPACI partners and collaborators critical experience
with grid technologies fundamental for NSF's evolving cyberinfrastructure.
What is the NPACI Grid and why is it important?
The NPACI Grid is an early deployment of a production,
heterogeneous national grid consisting of interoperable
software, scientific applications, and hardware resources
located at NPACI and potentially other resource sites.
The goals of the project are fourfold:
To provide increased access and capability to a
national science and engineering community through
the coordination of a heterogeneous collection of
national resources at partner sites.
To unify major NPACI software efforts and other
infrastructure efforts into an all-to-all interoperable,
tested, documented and supported virtual software environment.
To work closely with science communities to develop grid-enabled applications.
To provide NPACI partners experience with cyberinfrastructure
by developing a production-level, usable national
grid from existing resources.
The NPACI Grid is important in two dimensions: First,
it provides key experience with heterogeneous collections
of resources, representative of the vast majority of
grid projects at scale. For example, the hardware resources
comprising the NPACI Grid belong to four different administrative
domains, each with its own rules, regulations, requirements,
and accounting procedures. The resources are heterogeneous
and distributed: a 1.7 Teraflop AIX cluster at SDSC,
two AMD-based Linux clusters at the University of Michigan
with 356 combined processors, and three large shared-memory
server nodes at TACC delivering 1160 Gflops. Such locally
based resources are common and grid technologies provide
a venue for increasing local capability, capacity and
functionality through aggregation.
Second, the focus of software development for the NPACI
Grid has been usability and interoperability via the
development of a package of grid software which has
been tested, documented, and help-desk-supported for
a broad national user community. As grid technologies
become more and more prevalent, the success of production
grids will ultimately depend on how well they serve
user communities. The development and availability of
online and help-desk support, usable APIs, tested and
documented releases, and the like are critical to making the
grid usable by a broad community of users. The NPACI
Grid will provide first-hand experience with usability
issues critical to the grid's success.
The NPACI Grid and Cyberinfrastructure
The NPACI Grid project, along with TeraGrid/ETF, emerging
data grids and other grid projects, is providing invaluable
experience with building, deploying, using and evolving
the basic components of cyberinfrastructure. As our
community seeks to coalesce around NSF's emerging
Cyberinfrastructure Initiative and other national programs,
it is critical that users, researchers, developers,
scientists and technologists gain real experience with
the complex and interdependent software systems that
will underlie cyberinfrastructure's advanced capabilities.
Community projects such as the NPACI Grid, TeraGrid/ETF,
PRAGMA, Alliance Grid efforts, etc., as well as large-scale
science-driven, grid-enabled projects such as GEON,
GriPhyN/iVDGL, BIRN, MCell/Virtual Instrument, NEES,
GrADS, NVO and others are providing critical hands-on
experience which feeds back into the development of
grid infrastructure and technologies. These efforts
are also enabling the identification of key research
problems in computer science, policy, security and other
areas critical to cyberinfrastructure.
How do TeraGrid and the NSF Middleware Initiative (NMI) relate to the NPACI Grid?
The NPACI Grid effort complements the National Science
Foundation's Middleware Initiative (NMI) and the
TeraGrid/ETF project in scope and scale. NMI broke
new ground by providing a venue for integrating key
software packages critical for the grid. NPACI Grid
software uses NMI and other ready-for-prime-time software
as a base and focuses on user issues such as all-to-all
interoperability, documentation, testing, and usability.
TeraGrid and the NPACI Grid approach grid computing
from opposite directions. The initial TeraGrid (DTF)
project was conceived as a national production grid
comprising high-end clusters of uniform architecture
and immense storage, all connected via a high-speed
national network. TeraGrid provides a cutting-edge production
environment for a broad set of applications, and most
especially applications which can take advantage of
immense on-line storage, distributed cluster computing,
and/or IA-64 architectures. Construction of the current
TeraGrid software stack has focused on the integration
of cutting-edge resources and leverages architectural
homogeneity. The initial project was top-down
in that major decisions about new hardware, networking,
and software were made uniformly for all sites.
The NPACI Grid approaches grid development from the bottom
up: it coordinates existing (and
heterogeneous) hardware, networks, and software. Both
projects are converging as TeraGrid evolves into the
Extensible Terascale Facility (ETF), bringing heterogeneous
new resources to TeraGrid as well as new partners to
the TeraGrid team, and as ETF prepares to merge with
PACI resources post-construction. Experience with heterogeneity
on the NPACI Grid and other projects will be most valuable
for ETF, and will help expedite the development of a
usable, heterogeneous production grid.
To facilitate integration of the NPACI Grid and TeraGrid/ETF,
the NPACI Grid team will work with the TeraGrid team
in FY04 to integrate and interoperate the NMI-based
NPACKage (NPACI Grid software) and the TeraGrid software
stack. By the end of 2004, we hope that these
important national grid efforts and others will be able
to closely interoperate, providing real experience with
the development of a national grid of grids.
Introducing the NPACKage
At the heart of the NPACI Grid is the NPACKage, an
interoperable collection of 14 software components developed
by NPACI partners and national collaborators. NPACKage
is being deployed across major NPACI compute, data and
networking resources (as well as other sites, such as
the PRAGMA consortium) to form a consistent software
environment for the NPACI Grid. NPACKage unifies NPACI
and community infrastructure efforts and provides critical
experience with usability, interoperability and hardening
of grid software.
NPACKage components are based on mature software efforts
from NPACI partners and national collaborators and include:
The Globus Toolkit, the de facto standard for grid
computing and an open-source collection of modular
technologies that simplifies collaboration across
dynamic, multi-institutional virtual organizations.
GSI-OpenSSH, a modified version of OpenSSH that
adds support for GSI authentication, providing a
single sign-on remote login capability for the grid.
Network Weather Service (NWS), a distributed system
that periodically monitors and dynamically forecasts
the performance various network and computational
resources can deliver over a given time interval.
DataCutter, a core set of data services on top
of which application developers can implement more
application-specific services, or which can be combined
with existing grid services such as metadata management,
resource management, and authentication services.
Ganglia, a scalable distributed monitoring system
for high-performance computing systems such as clusters and grids.
LAPACK for Clusters (LFC), a package that brings
the performance of ScaLAPACK and the expertise of
advanced performance tuning to an average user familiar
with the LAPACK interface (the style of interface is
illustrated in the first sketch following this list).
MyProxy, a credential repository for the grid.
GridConfig, software for generating and managing
configuration files across a large number of components
in a centrally controlled information system.
Condor-G, software that lets users submit jobs
into a queue, maintain detailed job logs, and manage
input and output files, serving as a comprehensive
job/resource management system.
Storage Resource Broker (SRB), client-server middleware
providing a uniform interface for connecting to heterogeneous
data resources over a network and accessing replicated data sets.
Grid Portal Toolkit (GridPort), a collection of
technologies used for developing application-specific,
browser-based interfaces to the grid (to be released
in the next version of NPACKage).
MPICH-G2, a grid-enabled implementation of the MPI
message-passing library based on Globus. MPICH-G2 allows
users to couple machines of different architectures to
run MPI applications (see the second sketch following this list).
APST (AppLeS Parameter Sweep Template), software
automating the execution of parameter sweep applications
with potentially large data sets over grid resources.
Kx509, a standalone client program that acquires
a short-term X.509 certificate from a Kerberized
Certificate Authority (KCA) for a Kerberos-authenticated user.
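As a minimal sketch of what "the LAPACK interface" looks like in practice, the C fragment below solves a small linear system by calling the standard LAPACK routine dgesv. It illustrates only the style of call LFC users are assumed to already know; it is not LFC's own API, and the example matrix, the Fortran-style column-major storage, and the dgesv_ linkage convention are assumptions of this sketch rather than anything prescribed by NPACKage.

/* Sketch: solve A x = b with the standard LAPACK routine dgesv from C.
 * Illustrates the "LAPACK interface" style of programming; not LFC's own API.
 * Assumes a LAPACK library is linked (e.g. -llapack) and Fortran conventions:
 * column-major storage, all arguments passed by reference. */
#include <stdio.h>

/* Prototype for the Fortran LAPACK routine. */
extern void dgesv_(int *n, int *nrhs, double *a, int *lda,
                   int *ipiv, double *b, int *ldb, int *info);

int main(void)
{
    /* 3x3 matrix A stored column by column. */
    double a[9] = { 4.0, 2.0, 1.0,    /* column 1 */
                    2.0, 5.0, 3.0,    /* column 2 */
                    1.0, 3.0, 6.0 };  /* column 3 */
    double b[3] = { 1.0, 2.0, 3.0 };  /* right-hand side, overwritten with x */
    int n = 3, nrhs = 1, lda = 3, ldb = 3, info;
    int ipiv[3];                      /* pivot indices from the LU factorization */

    dgesv_(&n, &nrhs, a, &lda, ipiv, b, &ldb, &info);

    if (info == 0)
        printf("x = (%f, %f, %f)\n", b[0], b[1], b[2]);
    else
        printf("dgesv failed, info = %d\n", info);
    return 0;
}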
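Similarly, as a minimal sketch of the kind of MPI program MPICH-G2 can launch across coupled machines, the C fragment below has each process report its rank and the host it runs on. It uses only standard MPI calls; the build and launch mechanics (with MPICH-G2, compiling with its mpicc and submitting through Globus) are omitted here, and nothing in the program is specific to the NPACI Grid or NPACKage.

/* Sketch: each MPI process reports its rank and the host it runs on.
 * Uses only standard MPI calls; with MPICH-G2 the processes behind these
 * ranks may live on different machines and architectures, with the library
 * handling the wide-area communication between them. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);                 /* start the MPI runtime   */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of ranks   */
    MPI_Get_processor_name(host, &len);     /* name of the local host  */

    printf("rank %d of %d running on %s\n", rank, size, host);

    MPI_Finalize();                         /* shut down cleanly       */
    return 0;
}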
NPACKage builds on, expands, and extends the common infrastructure
developed by the GRIDS Center Software Suite as part
of NMI. The NPACKage team has assembled a collection
of versions that is regularly tested for interoperability
on both internal and production systems (including the
various operating systems of NPACI resources). In addition,
user support for NPACKage has been integrated with the
NPACI trouble-ticket system to provide quick turnaround on user problems.
The NPACI alpha projects (integrated science and technology
teams focused on inter-disciplinary infrastructure development)
are serving as early adopters for the NPACKage and NPACI
Grid. Alpha projects and other application developers
are substituting NPACKage software and services for
appropriate customized code in their applications and
targeting them to one or more resources in the NPACI
Grid. This provides experience with the "vertical
integration" of applications, software, and
hardware that is so important to advancing grid technologies
and making the grid truly usable. Over the next year,
the experience of developing and executing NPACKage-enabled
applications on the NPACI Grid will help us get a better
sense of user needs and software viability and help
determine how to improve the NPACI Grid and NPACKage
software and services.
We Need Your Help
At the heart of any grid is a complex and dynamic software
system. We have much to learn about getting the details
right and making these systems most usable by the science
community. I encourage you to use the NPACI Grid and
NPACKage software and to give us your feedback. What
is working well and what needs to be improved? What
would you like to see in the environment that will better
support your work? Consultants and user services professionals
can be reached via the Web at www.npaci.edu (Consult),
by email at firstname.lastname@example.org, or by calling the toll-free
help line at 1-866-336-2357. Suggestions should also
be sent to email@example.com.
It takes a village to build a grid. I'd like
to thank the NPACKage and NPACI Grid team, and NPACI
partners and collaborators for their hard work over
the last year and a half, and continuing commitment
and leadership. Thanks in advance to NPACI Grid users
and application developers for working with us to gain
the deep understanding necessary to make grid technologies
most usable by and useful to the science, engineering
and education communities.
Information on the NPACI Grid, including a Getting
Started Guide, is available at: npacigrid.npaci.edu
and npacigrid.npaci.edu/user_getting_started.html. Details
on NPACKage are available at: npackage.npaci.edu.