EnVision, Vol. 19, No. 3

Building a National Grid from the Bottom Up

 

by Francine Berman, Director,
San Diego Supercomputer Center

This special issue of EnVision is devoted to the NPACI Grid, the culmination of more than a year and a half of effort by NPACI partners to coordinate NPACI resources and software into a capable, high-capacity virtual environment for science and engineering applications. Starting from a collection of heterogeneous resources (SDSC’s Blue Horizon, TACC’s Longhorn, the University of Michigan’s Opteron cluster), national commodity networks (Internet2’s Abilene, CalREN-2, and ESnet), and a ready-for-prime-time collection of software (the “NPACKage”), we set out to build a grid from the bottom up. The accomplishment, highlighted in this issue, has provided NPACI partners and collaborators with critical experience with grid technologies fundamental to NSF’s evolving cyberinfrastructure.

What is the NPACI Grid and why is it important?

The NPACI Grid is an early deployment of a production, heterogeneous national grid consisting of interoperable software, scientific applications, and hardware resources located at NPACI partner sites and, potentially, other resource sites. The goals of the project are fourfold:

  • To provide increased access and capability to a national science and engineering community through the coordination of a heterogeneous collection of national resources at partner sites.

  • To unify major NPACI software efforts and other infrastructure efforts into an all-to-all interoperable, tested, documented and supported virtual software layer.

  • To work closely with science communities to develop grid-enabled applications.

  • To provide NPACI partners experience with cyberinfrastructure by developing a production-level, usable national grid from existing resources.

The NPACI Grid is important in two dimensions. First, it provides key experience with heterogeneous collections of resources, representative of the vast majority of grid projects at scale. For example, the hardware resources comprising the NPACI Grid belong to four different administrative domains, each with its own rules, regulations, requirements, and accounting procedures. The resources are heterogeneous and distributed: a 1.7-teraflops AIX cluster at SDSC, two AMD-based Linux clusters at the University of Michigan with 356 processors combined, and three large shared-memory server nodes at TACC delivering 1,160 Gflops. Such locally based resources are common, and grid technologies provide a venue for increasing local capability, capacity, and functionality through aggregation.

Second, the focus of software development for the NPACI Grid has been usability and interoperability, through the development of a package of grid software that has been tested, documented, and help-desk-supported for a broad national user community. As grid technologies become more prevalent, the success of production grids is ultimately dependent on how well they serve user communities. The development and availability of online and help-desk support, usable APIs, tested and documented releases, and the like are critical to making the grid usable by a broad community of users. The NPACI Grid will provide first-hand experience with usability issues critical to the grid’s success.

The NPACI Grid and Cyberinfrastructure

The NPACI Grid project, along with TeraGrid/ETF, emerging data grids, and other grid projects, is providing invaluable experience with building, deploying, using, and evolving the basic components of cyberinfrastructure. As our community seeks to coalesce around NSF’s emerging Cyberinfrastructure Initiative and other national programs, it is critical that users, researchers, developers, scientists, and technologists gain real experience with the complex and interdependent software systems that will underlie cyberinfrastructure’s advanced capabilities. Community projects such as the NPACI Grid, TeraGrid/ETF, PRAGMA, and the Alliance Grid efforts, as well as large-scale, science-driven, grid-enabled projects such as GEON, GriPhyN/iVDGL, BIRN, MCell/Virtual Instrument, NEES, GrADS, NVO, and others, are providing critical hands-on experience that feeds back into the development of grid infrastructure and technologies. These efforts are also enabling the identification of key research problems in computer science, policy, security, and other areas critical to cyberinfrastructure.

How do TeraGrid and NSF’s National Middleware Initiative (NMI) relate to the NPACI Grid?

The NPACI Grid effort complements the NSF Middleware Initiative (NMI) and the TeraGrid/ETF project in scope and scale. NMI broke new ground by providing a venue for integrating key software packages critical for the grid. NPACI Grid software uses NMI and other ready-for-prime-time software as a base and focuses on user issues such as all-to-all interoperability, documentation, testing, usability, and support.

TeraGrid and the NPACI Grid approach grid computing from opposite directions. The initial TeraGrid (DTF) project was conceived as a national production grid comprising high-end clusters of uniform architecture and immense storage, all connected via a high-speed national network. TeraGrid provides a cutting-edge production environment for a broad set of applications, especially applications that can take advantage of immense online storage, distributed cluster computing, and/or IA-64 architectures. Construction of the current TeraGrid software stack has focused on the integration of cutting-edge resources and leverages architectural homogeneity. The initial project was “top-down” in that major decisions about new hardware, networking, and software were made uniformly for all sites.

The NPACI Grid approaches grid development from the “bottom up,” coordinating existing (and heterogeneous) hardware, networks, and software. The two projects are converging as TeraGrid evolves into the Extensible Terascale Facility (ETF), bringing heterogeneous new resources to TeraGrid as well as new partners to the TeraGrid team, and as ETF prepares to merge with PACI resources post-construction. Experience with heterogeneity on the NPACI Grid and other projects will be most valuable for ETF and will help expedite the development of a usable, heterogeneous production grid.

To facilitate integration of the NPACI Grid and TeraGrid/ETF, the NPACI Grid team will work with the TeraGrid team in FY04 to integrate and interoperate the NMI-based NPACKage (the NPACI Grid software) and the TeraGrid software stack. By the end of 2004, we hope that these and other important national grid efforts will be able to interoperate closely, providing real experience with the development of a national “grid of grids”.

Introducing the NPACKage

At the heart of the NPACI Grid is the NPACKage, an interoperable collection of 14 software components developed by NPACI partners and national collaborators. NPACKage is being deployed across major NPACI compute, data and networking resources (as well as other sites, such as the PRAGMA consortium) to form a consistent software environment for the NPACI Grid. NPACKage unifies NPACI and community infrastructure efforts and provides critical experience with usability, interoperability and hardening of grid software.

NPACKage components are based on mature software efforts from NPACI partners and national collaborators and include:

  • The Globus Toolkit, the de facto standard for grid computing and an open-source collection of modular technologies that simplifies collaboration across dynamic, multi-institutional virtual organizations.

  • GSI-OpenSSH, a modified version of OpenSSH that adds support for GSI authentication, providing a single sign-on remote login capability for the grid.

  • Network Weather Service (NWS), a distributed system that periodically monitors and dynamically forecasts the performance various network and computational resources can deliver over a given time interval.

  • DataCutter, a core set of data services on top of which application developers can implement more application-specific services, or which they can combine with existing grid services such as metadata management, resource management, and authentication services.

  • Ganglia, a scalable distributed monitoring system for high-performance computing systems such as clusters and grids.

  • LAPACK for Clusters (LFC), a package that brings the performance of ScaLAPACK and the expertise of advanced performance tuning to the average user familiar with the LAPACK interface.

  • MyProxy, a credential repository for the grid.

  • GridConfig, software for generating and managing configuration files across a large number of components in a centrally controlled information system.

  • Condor-G, software that lets users submit jobs into a queue, maintains detailed job logs, manages input and output files, and serves as a comprehensive job/resource management system.

  • Storage Resource Broker (SRB), client-server middleware providing a uniform interface for connecting heterogeneous data resources over a network and accessing replicated data sets.

  • Grid Portal Toolkit (GridPort), a collection of technologies used for developing application-specific, browser-based interfaces to the grid (to be released in the next version of NPACKage).

  • MPICH-G2, a grid-enabled version of the MPI message-passing library based on Globus. MPICH-G2 allows users to couple machines of different architectures to run MPI applications; a minimal example of such an MPI program appears after this list.

  • APST (AppLeS Parameter Sweep Template), software automating the execution of parameter sweep applications with potentially large data sets over grid resources.

  • Kx509, a standalone client program that acquires a short-term X.509 certificate from the KCA (Kerberized Certificate Authority) for a Kerberos-authenticated user.
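
As a concrete illustration of the kind of application code these components support, the sketch below is a minimal MPI program of the sort MPICH-G2 is designed to run across machines of different architectures. It is a generic example rather than code from the NPACI Grid distribution; on an MPICH-G2 installation such a program would typically be built with the MPICH-supplied mpicc and launched through mpirun, with site-specific details covered in the Getting Started Guide listed at the end of this article.

    /* Minimal MPI program: each process reports its rank, the total
       number of processes, and the host it is running on.  A generic
       illustration only -- not part of the NPACKage itself. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        int rank, size, len;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);                /* start the MPI runtime   */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* rank of this process    */
        MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks   */
        MPI_Get_processor_name(name, &len);    /* host running this rank  */

        printf("Hello from rank %d of %d on %s\n", rank, size, name);

        MPI_Finalize();
        return 0;
    }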

NPACKage builds on, expands, and extends the common infrastructure provided by the GRIDS Center Software Suite as part of NMI. The NPACKage team has assembled a collection of component versions that is regularly tested for interoperability on both internal and production systems (including the various operating systems of the NPACI resources). In addition, user support for NPACKage has been integrated with the NPACI trouble-ticket system to provide quick turnaround for users.

Early Adopters

The NPACI alpha projects (integrated science and technology teams focused on inter-disciplinary infrastructure development) are serving as early adopters for the NPACKage and NPACI Grid. Alpha projects and other application developers are substituting NPACKage software and services for appropriate customized code in their applications and targeting them to one or more resources in the NPACI Grid. This provides experience with the “vertical integration” between applications, software and hardware so important to advancing grid technologies and making the grid truly usable. Over the next year, the experience of developing and executing NPACKage-enabled applications on the NPACI Grid will help us get a better sense of user needs and software viability and help determine how to improve the NPACI Grid and NPACKage software and services.

We Need Your Help

At the heart of any grid is a complex and dynamic software system. We have much to learn about getting the details right and making these systems most usable by the science community. I encourage you to use the NPACI Grid and NPACKage software and to give us your feedback. What is working well, and what needs to be improved? What would you like to see in the environment that would better support your work? Consultants and user services professionals can be reached on the Web at www.npaci.edu (follow the Consult link), by email at consult@npaci.edu, or by calling the toll-free help line at 1-866-336-2357. Suggestions may also be sent to consult@npaci.edu.

It takes a village to build a grid. I’d like to thank the NPACKage and NPACI Grid team, and NPACI partners and collaborators, for their hard work over the last year and a half and for their continuing commitment and leadership. Thanks in advance to NPACI Grid users and application developers for working with us to gain the deep understanding necessary to make grid technologies most usable by and useful to the science, engineering, and education communities.

Useful links:

Information on the NPACI Grid, including a Getting Started Guide, is available at: npacigrid.npaci.edu and npacigrid.npaci.edu/user_getting_started.html. Details on NPACKage are available at: npackage.npaci.edu.