SRB & SrbRack
Components of a Virtual Data Grid Architecture
Arcot Rajasekar, Michael Wan
San Diego Supercomputer Center
University of California, San Diego
Keywords: data grids, resource broker, distributed data management, virtualization, metadata.
The “Grid” is a term used to describe a variety of notions for linking multiple computational resources such as people, computers, sensors and data [Foster & Kesselman 1999]. The term “Data Grid” has come to denote a network of storage resources, from archival systems, to caches, to databases, that are linked across a distributed network using common interfaces. Examples of data grid implementations can be found in the physics community [PPDG 1999, Hoschek 2000, GriPhyN 2000], biomedical applications [BIRN 2001] and ecological sciences [KNB 1999]. More recently, several projects have been started to establish data grids for other communities such as astronomy [NVO 2001] and earthquake and multi-sensor systems [NEES 2000, RoadNet 2001]. Most of these data grids are under construction and are still evaluating prototype systems for achieving their goals.
All of the above communities are faced with the challenge of dealing with an explosion of data – an explosion that will grow data collections from the 100 Terabyte to the 100 Petabyte scale in size and to billions of data objects in numbers, over the next decade. Moreover, the research community is faced with the problems of disseminating these large collections of data to users and applications, providing a collaboratory environment for analyzing and performing data-intensive computing, and at a basic level with problems of managing, curating, storing and moving such large quantities of data. The evolution of the data grid provides solutions to these problems by developing middleware that can seamlessly integrate multiple data resources and provide a uniform method of accessing data across a virtual network space.
Many of the data grid applications currently under development and deployment use the SDSC Storage Resource Broker [SRB 2001] as middleware for integrating data repositories and for providing seamless access to distributed data in multiple storage systems. The SDSC Storage Resource Broker (SRB) is client-server middleware that provides replicated collection management across multiple storage systems [SRB 2001, Moore & Rajasekar 2001]. The SRB provides a means to organize information from multiple heterogeneous systems into logical collections for ease of use. The SRB, in conjunction with the Metadata Catalog [MCAT 2000], supports location transparency by accessing data sets and resources based on their attributes rather than their names or physical locations [Baru et al. 1998]. The SRB provides access to archival resources such as HPSS, UniTree and ADSM, to file systems such as the Unix File System, NT File System and Mac OSX File System, and to databases such as Oracle, DB2, and Sybase. The SRB provides a logical representation for storage systems, digital file objects, and collections, and provides several features for use in digital libraries, persistent archive systems or collection management systems. The SRB also provides capabilities for storing replicas of data, authenticating users, controlling access to documents and collections, and auditing accesses. The SRB can also store user-defined metadata at the collection and object level and provides search capabilities based on these metadata.
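The attribute-based lookup described above can be illustrated with a small sketch. It is a toy in-memory catalog and does not reflect the MCAT schema or any SRB API; the class names (`ToyCatalog`, `CatalogEntry`, `Replica`), resource names and paths are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    resource: str       # hypothetical storage resource name
    physical_path: str  # where the bytes live on that resource


@dataclass
class CatalogEntry:
    logical_name: str   # name inside the logical collection hierarchy
    attributes: dict    # user-defined metadata for the object
    replicas: list = field(default_factory=list)


class ToyCatalog:
    """In-memory stand-in for a metadata catalog (not the MCAT schema)."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.logical_name] = entry

    def query(self, **conditions):
        """Return entries whose attributes match every key/value condition."""
        return [
            e for e in self._entries.values()
            if all(e.attributes.get(k) == v for k, v in conditions.items())
        ]


catalog = ToyCatalog()
catalog.register(CatalogEntry(
    "/home/demo/sky/tile-001.fits",
    {"survey": "2MASS", "band": "J"},
    [Replica("unix-cache-a", "/data/a/0001"),
     Replica("hpss-archive", "/arch/77/0001")],
))
catalog.register(CatalogEntry(
    "/home/demo/sky/tile-002.fits",
    {"survey": "2MASS", "band": "K"},
    [Replica("unix-cache-b", "/data/b/0002")],
))

# The caller names attributes, never a host or a physical path.
matches = catalog.query(survey="2MASS", band="J")
for entry in matches:
    print(entry.logical_name, [r.resource for r in entry.replicas])
```

The point of the sketch is that the query carries only attributes; which resource eventually serves the bytes is resolved from the catalog entry, so replicas can be moved without changing client code.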
SRB is a federated server system, with each SRB managing/brokering a set of storage resources. It is also possible to configure the system such that a storage resource is managed by more than one SRB server (possibly co-located). This provides support for fault-tolerance in case one of the controlling SRB servers fails. The federated SRB implementation provides unique advantages:
1. Location transparency - users can connect to any SRB server to access data from any other SRB server.
2. Improved reliability and availability - data may be replicated in different storage systems on different hosts under the control of different SRB servers [Baru et al. 1998a].
3. Logistical and administrative reasons - different storage systems may need to run on different hosts under different security protocols, making a distributed environment a necessity.
4. Load balancing and improved access - a single data resource may become a bottleneck; distributing and replicating the datasets allows for load balancing.
5. Data placement, caching and archiving - the SRB provides a means for moving and copying data for optimal placement in distributed caches, as well as providing seamless access to archival copies.
6. Uniform user name space and data name space - the SRB provides single sign-on authentication for users as well as single-point third-party authorization (access control) for data, collections, resources and methods.
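The first two advantages in the list above (connecting to any server, and surviving the loss of one server that holds a replica) can be sketched as follows. This is a minimal illustration under stated assumptions: `ToyServer` and `federate` are hypothetical names, the peer lookup is a simple linear scan, and nothing here models the actual SRB wire protocol or its use of the MCAT to locate replicas.

```python
class ToyServer:
    """Stand-in for one broker in a federation (not the SRB protocol)."""

    def __init__(self, name, objects, online=True):
        self.name = name
        self.objects = dict(objects)  # logical name -> bytes held locally
        self.online = online
        self.peers = []

    def read(self, logical_name):
        """Serve locally if possible, otherwise forward to an online peer."""
        if not self.online:
            raise ConnectionError(f"{self.name} is offline")
        if logical_name in self.objects:
            return self.name, self.objects[logical_name]
        for peer in self.peers:
            if peer.online and logical_name in peer.objects:
                return peer.name, peer.objects[logical_name]
        raise FileNotFoundError(logical_name)


def federate(*servers):
    """Make every server aware of all the others."""
    for s in servers:
        s.peers = [p for p in servers if p is not s]


# The object is replicated on srb-b and srb-c; srb-b happens to be down.
a = ToyServer("srb-a", {})
b = ToyServer("srb-b", {"/demo/scan.img": b"voxels"}, online=False)
c = ToyServer("srb-c", {"/demo/scan.img": b"voxels"})
federate(a, b, c)

# The client talks only to srb-a and does not need to know where the data is.
served_by, data = a.read("/demo/scan.img")
print(served_by, data)
```

In this toy setup the request entered through a server that holds no copy and was satisfied by the surviving replica, which is the behaviour that replication under different servers is meant to provide.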
The SRB has been implemented on multiple platforms including IBM AIX, Sun, SGI, Linux, Cray T3E and C90, Windows NT, 2000, Me, Mac OSX, etc. The SRB has been used in several efforts to develop infrastructure for Grid technologies, including the Biomedical Information Research Network (NIH), Particle Physics Data Grid (NSF/DOE), Information Power Grid (NASA) and GriPhyN (NSF). The SRB has also been used for handling large-scale data collections, including the Digital Sky Survey Collection for 2MASS data (10 TB in 5 million files), NPACI datasets, the Digital Embryo collection (20 TB leading up to 500 TB) and LTER hyperspectral datasets.
Even though the SRB system of servers can be installed and run on existing host machines and resources, it may be advantageous to deploy a system expressly dedicated to storing and serving data through the SRB. We call such a system an SRB Virtual Data Grid (SVDG). The SVDG is made of physical components called SrbRacks, which are dedicated host machines on the Internet brokering multiple data-media resources, from file systems to archival systems to databases to web sites. The SrbRacks are logically interconnected into a single SrbSpace using a common MCAT metadata catalog. The SrbRacks will be located at geographically distributed sites that form a partnership in disseminating information.
The requirements of the SVDG can be enumerated as follows:
· Seamless access to data and information stored across the partner sites of the SVDG. The data from various collections at participating sites will be stored in local data caches. Researchers at a remote site should be able to access those data transparently, as though they were accessing a local dataset. In this manner, one can run visualization applications that access data from multiple sites as though the data came from the local site.
· Virtual organization structure for data and information based on a digital library framework. Even though data will be stored at multiple sites, it would help users if the directory structures of the data were organized according to some logical (context-dependent) structure with an easy navigational aid. Hence, the SVDG has to provide a means to group data into collections (actually hierarchies of collections) and provide management facilities for them.
· Handle Dataset Scaling in size and number. The sizes and numbers of datasets involved in any SVDG will grow in the coming years. Hence any solution for the virtual data grid should be scalable. The SRB has been shown to handle millions of datasets totaling several terabytes as well as large files of tens of gigabytes in size. The SRB also provides the facility for aggregating data into physical blocks called containers and using these blocks to facilitate data movement and caching.
· Integrate Data Collections and Associated Metadata. As part of any SVDG, we assume that more than one collection of data will be assembled. These collections need to be integrated using collection-level metadata in a collaborative fashion.
· Replication of Data: For reasons of fault tolerance, disaster recovery and load balancing it will be useful for data to be replicated into distributed resources. Moreover, the consistency of the replicas should be maintained with very little effort on the part of the users.
· Handle Multiplicity of Platforms, Resource & Data Types. The SVDG network should handle diverse computational and storage resources. In the network, one should be able to use a supercomputer such as the ‘Blue Horizon’, a desktop system (say an SGI) or a laptop running Linux or Windows. From any of these platforms the user should be able to access the data in the grid with ease using very similar access methods and APIs. Moreover, the data in the SVDG will be stored on different types of resources, possibly in archival storage systems such as HPSS, file systems (Unix or NTFS) and databases (Oracle, Sybase, DB2, etc.). The SVDG should be able to provide seamless access to data located in heterogeneous resources using very similar access methods and APIs.
· Handle Seamless Authentication. Since a user can be expected to access data from any of the sites, it would be difficult if the user needed to obtain a login or permission at every site and access to every resource at each site. The SVDG should be able to give the user seamless access to all the sites with single sign-on authentication.
· Handle Access Control and provide Auditing Facilities. In some communities, data need to be guarded so that access is given only to selected and relevant people. Moreover, this selection should be made by the owner of the data. The SVDG should be able to control access at multiple levels (collections, datasets, resources, etc.) for users and user groups, beyond what file systems offer. Moreover, in some cases, it may be necessary to audit usage of the collections or datasets. Hence, auditing facilities will be needed as part of the framework.
· Version Control: Since datasets may evolve over time, providing distributed version control will help in collaborative data sharing. The SRB provides facilities for locking and version control.
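The access control and auditing requirement in the list above can be made concrete with a small sketch. It assumes a simple inheritance rule that is our own choice for illustration (the nearest explicit grant on the path from the object up to the root decides, and the strongest grant among the user and their groups wins at that level); the source does not specify how the SRB resolves permissions, and all names below are hypothetical.

```python
import posixpath

# Ordered permission levels; a higher number implies the lower ones.
LEVELS = {"none": 0, "read": 1, "write": 2, "own": 3}


class ToyAccessControl:
    """Collection-level grants with inheritance, plus an audit trail."""

    def __init__(self):
        self._grants = {}    # (path, principal) -> level name
        self.audit_log = []  # (user, path, wanted, allowed)

    def grant(self, path, principal, level):
        self._grants[(path, principal)] = level

    def _effective(self, path, principals):
        # Walk from the object up to the root; the nearest level of the
        # hierarchy that has any explicit grant decides the outcome.
        current = path
        while True:
            found = [self._grants[(current, p)]
                     for p in principals if (current, p) in self._grants]
            if found:
                return max(found, key=LEVELS.__getitem__)
            if current in ("/", ""):
                return "none"
            current = posixpath.dirname(current)

    def check(self, user, groups, path, wanted):
        level = self._effective(path, [user, *groups])
        allowed = LEVELS[level] >= LEVELS[wanted]
        self.audit_log.append((user, path, wanted, allowed))  # every attempt
        return allowed


acl = ToyAccessControl()
acl.grant("/birn", "neuro-group", "read")           # group reads the collection
acl.grant("/birn/private", "neuro-group", "none")   # except this sub-collection
acl.grant("/birn/private", "alice", "own")          # which alice owns

group_read = acl.check("bob", ["neuro-group"], "/birn/scans/s1.img", "read")
private_read = acl.check("bob", ["neuro-group"], "/birn/private/notes.txt", "read")
owner_write = acl.check("alice", ["neuro-group"], "/birn/private/notes.txt", "write")
print(group_read, private_read, owner_write)
```

Recording denied attempts as well as successful ones keeps the audit trail useful for the data owner, who is the party making the access decisions in the requirement above.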
An SVDG is made by logically interconnecting nodes of physical data systems. We propose an architecture using SrbRacks, which are dedicated systems built for storing and brokering SRB data and information. As mentioned earlier, the SrbRacks will be located at geographically distributed sites and form a partnership.
For each SRB server system there will be two kinds of racks:
1. McatRack, which implements the MCAT for the SrbSpace, will be installed at one site (possibly replicated at a few more sites), and
2. SrbRack, which implements SRB servers and brokers storage, will be installed at many sites.
The McatRack will be a dual-processor system running the Linux operating system with 2 GB of memory. A database system (such as Oracle, DB2 or Sybase) will be installed on this system to hold the MCAT metadata. The system will have a minimum of 256 GB of RAID 5 storage for running the MCAT. This system will not be used for brokering any storage. The system will also have a single DLT tape drive for backing up the MCAT database. If necessary, for fail-safe purposes, a complementary system of similar description will be run alongside to provide a tandem database installation. Other options may also be available for increasing the number of CPUs and running parallel versions of the databases. The system will have hardware to provide sufficient bandwidth out of the rack to the network, as well as a UPS.
The McatRack might be set up at a few other sites for load balancing and fault tolerance. In such a case, one of the McatRacks will function as the primary site and the others will be used for read-only purposes in slave mode. The replication mechanisms of the vendor databases will be used to perform the replication. We do not plan to run these replicated databases in concurrent mode because of performance considerations. Instead, the plan is to run them in a delayed-update mode, with the slave databases synchronizing at regular intervals.
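The delayed-update mode described above can be sketched in a few lines. This is only an illustration of the idea (writes go to the primary, read-only copies catch up when a sync runs); an actual deployment would rely on the vendor database's own replication, as stated above, and the class names here are hypothetical.

```python
class PrimaryCatalog:
    """Accepts all writes and keeps an ordered log of updates."""

    def __init__(self):
        self.rows = {}
        self.log = []  # ordered (key, value) updates

    def write(self, key, value):
        self.rows[key] = value
        self.log.append((key, value))


class ReadOnlyReplica:
    """Serves reads only; may lag the primary until the next sync."""

    def __init__(self):
        self.rows = {}
        self.applied = 0  # number of log entries already applied

    def read(self, key):
        return self.rows.get(key)

    def sync(self, primary):
        """Apply the updates logged since the last sync; return how many."""
        pending = primary.log[self.applied:]
        for key, value in pending:
            self.rows[key] = value
        self.applied += len(pending)
        return len(pending)


primary = PrimaryCatalog()
replica = ReadOnlyReplica()

primary.write("/demo/scan.img", "resource=unix-cache-a")
stale = replica.read("/demo/scan.img")   # not yet visible on the replica
applied = replica.sync(primary)          # the periodic synchronization step
fresh = replica.read("/demo/scan.img")
print(stale, applied, fresh)
```

The trade-off is visible in the `stale` read: readers of a replica may briefly see older metadata, in exchange for not paying the cost of coordinating every write across sites.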
The SrbRack will be a single-processor system running the Linux operating system with 1 GB of memory. The system will also have at least 1 TB of storage space as local file storage (this storage should have very high bandwidth). Apart from this, the system may have other network-attached storage (NAS) systems with several terabytes of storage on high-speed disk. The SrbRack may also broker other types of storage systems, including databases. We do not envisage brokering an archival tape system through the SrbRack even though that option is available. Instead, we think that using only spinning-disk storage will provide high availability and fast response times for the users. Since the cost of such systems is falling steadily, it is an economically viable option. Moreover, since data will be replicated at more than one site, archival support may not be required. The number of CPUs may be increased as required if significant server-side computation is planned.
To make the SrbRack or the McatRack self-sufficient, one may include network routers and network monitors as part of the rack. One such configuration of the SrbRack, used by the BIRN project [BIRN 2001], is shown in Figure 1.
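The minimum rack configurations given above can be written down as data, which makes it easy to check a proposed machine against them. The sketch below is an assumption-laden illustration: the `RackSpec` type and `meets_minimum` helper are hypothetical, and 1 TB is taken as 1000 GB for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RackSpec:
    role: str              # "mcat" or "srb"
    cpus: int
    memory_gb: int
    disk_gb: int           # RAID 5 for the McatRack, local file storage for the SrbRack
    brokers_storage: bool  # the McatRack does not broker any storage


# Minimums taken from the text; 1 TB is treated as 1000 GB (an assumption).
MCAT_RACK_MINIMUM = RackSpec("mcat", cpus=2, memory_gb=2, disk_gb=256,
                             brokers_storage=False)
SRB_RACK_MINIMUM = RackSpec("srb", cpus=1, memory_gb=1, disk_gb=1000,
                            brokers_storage=True)


def meets_minimum(candidate, minimum):
    """True if the candidate has the same role and at least the minimum hardware."""
    return (
        candidate.role == minimum.role
        and candidate.brokers_storage == minimum.brokers_storage
        and candidate.cpus >= minimum.cpus
        and candidate.memory_gb >= minimum.memory_gb
        and candidate.disk_gb >= minimum.disk_gb
    )


proposed = RackSpec("srb", cpus=2, memory_gb=4, disk_gb=4000, brokers_storage=True)
undersized = RackSpec("mcat", cpus=2, memory_gb=2, disk_gb=128, brokers_storage=False)
print(meets_minimum(proposed, SRB_RACK_MINIMUM),
      meets_minimum(undersized, MCAT_RACK_MINIMUM))
```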
Deploy: The deployment of the SVDG will proceed through the concurrent deployment of the McatRack at the SRB Coordinating Center (CC) site and SrbRacks at each Participating Site (PS). At the CC site, the central catalog of the SRB, the MCAT, will be deployed on an Oracle database system under the management of a database administrator (DBA). The SRB Administration facility will also be located at the CC. It will be used for registering physical resources, creating logical resources and maintaining their access control, and creating new users, logical user domains, and user groups with the requisite authentication. The CC will also have monitoring tools for checking the health of the various components of the SVDG. The SRB Administration facility at the CC will continuously check whether remote resources are running out of space and whether any of the resource servers have gone offline. Moreover, it will coordinate preventive maintenance schedules to make sure that critical resources do not go offline at the same time. During the deployment, the database engineers at each Participating Site will be trained to perform essential resource administration and tuning at the PS.
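The two health checks named above (resources running out of space, resource servers going offline) could be expressed as in the following sketch. How the status reports are gathered from the Participating Sites is outside its scope; the `ResourceStatus` record, the `find_alerts` function and the 10% free-space threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ResourceStatus:
    name: str
    online: bool
    capacity_gb: float  # assumed to be greater than zero
    used_gb: float


def find_alerts(statuses, min_free_fraction=0.10):
    """Return (resource name, problem) pairs for offline or nearly full resources."""
    alerts = []
    for s in statuses:
        if not s.online:
            alerts.append((s.name, "offline"))
            continue  # space figures from an offline server are not trusted
        free_fraction = (s.capacity_gb - s.used_gb) / s.capacity_gb
        if free_fraction < min_free_fraction:
            alerts.append((s.name, "low space"))
    return alerts


# Hypothetical status reports collected from three Participating Sites.
reports = [
    ResourceStatus("ps1-disk", online=True, capacity_gb=1000, used_gb=400),
    ResourceStatus("ps2-disk", online=True, capacity_gb=1000, used_gb=950),
    ResourceStatus("ps3-disk", online=False, capacity_gb=1000, used_gb=100),
]
alerts = find_alerts(reports)
print(alerts)
```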
Consult: Once the SVDG has been partially deployed, the task of making users and resource providers aware of its capabilities becomes important. The SRB is feature-rich and has multiple interfaces, so users need training to use it optimally. The consulting group will help users get registered and set up their SRB environments at the Participating Sites. The group will help users organize their collections and usage groups, define logical resources for their use and set up containers for storage. The consulting team will also teach and answer questions about the use of the SRB APIs, GUIs, command line interfaces and web access methods. The group will also be instrumental in teaching researchers about data ingestion into the SRB and teaching users about optimal access patterns from the SRB. The group will conduct tutorials, take online and telephone questions and make users aware of new developments and bug fixes. New users will also need help in setting up their web-based access and scripting, and the consulting group will help in this regard. The group will also consult on resource allocation strategies, data placement strategies, cache, disk and archival system issues, data migration strategies, etc., to enhance application performance with the SRB. It will also help with collection creation, curation, and management issues.
Customize & Integrate: Once the SVDG has been set up and users are aware of its usage, the next step will be to customize legacy methods and develop new programs that use the SVDG to access distributed data. The SRB has APIs for several languages (C, C++, Java, and Perl through C) and libraries for access through the web. The SVDG group will be able to help users write new code and modify existing code (if required) to access data. Customization will also be required to access diverse data collections when multiple collections are spread across sites and across the three projects in the initial user environment. These customizations will be performed in close cooperation with users and administrators at multiple sites and will increase the opportunity for collaboration. As a first step, we plan to customize one important application from each of the three projects and to build further customization on that expertise. The aim will be to involve the scientists and their teams more closely so that they become self-reliant in later customization and new development work. Integration of metadata across sites will also be undertaken for integrating data stored in relational databases.
This research has been sponsored by the National Archives and Records Administration under Award No. ACI-9619020, by the National Institutes of Health under Award No. NIH-8P41RR08605 and by the National Science Foundation under Award No. DEB-99-80154.
Foster, I., and Kesselman, C., (1999) “The Grid: Blueprint for a New Computing Infrastructure,” Morgan Kaufmann.
PPDG, (1999) “The Particle Physics Data Grid”.
Hoschek, W., Jaen-Martinez, J., Samar, A., Stockinger, H., and Stockinger, K., (2000) “Data Management in an International Data Grid Project,” IEEE/ACM International Workshop on Grid Computing (Grid'2000), Bangalore, India, 17-20 December 2000.
GriPhyN, (2000) “The Grid Physics Network”.
NEES, (2000) “Network for Earthquake Engineering Simulation”.
KNB, (1999) “The Knowledge Network for Biocomplexity”.
NVO, (2001) “National Virtual Observatory”.
ASCI, (1999) “Accelerated Strategic Computing Initiative”, A DOE Project.
IPG, (2000) “Information Power Grid”, A NASA Project.
BIRN, (2001) “Biomedical Information Research Network”.
RoadNet, (2001) “Wireless Access and Real-time Data Management”.
Moore, R., and A. Rajasekar, (2001) “Data and Metadata Collections for Scientific Applications”, High Performance Computing and Networking (HPCN 2001), Amsterdam, NL, June 2001.
Baru, C., R. Moore, A. Rajasekar, M. Wan, (1998) “The SDSC Storage Resource Broker,” Proc. CASCON'98 Conference, Nov. 30-Dec. 3, 1998, Toronto, Canada.
Baru, C., R. Moore, A. Rajasekar, W. Schroeder, M. Wan, R. Klobuchar, D. Wade, R. Sharpe, J. Terstriep, (1998a) “A Data Handling Architecture for a Prototype Federal Application,” Sixth Goddard Conference on Mass Storage Systems and Technologies, March 1998.
SRB, (2001) “The Storage Resource Broker Web Page”.
MCAT, (2000) “The Metadata Catalog”.
Figure 1: BIRN SrbRack with Routers and UPS