SAM-QFS User Guide: Data Allocations
Introduction
SAM-QFS at SDSC is a high-performance archival storage system for data collections. This system provides a complete data management system consisting of two integrated software products: SAM (Storage Archive Manager) and QFS (a high-performance, 64-bit, Solaris, SAN-based file system). Using SAM-QFS, users with data allocations may access data directly using a disk cache filesystem; the data then automatically migrates to tape.
SAM-QFS is free of charge for users with an allocation; the number of files you may store is limited by your allocation. For information on obtaining a data (storage-only) allocation, see the following sections of the SDSC Data Central site:
- How to Apply for a Data Allocation at SDSC (See Data Collections on Disk)
- Are You Eligible?
For special projects requiring long-term disk cache residency, contact SDSC Consulting.
System Configuration
SAM-QFS has 16 tape drives, a 304-TB disk cache, and the following system configuration:
- Meta Data Servers
- 2 x Sun x4600
- Data Login Servers
- 2 x Sun v40z
- 6 x 1GbE connections to the TeraGrid/HPC network
- Tape Drives
- 16 STK 9940-B tape drives
- 24 IBM J2 3592 tape drives
For a quick comparison of policies and recommended use between HPSS and SAM-QFS, please see Storage at SDSC: Quick Comparison.
It is your responsibility to back up critical data. This storage system is very reliable; however, data can be lost or damaged due to media failures, system software bugs, hardware failures, and user mistakes. Because of the enormous amount of data involved, SDSC maintains only one copy of SAM-QFS data on tape. For dual-copy capabilities or offsite backups, send a request to datacentral-allocations@sdsc.edu.
Logging In to the Data-login Node
Access the host via ssh. The command to log in as user is:
% ssh user@data-login.sdsc.edu
Once you are logged in, you can find your files or data through the UNIX file system. Disk and/or tape allocations are located under /archive/science on the data-login node in a directory that will be named based on your username, project name, or project code. Store all data files in this directory.
Access From SRB
SDSC's Storage Resource Broker (SRB) is a client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets. SRB, in conjunction with the Metadata Catalog (MCAT), provides a way to access data sets and resources based on their attributes and/or logical names rather than their names or physical locations. For more detailed information about SRB, visit the SRB User Guide.
To begin using SRB:
- Obtain an SRB account.
- For any non-SDSC machines, download and install the latest version of SRB.
- From UNIX, check your environment file for the proper settings:
% cat ~/.srb/.MdasEnv
- Your .MdasEnv file should look similar to:
mdasCollectionHome <home_collection_name>
mdasDomainHome <MCAT_user_home_domain_name>
srbUser <MCAT_username>
srbHost archive.sdsc.edu
AUTH_SCHEME <PASSWD_AUTH | ENCRYPT1 | SEA_AUTH | GSI_AUTH>
SERVER_DN <server_user_distinguished_name> (for GSI authentication only)
- Check to see that you have a password:
% cat ~/.srb/.MdasAuth
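The environment check above can also be scripted. The following is a minimal sketch, not part of SRB: check_mdas_env is a hypothetical helper name, and the list of required keys is an assumption taken from the sample .MdasEnv file shown above (SERVER_DN is skipped because it applies to GSI authentication only).

```shell
#!/bin/sh
# Hypothetical helper (not an SRB command): report which expected keys
# are missing from an SRB environment file.
# Assumption: the key list mirrors the sample .MdasEnv in this guide.
check_mdas_env() {
    env_file="${1:-$HOME/.srb/.MdasEnv}"
    missing=""
    if [ ! -r "$env_file" ]; then
        echo "cannot read $env_file" >&2
        return 2
    fi
    for key in mdasCollectionHome mdasDomainHome srbUser srbHost AUTH_SCHEME; do
        # A key counts as present when it starts a line and is followed
        # by whitespace and at least one non-blank character.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]+[^[:space:]]" "$env_file"; then
            missing="$missing $key"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}
```

Called with no argument, check_mdas_env reads ~/.srb/.MdasEnv; pass a path to check a different file. It only checks that each key has some value, not that the value is valid.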
For each SRB session:
- Initiate a session:
% Sinit
- Use Scommands such as Scp, Sget, Sput, etc. to transfer files.
- Transfer using the host name archive.sdsc.edu.
- For SAM-QFS only, you may also use Sstage to stage data.
Transferring files
SCP
For data files up to 2 GB, scp is recommended for transferring files on the data-login node. For example, to copy test_file.tst to your /archive/science/mydir subdirectory as user:
% scp test_file.tst user@data-login.sdsc.edu:/archive/science/mydir
BBFTP
For files greater than 2 GB, bbftp is recommended for transferring files on the data-login node. bbftp is similar to ftp. For example, to copy test_file.tst to your (user) /archive/science/mydir subdirectory, the command is:
% bbftp -s -e 'put test_file.tst /archive/science/mydir/test_file.tst' -u user -p 10 -V data-login.sdsc.edu
- Linux/Unix: download bbftp from http://doc.in2p3.fr/bbftp/download.html. Installation instructions are found at http://doc.in2p3.fr/bbftp/doc.3.2.0.html.
- Windows: To use bbftp from Windows, Cygwin must be installed on your machine. Download the Cygwin compiled client bbftp-client-cygwin-3.2.0.zip.
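The 2 GB guideline for choosing between scp and bbftp can be sketched as a small helper. This is an illustration only: the function names are hypothetical, and treating "2 GB" as 2 * 1024^3 bytes is an assumption, since the guide does not say whether the limit is binary or decimal.

```shell
#!/bin/sh
# Threshold from this guide: scp for files up to 2 GB, bbftp above it.
# Assumption: "2 GB" is interpreted as 2 * 1024^3 bytes.
TWO_GB=2147483648

# Hypothetical helper: print the suggested tool for a size in bytes.
choose_transfer_tool() {
    size_bytes="$1"
    if [ "$size_bytes" -le "$TWO_GB" ]; then
        echo scp
    else
        echo bbftp
    fi
}

# Hypothetical helper: print the suggested tool for a local file.
suggest_for_file() {
    file="$1"
    # wc -c may pad its output with spaces on some systems; strip them.
    size_bytes=$(wc -c < "$file" | tr -d ' ')
    choose_transfer_tool "$size_bytes"
}
```

For example, suggest_for_file test_file.tst prints the name of the tool to use; the transfer itself is still run with the scp or bbftp commands shown above.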
TGCP
For high performance transfers, tgcp is recommended for transferring files on the data-login node. tgcp is similar to scp. For example, to copy test_file.tst to your (user) /archive/science/mydir subdirectory, the command is:
% tgcp test_file.tst archive.sdsc.edu/archive/science/mydir
Transferring data through SRB
Upload/Download with checksum verification
The following command recursively uploads a local directory testdir. The server then computes the checksum by reading back the files just uploaded and verifies them with the checksum values computed locally.
% Sput -Kvr testdir
Download SRB files and verify the checksum value by comparing the registered value with the checksum of the local file just downloaded.
% Sget -kvr testdir testdir1
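The comparison step behind checksum verification can be illustrated locally. The sketch below uses the POSIX cksum utility to compare two local files. It is an assumption-laden illustration of the idea only: same_checksum is a hypothetical helper, and cksum is not necessarily the checksum algorithm that SRB registers or compares.

```shell
#!/bin/sh
# Hypothetical helper: succeed when two local files have the same
# POSIX cksum output (CRC and byte count).
# Assumption: this illustrates the compare step only; it does not
# reproduce the checksum SRB computes for Sput -K or Sget -k.
same_checksum() {
    # Reading from stdin keeps the file name out of the cksum output.
    a=$(cksum < "$1")
    b=$(cksum < "$2")
    [ "$a" = "$b" ]
}
```

A possible use is same_checksum original.dat downloaded.dat && echo "match", where both file names are placeholders for a local original and a copy retrieved from the archive.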

