Sget

NAME

Sget - exports one or more objects from SRB space into the local file system

SYNOPSIS

Sget [-n n] [-N numThreads] [-pbfrvsmMVX] [-T ticketFile | -t ticket] [-A condition] [-W versionString] [-R retry_count] [-x restartFile] [-k] srbObj|Collection ... [localFile|localDirectory]

Sget [-aChl] [-c condition] localFile | localDirectory

DESCRIPTION

Sget reads one or more Objects and|or Collections from SRB space and writes them to the user's local file system. If an srbObj is replicated, only one of the copies is read. With the -A option, only those srbObj that satisfy the condition are chosen.

The second synopsis applies the condition and copies the resultant list of objects.

The srbObj and|or Collection argument can be a path name in the SRB collection hierarchy and can also contain patterns with '*' and '?' symbols as wildcards. The user should have at least 'read' permission for each item being copied.

If the source object is a single dataset, it is copied to the localFile. If the source consists of multiple objects (datasets and collections), the target should be a directory in the local file system. '.' can be used to denote the current working directory. If at least one of the source objects is a Collection, the -r option must be used.
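The source and target rules above can be sketched as a small pre-flight check. This is an illustration only and not part of Sget: the Source class and check_sget_request function are hypothetical names, the caller is assumed to already know which sources are Collections, and the source list is assumed to be the result after any wildcard expansion.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """Hypothetical description of one Sget source argument."""
    path: str
    is_collection: bool


def check_sget_request(sources, target_is_dir, recursive):
    """Return a list of problems with the request; an empty list means none found."""
    problems = []
    if not sources:
        problems.append("at least one srbObj or Collection is required")
    # A Collection among the sources requires the -r option.
    if any(s.is_collection for s in sources) and not recursive:
        problems.append("a Collection is among the sources, so -r must be used")
    # Several sources cannot be written to a single local file.
    if len(sources) > 1 and not target_is_dir:
        problems.append("multiple sources need a local directory as the target")
    return problems
```

A single dataset copied to a local file yields no problems, while a Collection plus a dataset copied to a plain file without -r yields both messages.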

The bulk unload option (-b) can be used to greatly improve the efficiency of recursively downloading a large number of small files stored in a collection. The source files can be regular SRB files and/or files stored in containers. The -r option is implied and does not need to be specified if the -b option is used.

By default, Sget uses the serial I/O API to do the download. The -m option, or setting the environment variable "srbParallel" to any value, sets the transfer mode to the "server initiated connection" parallel I/O mode. In this mode, the server is the active partner and the client listens passively on the control socket for instructions from the servers. Upon receiving the client's initial request, the server, with the help of information from the MCAT, plans the execution of the data download. Typically it sends data transfer instructions to the server where the export resource is located. The resource server then subdivides the file to be exported into segments and spawns threads to handle the export of each segment in parallel. One advantage of this scheme is that data is always transferred directly between the resource server and the client, with no intermediate server in between.

A drawback of this mode is that the client may be sitting behind a firewall, in which case a server outside the firewall may not be able to connect to the client's control socket.

The -M option, which sets the transfer mode to the "client initiated connection" parallel I/O mode, was designed to get around the client firewall issue. This mode is very similar to the "server initiated" mode except for the initial handshakes. In this mode, the control socket is on the resource server and the client spawns multiple threads, each initiating a connection to the server control socket. After the connections have been established, the data transfer mechanism is the same as in the "server initiated" mode.

Compared with the "server initiated" mode, the overhead of the "client initiated" mode is slightly higher because of the more complicated initial handshakes. For large file transfers, however, the difference is practically nothing.

In addition, when the -M option is set, the [-N numThreads] option can be used to suggest to the server the number of threads to use for the parallel transfer. If this option is not used, the server decides the number of threads based on an internal algorithm.
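Both parallel modes share the same idea for the data transfer itself: the file is divided into segments and one thread handles each segment. The following is a minimal local sketch of that idea, not the SRB implementation. It copies between two local files instead of using sockets, the function names are hypothetical, and the even split is an assumption, since the server's internal algorithm is not described here.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_segments(size, num_threads):
    """Divide [0, size) into at most num_threads contiguous (offset, length) pairs."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    base, extra = divmod(size, num_threads)
    segments, offset = [], 0
    for i in range(num_threads):
        # The first `extra` segments carry one additional byte.
        length = base + (1 if i < extra else 0)
        if length == 0:
            break
        segments.append((offset, length))
        offset += length
    return segments


def copy_segment(src_path, dst_path, offset, length):
    """Copy one byte range; each worker opens its own file handles."""
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        dst.seek(offset)
        # Reads the whole segment into memory, which is fine for a sketch.
        dst.write(src.read(length))


def parallel_copy(src_path, dst_path, num_threads=4):
    """Copy src_path to dst_path using one thread per segment."""
    size = os.path.getsize(src_path)
    # Pre-size the destination so every worker can write at its own offset.
    with open(dst_path, "wb") as dst:
        dst.truncate(size)
    segments = split_into_segments(size, num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(copy_segment, src_path, dst_path, off, n)
                   for off, n in segments]
        for future in futures:
            future.result()  # re-raise any worker error
    return segments
```

Here num_threads plays the role that -N numThreads suggests to the server in the "client initiated" mode.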

The -m/M and -b options are mutually exclusive. -b is designed to process a large number of small files by concatenating many small files together and sending them to the client at once. It does not use parallel I/O. The -m/M options specify parallel I/O, which is designed to transfer large files. So, if you have one or more large files, the -m or -M option should be used. If you have many small files, the -b option should be used. If both -m/M and -b are specified, -b takes precedence.
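The interaction between these options can be summarised in a short helper that assembles an Sget command line. This is a sketch under stated assumptions: the function names and the example paths are hypothetical, and only flags described on this page are emitted.

```python
def sget_transfer_flags(bulk=False, parallel=None, num_threads=None):
    """Return the Sget flags for a transfer mode.

    parallel is None (serial, the default), "server" (-m) or "client" (-M).
    """
    if bulk:
        # -b takes precedence over -m/-M, so parallel settings are dropped.
        return ["-b"]
    if parallel is None:
        if num_threads is not None:
            raise ValueError("-N is only valid with -M")
        return []
    if parallel == "server":
        if num_threads is not None:
            raise ValueError("-N is only valid with -M")
        return ["-m"]
    if parallel == "client":
        flags = ["-M"]
        if num_threads is not None:
            flags += ["-N", str(num_threads)]
        return flags
    raise ValueError("parallel must be None, 'server' or 'client'")


def build_sget_argv(sources, target, **mode):
    """Assemble an argument list suitable for subprocess.run()."""
    return ["Sget", *sget_transfer_flags(**mode), *sources, target]
```

For example, build_sget_argv(["/home/user/big.dat"], ".", parallel="client", num_threads=8) produces the arguments for a "client initiated" transfer with eight suggested threads; the path is a placeholder.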


The -x option specifies that the bulk get operation can be restarted if it terminates prematurely; "restartFile" gives the local file path where the restart information is kept. If the specified restartFile does not exist or is empty, the bulk get operation starts from the beginning. Otherwise, the operation restarts based on the information stored in the restartFile. Note that the restartFile is not deleted even if the operation completes successfully.
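The restart rule above reduces to a check on the restartFile before Sget is invoked, plus a manual cleanup afterwards. The sketch below is illustrative only: both function names are hypothetical, and the contents of the restartFile are treated as opaque because its format is not described here.

```python
import os


def bulk_get_starts_fresh(restart_file):
    """Return True if a bulk get given this restartFile would start from the beginning."""
    # A missing restartFile means there is nothing to resume from.
    if not os.path.exists(restart_file):
        return True
    # An empty restartFile is treated the same way.
    return os.path.getsize(restart_file) == 0


def discard_restart_file(restart_file):
    """Remove a leftover restartFile; Sget itself does not delete it on success."""
    if os.path.exists(restart_file):
        os.remove(restart_file)
        return True
    return False
```

Calling discard_restart_file after a successful download avoids an unintended resume the next time the same restartFile path is passed to -x.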

OPTIONS

-h
display command options
-p
prompts before reading each object from SRB space.
-b
use bulk unload to recursively download the source SRB collection.
-n
n is an integer denoting the replica number of the object to be
copied.
-f
force copying even if the file exists.
-r
copy SRB collections recursively to the local file system.
-W
selects the copy with the given version.
-v
verbose mode. print out file size and transfer rate. If the
transfer mode is parallel, the output contains one additional item -
the number of threads used.
-V
verbose progress mode. print out progress status and ETA for
sequential transfers. Does nothing for parallel transfers. It
forces '-v' switch on.
-k
checksum mode. Retrieves simple checksum (sum -s, --sysv) from the
MCAT and compares it with the checksum of the local file just
downloaded.
-R
retry_count Retry mode. Retries on any Sget error. The maximum
retry count must be specified. Sleeps 2, 4, 8, 16, 32, ... seconds
between retries.
-s
force I/O mode to serial (default).
-m
force I/O mode to parallel. Setting the environment variable
"srbParallel" to any value achieves the same result.
-M
set I/O mode to "client initiated" parallel I/O mode.
-N
numThreads The number of threads to use for the parallel transfer.
Only valid with the -M option.
-T
option to give a filename containing a ticket
-t
option for giving a ticket directly


-X
Download using the Master MCAT. The default uses the Slave
MCAT (unless the "masterMcat" environment variable is set)
if configured.
-x
restartFile Specifies the restart File Path for the bulk get operation.


-A
condition An '&'-separated list of conditions that is
applied when choosing the object to be accessed. Each
condition is of the form "<Attr> <CompOp> <Value>", where
<Attr> is an MCAT attribute found by the Sattrs command,
<CompOp> is a comparison operator and <Value> is a numeric
or string value. The entire condition should be enclosed in
a set of double quotes. Example: Sget
-A "COPY = 0" foo
-c
condition The condition applies to the whole SRB system instead
of relative to the current working collection. Example: Sget
-c "GUID='123'" will copy all objects that have their GUID
set to 123. See Sattrs for details of applicable
conditions.
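The -k option above compares a System V style checksum (the one printed by sum -s) against the local file. A minimal sketch of that checksum follows, for readers who want to verify a download by hand. The function names are hypothetical, and it is an assumption that the value stored in the MCAT is the same 16-bit number that sum -s reports.

```python
def _fold(total):
    """Fold a 32-bit byte sum into the 16-bit System V checksum."""
    r = (total & 0xFFFF) + (total >> 16)
    return (r & 0xFFFF) + (r >> 16)


def sysv_checksum(data):
    """System V checksum of a bytes object, as computed by `sum -s`."""
    # The running sum wraps at 32 bits before it is folded.
    return _fold(sum(data) & 0xFFFFFFFF)


def sysv_checksum_file(path, chunk_size=1 << 16):
    """Same checksum, computed over a file in fixed-size chunks."""
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            total = (total + sum(chunk)) & 0xFFFFFFFF
    return _fold(total)
```

The chunked variant keeps memory use constant for large downloads and returns the same value as sysv_checksum for identical content.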

SEE ALSO

Sappend, Sput, Srsync, Scat