Sput

NAME

Sput - imports one or more local files and/or directories into SRB space.

SYNOPSIS

Sput [-fprabvsmMkKV] [-c container] [-D dataType] [-n replNum] [-N numThreads] [-S resourceName] [-P pathName] [-R retry_count] [-x restartFile] localFileName|localDirectory ... TargetName

Sput -i [-a] [-c container] [-D dataType] [-S resourceName] [-P pathName] [-R,--retry retry_count] [-k] [-K] TargetName

DESCRIPTION

Sput reads one or more local files and/or directories and writes them as objects in SRB space. Each new SRB object is also registered in the MCAT.

The second synopsis reads data from standard input and writes it as an object in SRB space.

The localFileName and/or localDirectory arguments can be path names in the local file hierarchy. The user must have at least 'read' access permission for them. These arguments can contain wildcards.

The TargetName can be a path name in the collection hierarchy. If TargetName is just an object name, the object is created in the current collection. If TargetName includes a relative or absolute collection path, the object is stored in that collection. The user must have write access permission for the collection. '.' can be used as a TargetName to denote the current collection.

If TargetName is a collection, then Sput uses the names of the local files as SRB object names. The directory path of localFileName is not used in making the SRB object name.

If TargetName is an object name (possibly with a collection path) and there is more than one local file to be copied, then TargetName is prepended to each local file name to make the SRB object names.
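
The naming rules above can be sketched in a few lines of Python. This is only an illustration, not SRB client code: the function name srb_object_names and the target_is_collection flag are hypothetical, and the sketch assumes that only the base name of each local file is used in both cases.

```python
import posixpath


def srb_object_names(local_files, target_name, target_is_collection):
    """Return the SRB object names that the naming rules would produce.

    Assumption: the directory part of each local path is always dropped.
    """
    base_names = [posixpath.basename(path) for path in local_files]
    if target_is_collection:
        # Objects keep the local file names inside the target collection.
        return [posixpath.join(target_name, name) for name in base_names]
    if len(local_files) == 1:
        # A single file is stored under the given object name.
        return [target_name]
    # Several files: TargetName is put in front of each local file name.
    return [target_name + name for name in base_names]
```

For example, srb_object_names(["/tmp/a.txt", "b.txt"], "run1_", False) yields "run1_a.txt" and "run1_b.txt".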

If given, the data type of the object is set to dataType. Otherwise, Sput uses the file extension (i.e. the string after the last '.') to determine the type. If it cannot determine the data type, the 'generic' data type is used. Most popular extensions in Unix and DOS will be supported. The data type discovery is done in the MCAT, so future extensions can be supported without any change to or recompilation of the client code.
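
As a rough illustration of this fallback order (explicit dataType first, then the string after the last '.', then 'generic'), consider the Python sketch below. The real lookup happens in the MCAT; the table EXTENSION_TYPES, its entries, and the function name guess_data_type are invented for the example.

```python
# Hypothetical extension table. In SRB the real mapping lives in the MCAT.
EXTENSION_TYPES = {"txt": "text", "html": "html", "gif": "gif image"}


def guess_data_type(file_name, explicit_type=None, table=EXTENSION_TYPES):
    """Pick a data type: explicit -D value, then extension, then 'generic'."""
    if explicit_type:
        return explicit_type
    # rpartition splits on the last '.'; dot is empty if there is none.
    _, dot, extension = file_name.rpartition(".")
    if not dot:
        return "generic"
    return table.get(extension, "generic")
```

Because the lookup table is passed in as data, new extensions can be added without touching the function, which mirrors the design choice described above.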

If given, the object is stored in resourceName. Otherwise, the object is stored in the default resource given by DEFRESOURCE in the user environment file, located at ~/.srb/.MdasEnv.

If given, the pathName supplied with the -P option specifies the physical path where the target will be stored, instead of the default vault path associated with the resource. If the source is a file, pathName specifies the full physical path of the target. If the source is a directory, pathName specifies the base directory of the target, under which the entire directory structure of the source will be replicated. Currently, the SRB server will only create new directories and subdirectories when the input pathName is within the vault path associated with the resource.


By default, Sput uses the serial I/O API to do the put. The -m option, or setting the environment variable "srbParallel" to any value, sets the transfer mode to the "server initiated connection" parallel I/O mode. In this mode, the server is the active partner. The client listens passively on the control socket for instructions from the servers. Upon receiving the client's initial request, the server, with the help of information from the MCAT, plans the execution of the data upload. Typically it sends data transfer instructions to the server where the import resource is located. The resource server then subdivides the file to be imported into segments and spawns threads to handle the import of each segment in parallel. One advantage of this scheme is that data transfer is always directly between the resource server and the client, with no intermediate server in between.

A drawback of this mode is that if there is a firewall between the client and the server, this method will fail.

The -M option, which sets the transfer mode to the "client initiated connection" parallel I/O mode, was designed to get around the client's firewall issue. This mode is very similar to the "server initiated" mode except for the initial handshakes. In this mode, the control socket is on the resource server and the client spawns multiple threads, each initiating a connection to the server control socket. After the connections have been established, the data transfer mechanism is the same as in the "server initiated" mode.

Compared with the "server initiated" mode, the overhead of the "client initiated" mode is slightly higher because of the more complicated initial handshakes. For large file transfers, however, the difference is negligible.

In addition, with the -M option set, the -N numThreads option can be used to suggest to the server the number of threads to use for the parallel transfer. If this option is not used, the server decides the number of threads based on an internal algorithm.
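
To make the idea of subdividing a file into segments for parallel threads concrete, here is a minimal Python sketch. It is not the SRB server's algorithm, which is not described here; the even split and the name split_into_segments are assumptions made purely for illustration.

```python
def split_into_segments(file_size, num_threads):
    """Split file_size bytes into contiguous (offset, length) segments.

    Assumption: segments are as equal as possible, one per thread.
    """
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    base, extra = divmod(file_size, num_threads)
    segments = []
    offset = 0
    for index in range(num_threads):
        # The first 'extra' segments take one additional byte each.
        length = base + (1 if index < extra else 0)
        if length == 0:
            continue  # fewer bytes than threads: skip empty segments
        segments.append((offset, length))
        offset += length
    return segments
```

Each thread would then transfer only its own byte range, so the segments can be moved independently and in any order.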

The -b option is for bulk loading of a large number of small files. It behaves like the -r option but runs much faster. Bulk loading of files into a container is not supported by this command; the Sbload command should be used instead.

The -m/M and -b options are mutually exclusive. -b is designed to process a large number of small files by concatenating many small files together and sending them to the server at once. It does not use parallel I/O. The -m/M option specifies parallel I/O, which is designed to transfer large files. So, if you have one or more large files, the -m or -M option should be used. If you have many small files, the -b option should be used. If both the -m/M and -b options are specified, -b takes precedence.
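
The precedence between these options can be summarized as a small decision function. The Python sketch below is illustrative only: the function name, its parameters, and the returned labels are hypothetical, and it assumes that an explicit -M wins over the "srbParallel" environment variable, which the text above does not state.

```python
def select_transfer_mode(bulk=False, parallel_flag=None, srb_parallel_env=False):
    """Return the transfer mode implied by -b, -m/-M and srbParallel.

    parallel_flag is None, "m" or "M".
    """
    if parallel_flag not in (None, "m", "M"):
        raise ValueError("parallel_flag must be None, 'm' or 'M'")
    if bulk:
        # -b takes precedence over -m/-M and never uses parallel I/O.
        return "bulk"
    if parallel_flag == "M":
        # Assumption: explicit -M overrides the environment variable.
        return "client-initiated parallel"
    if parallel_flag == "m" or srb_parallel_env:
        return "server-initiated parallel"
    return "serial"
```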


The -x option specifies that the bulk put operation can be restarted if it terminates prematurely; restartFile specifies the local file path for the restart information. If the specified restartFile does not exist or is empty, the bulk put operation is assumed to start from the beginning. Otherwise, the bulk put operation will restart based on the information stored in the restartFile. Note that the restartFile is not deleted even if the operation completes successfully.
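
The start-or-resume decision described for -x depends only on whether the restart file exists and is non-empty. A minimal Python sketch of that check follows; the function name should_resume is hypothetical, and the contents of the restart file are not interpreted because their format is not described here.

```python
import os


def should_resume(restart_file):
    """Return True if a bulk put would resume from restart_file.

    A missing or empty restart file means starting from the beginning.
    """
    return os.path.isfile(restart_file) and os.path.getsize(restart_file) > 0
```

Since the restart file is kept after a successful run, a script that reuses the same path for a new upload may want to remove or empty the file first.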

OPTIONS

-h
display command options
-p
prompts before writing each object to SRB space.
-f
force copying even if the object exists, in which case it is overwritten. Many of the metadata options, such as -T, -P, -S, -c, etc., are ignored.
-a
If the target object does not exist and the target resource is a logical resource consisting of more than one resource, make a copy in each resource. If the target object exists, force copying to all replicas of the target object.
-b
bulk loading directories recursively to SRB space.
-r
copy local directories recursively to SRB space.
-v
verbose mode. print out file size and transfer rate. If the transfer mode is parallel, the output contains one additional item - the number of threads used.
-V
verbose progress mode. print out progress status and ETA for sequential transfers. Does nothing for parallel transfers. It forces the '-v' switch on.
-k
client checksum mode. Client computes simple checksum (sum -s, --sysv) of the local file and registers with MCAT. No verification is done on the server side.
-K
checksum verification mode. After the transfer, the server computes the checksum by reading back the file that was just stored. This value is then compared with the source checksum value provided by the client for verification. This verified checksum value is then registered with MCAT.
-R
retry_count Retry mode. Retries on any Sput error. The maximum retry count must be specified. Sleeps 2, 4, 8, 16, 32, ... seconds between retries.
-s
switch I/O mode to serial (default).
-m
set I/O mode to "server initiated" parallel I/O. Setting the environment variable "srbParallel" to any value achieves the same result.
-M
set I/O mode to "client initiated" parallel I/O mode.
-N
numThreads The number of threads to use for the parallel transfer. Only valid with the -M option.
-c
container is the container where the target object will be stored.
-i
read data from standard input.
-S
resourceName is the target resourceName.
-n
replNum Overwrite only the replica with replica number replNum. This option is only valid if TargetName is a file and not a collection.
-P
pathName Specifies the physical path where the target will be stored instead of using the default vault path associated with the resource.
-x
restartFile Specifies the restart file path for the bulk put operation.
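
The sleep schedule given for the -R option (2, 4, 8, 16, 32, ... seconds) is a doubling backoff. The Python sketch below shows one way such a retry loop could look. It is not Sput's implementation: the names retry_delays and run_with_retries are hypothetical, and it assumes the retry count excludes the first attempt.

```python
import time


def retry_delays(retry_count):
    """Return the sleep times in seconds before each retry: 2, 4, 8, ..."""
    return [2 ** (attempt + 1) for attempt in range(retry_count)]


def run_with_retries(operation, retry_count, sleep=time.sleep):
    """Call operation(); on any error, sleep and retry up to retry_count times.

    Assumption: retry_count does not include the initial attempt.
    """
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= retry_count:
                raise  # retries exhausted: report the last error
            sleep(2 ** (attempt + 1))
            attempt += 1
```

With a retry count of 5, the waits add up to 2 + 4 + 8 + 16 + 32 = 62 seconds before the final failure is reported.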

SEE ALSO

Sappend, Sget, Srsync, Stoken, SgetR, Smkdir