HTAR Manual

Archive Migration Notice

SDSC has migrated from HPSS to the Storage and Archive Manager-Quick File System (SAM-QFS). SDSC staff will be responsible for moving all the data from HPSS to SAM-QFS.

Users are requested NOT to move their own data, as this will significantly delay the migration effort.

This document is a web version of the UNIX htar manual page, and is provided here for convenience. Please check the online HTAR man page to view the most current documentation.

Last updated: 08/29/2005

htar Command



Purpose

-------

Manipulates HPSS-resident tar-format archives.



Why Use htar?

-------------


htar has been optimized for creation of archive files 

directly in HPSS, without having to go through the 

intermediate step of first creating the archive file on 

local disk storage, and then copying the archive file 

to HPSS via some other process such as ftp or hsi. 

The program uses multiple threads and a sophisticated 

buffering scheme in order to package member files into 

in-memory buffers, while making use of the high-speed 

network striping capabilities of HPSS.  

In most cases, it will be significantly faster to use 

htar to create a tar file in HPSS than to either create 

a local tar file and then copy it to HPSS, or to use tar 

piped into ftp (or hsi) to create the tar file directly 

in HPSS.



In addition, htar creates a separate index file, which 

contains the names and locations of all of the member 

files in the archive (tar) file. Individual files and 

directories in the archive can be randomly retrieved 

without having to read through the archive file. 

Because the index file is usually smaller than the 

archive file, it is possible that the index file may 

reside in HPSS disk cache even though the archive file 

has been moved offline to tape. 

Since htar uses the index file for listing operations, 

it may be possible to list the contents of the archive 

file without having to incur the time delays of reading 

the archive file back onto disk cache from tape.



It is also possible to create an index file for a tar 

file that was not originally created by htar or to recreate 

an index that has been unintentionally deleted.


Syntax

------


htar  -{c|t|x|K|X}  [-?] -f Archive [-B] [-d debuglevel] [-E]
      [-L inputlist] [-F [user@]FTP_server[#port]] [-H opt[:opt...]]
      [-h] [-I {IndexFile | .suffix}] [-M maxfiles] [-m] [-n days]
      [-o] [-p] [-q] [-s hpss_server[/port]] [-S Bufsize]
      [-T Max Threads] [Filespec | Directory ...] [-v] [-V] [-w]
      [-Y [Archive COS ID] [:Index File COS ID]]
        



The htar command manipulates HPSS-resident archives, or 

archives that reside on a remote system (subject to the 

restrictions noted below), by writing files to, or retrieving 

files from, either HPSS or a remote FTP server.  

Files written to HPSS are in the POSIX 1003.1 "tar" format, 

and may be retrieved from HPSS (or the remote system), 

and read by native "tar" programs.  



HTAR can be used to manipulate archive files that reside on a 

remote system if the following conditions are met:

1. The HTAR executable must be compiled with this feature 

enabled. If not, then the "-F" option will not be recognized.  

2. The remote system must be running a version of FTPD that 

supports the HPSS parallel transfer protocol.  



The local files used by the htar command are represented by the 

Filespec parameter. If the Filespec parameter refers to a 

directory, then that directory, and, recursively, all files and 

directories within it, are referenced as well.  

Unlike the standard UNIX "tar" command, there is no default 

archive device; the "-f Archive" flag is required.



"Archive" and "Member" files

-----------------------------


Throughout the htar documentation, the term "archive file" is 

used to refer to the tar-format  file, which is named by the 

"-f filename" command line option. The term "member file" is 

used to refer to individual files contained within the archive 

file.


HTAR Index File

----------------

As part of the process of creating an archive file on HPSS, 

htar also creates an index file, which is a directory of the 

files contained in the archive. The Index File includes the 

position of member files within the archive, so that files 

and/or directories can be randomly retrieved from the archive 

without having to read through it sequentially.

The index file is usually significantly smaller in size than 

the archive file, and may often reside in HPSS disk cache even 

though the archive file resides on tape. All htar operations 

make use of an index file.  



It is also possible to create an index file for an archive 

file that was not created by htar, by using the "Build Index" 

[-X] function (see below).



By default, the index filename is created by adding ".idx" as 

a suffix to the Archive name specified by the -f parameter.  

A different suffix or index filename may be specified by the 

"-I" option, as described below.



By default, the Index File is assumed to reside in the same 

directory as the Archive File.  This can be changed by 

specifying a relative or absolute pathname via the -I option. 

The Index file's relative pathname is relative to the Archive 

File directory unless an absolute pathname is specified. 


Use of Absolute Pathnames

-------------------------

Although htar does not restrict the use of absolute pathnames 

(pathnames that begin with a leading "/") for member files 

when the archive is created,  it will remove the leading / 

when files are extracted from the archive. All extracted 

files use pathnames that are relative to the current working 

directory.



However, when using the "verify" action (-K), absolute 

pathnames are used unless the -Hrelpaths ("relative paths") 

option is specified (see below).



HTAR Consistency File

---------------------

HTAR writes an extra file as the last member file of each 

Archive, with a name similar to:



/usr/tmp/HTAR_CF_CHK_64474_982644481



This file is used to verify the consistency of the Archive 

File and the Index File.  Unless the file is explicitly 

specified, HTAR does not extract this file from the Archive 

when the -x action is selected.  The file is listed, 

however, when the -t action is selected.



Tar File Restrictions

-----------------------

When specifying path names that are longer than 100 

characters for a POSIX 1003.1 USTAR format file, remember 

that the path name is composed of a prefix buffer, a / 

(slash), and a name buffer.



The prefix buffer can be a maximum of 155 bytes and the name 

buffer can hold a maximum of 100 bytes. Since some 

implementations of TAR require the prefix and name buffers to 

terminate with a null ('\0') character, htar enforces the 

restriction that the effective prefix buffer length is 154 

characters (+ trailing zero byte), and the name buffer length 

is 99 bytes (+ trailing zero byte). If the path name cannot 

be split into these two parts by a slash, it cannot be 

archived.  This limitation is due to the structure of the 

tar archive headers, and must be maintained for compliance 

with standards and backwards compatibility. In addition, 

the length of a destination for a hard or symbolic link 

(the 'link name') cannot exceed 100 bytes 

(99 characters + zero-byte terminator).
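
The splitting rule above can be checked before an archive is
created. The following is a minimal shell sketch, not htar's own
code. The function name ustar_fits is hypothetical, and the sketch
counts characters with ${#...}, which equals the byte count only
for single-byte (for example, ASCII) path names.

```shell
#!/bin/sh
# Return 0 if PATH can be stored under the limits described above:
# either the whole path fits in the 99-character name buffer, or it
# can be split at a "/" into a prefix of at most 154 characters and
# a name of at most 99 characters.
ustar_fits() {
    path=$1
    if [ "${#path}" -le 99 ]; then
        return 0                      # fits in the name buffer alone
    fi
    rest=$path
    while :; do
        case $rest in
            */*) rest=${rest#*/} ;;   # try the next "/" as the split point
            *)   return 1 ;;          # no usable "/" left
        esac
        prefix=${path%"/$rest"}
        if [ -n "$rest" ] && [ "${#prefix}" -le 154 ] && [ "${#rest}" -le 99 ]; then
            return 0
        fi
        if [ "${#prefix}" -gt 154 ]; then
            return 1                  # the prefix only grows from here
        fi
    done
}
```

For example, "ustar_fits "$file" || echo "cannot archive: $file""
could be used to report offending names in a directory tree.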


HPSS Default Directories

------------------------



The default directory for the Archive file is the HPSS 

home directory for the DCE user.  An absolute or relative 

HPSS path can optionally be specified for either the 

Archive file or the Index file. By default, the Index file 

is created in the same HPSS directory as the Archive file.



For the "Create" action, if the Archive file pathname 

contains subdirectories that do not already exist, the 

command will fail unless the "-P" option is used.  

This option is analogous to the "-p" option for the Un*x 

"mkdir" command.



Local Temporary Directory

--------------------------



HTAR makes use of the TMPDIR environment variable when 

creating temporary files.  If TMPDIR is not set in the 

environment, then "/tmp" is used.


HTAR Command Options

---------------------

Two groups of flags exist for the htar command; "action" 

flags and "optional" flags. Action flags specify the 

operation to be performed by the htar command, and are 

specified by one of the following:



-c, -t, -x, -X, -K



At least one action flag must be selected in order for 

the htar command to perform any useful function (note: 

in the initial implementation, one and only one action 

can be specified per execution).



Filespec



A file specification has one of the following forms:



WildcardPath

or

Pathname 

or

Filename  



"WildcardPath" is a path specification that includes 

standard filename pattern-matching characters, as 

specified for the shell that is being used to invoke htar.  

The pattern-matching characters are expanded by the shell 

and passed to htar as command line arguments.



Note that using wildcard characters for the -t  and -x 

actions may not work as expected unless there are existing 

local files that match the pattern. For example, 



htar -xf someFile.tar a*

        

will only extract files beginning with "a" in "someFile.tar" 

that also already exist in the current local working directory.
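
The effect of shell expansion can be seen without running htar at
all. The sketch below is illustrative only: show_args and
demo_wildcards are hypothetical helpers, and the behavior shown for
an unmatched pattern is that of POSIX sh and bash with default
settings. It prints the argument list exactly as a command would
receive it.

```shell
#!/bin/sh
# Print each argument in brackets, as the invoked command would see it.
show_args() { printf '[%s]' "$@"; printf '\n'; }

# Runs in a subshell so the caller's working directory is untouched.
demo_wildcards() (
    workdir=$(mktemp -d) || exit 1
    cd "$workdir" || exit 1
    touch alpha.c apple.txt
    show_args a*       # local matches exist: prints [alpha.c][apple.txt]
    show_args b*       # no local match: pattern passed through as [b*]
    show_args 'a*'     # quoted: never expanded, prints [a*]
    cd / && rm -rf "$workdir"
)

demo_wildcards
```

The first case is why the htar example above sees only the names of
files that already exist locally.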


Action Flags

-------------

Action flags defined for htar are as follows:



-c      Creates a new HPSS-resident archive, and writes 

the local files specified by one or more File parameters 

into the archive.  Warning: any preexisting archive file 

will be overwritten without prompting. This behavior 

mimics that of the AIX tar utility.



-t      Lists the member files in the order in which they 

appear in the HPSS-resident archive. Listable output is 

written to standard output; all other output is written to 

standard error.



-x      Extracts the member files specified by one or more 

File parameters from the archive. If the File parameter 

refers to a directory, the htar command recursively extracts 

that directory and all of its subdirectories from the archive. 

If the File parameter is not specified, htar extracts all of 

the files from the archive. If an archive contains multiple 

copies of the same file, the last copy extracted overwrites 

all previously extracted copies.  If the file being extracted 

does not already exist on the system, it is created. If you 

have the proper permissions, then the htar command restores all 

files and directories with the same owner and group IDs as 

they have on the HPSS tar file. If you do not have the 

proper permissions, then files and directories are restored 

with your owner and group IDs.  



-K      Verifies the contents of the archive, based upon the 

verification level options given by the -Hverify and 

-Hrelpaths options.  



-X      Builds a new index file by reading the entire tar file. 

This operation is used either to reconstruct an index for tar 

files whose Index File is unavailable (e.g., accidentally 

deleted), or for tar files that were not originally created by 

htar. 


Options

-------



-?      Displays htar's verbose help



-B      Displays block numbers as part of the listing 

        (-t option). 

This is normally used only for debugging.



-d debuglevel   Sets debug level (0 - N) for htar. 

0 disables debug, 1 - n enable progressively higher 

levels of debug output. 5 is the highest level; 

anything > 5 is silently mapped to 5.



-E      If present, specifies that a local file should 

be used for the file specified by the "-f Archive" 

option.  If not specified, then the archive file will 

reside in HPSS.  



-F [user@]FTP_server[#port]  Specifies that the archive 

file resides on a remote system that runs a version of 

FTPD which supports the HPSS parallel file transfer 

protocol.  This option is only available if HTAR was 

compiled with this capability enabled.  If not, then 

this option will not be recognized, and will cause a 

command line error to be generated.  



Any optional parts of the parameter following -F must 

not contain any whitespace characters. The remote 

username can be specified by the "user@" prefix. The

username can also be of the form "user@realm@" for remote 

Kerberos realms; anything preceding the rightmost "@" in 

the prefix is assumed to be the username on the remote 

system.  The remote port to which HTAR should connect can 

be specified by the optional "#port" suffix; the default 

FTP port (port 21) will be used if a port is not specified.
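
The parsing rules for the -F parameter can be sketched in shell as
follows. This is only an illustration of the rules stated above,
not htar's implementation; the function name parse_ftp_target and
the user, realm, and host names in the usage note are hypothetical.

```shell
#!/bin/sh
# Split "[user@]FTP_server[#port]" into its parts.
parse_ftp_target() {
    target=$1
    user=
    port=21                               # default FTP port
    case $target in
        *@*) user=${target%@*}            # everything before the rightmost "@"
             target=${target##*@} ;;
    esac
    case $target in
        *'#'*) port=${target##*'#'}       # optional "#port" suffix
               target=${target%'#'*} ;;
    esac
    printf 'user=%s host=%s port=%s\n' "$user" "$target" "$port"
}
```

With this sketch, "parse_ftp_target alice@REALM.EXAMPLE@ftp.example.org#2121"
reports the user as "alice@REALM.EXAMPLE", the host as
"ftp.example.org", and the port as 2121.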



-f Archive   Uses "Archive" as the name of the archive to be 

read or written. Note: This is a required parameter for 

htar, unlike the standard tar utility, which uses a built-in 

default name.  If the Archive variable specified is - 

(minus sign), the htar command writes to standard output or 

reads from standard input.  If you write to standard output, 

the -I option is mandatory, in order to specify an Index 

File, which is copied to HPSS if the Archive file is 

successfully written to standard output.  [Note: this behavior 

is deferred - reading from or writing to pipes is not supported 

in the initial version of htar].



-H opt[:opt...]  Specifies HTAR-specific options.  Multiple 

"-H" parameters may be specified, and multiple colon-separated 

options may be specified for each -H.  Options may be either 

standalone keywords, or may be of the form "opt=value".

The option string must not contain whitespace characters.  



Opt may be any of the following:



nostage  - specifies that HTAR should try to read the 

archive file directly from tape for read operations 

such as -x (extract), rather than having HPSS potentially 

stage the entire file onto disk cache when it is opened.  

This option can be useful when only a small number of files 

are being extracted from a large archive. However, misuse of 

this option can cause HPSS tape drive resource contention, 

and should normally be used only after coordinating with the 

site's HPSS administrators.  

crc - specifies that HTAR should generate CRC checksums when 

creating the archive. For extract (-x), specifying this option 

will cause checksums to be regenerated and verified for files 

that were added to the archive with checksums enabled. For 

build index (-X), this option will cause the archive to be 

read, and a checksum to be added to the index. For list (-t) 

operations, this option will cause the checksum to be listed 

following the object permissions.



nocrc - specifies that HTAR should not generate 

CRC checksums when writing to the archive (-c or -X) or 

regenerate and compare CRCs (-x).



port=port_number - specifies the port that HTAR should 

use when connecting to the HPSS server.  This option has 

meaning only if the -Hserver option is specified (see below).

                

rmlocal  - specifies that HTAR should attempt to remove local 

files on a creation run (-c) after the archive is created and 

any post-transfer verification has completed without errors.  

Only local files that were successfully copied to the archive 

will be removed.  Local directories are not affected by this 

option, only files and symbolic links.  

server=server_host - specifies the HPSS server that HTAR should 

connect to when it starts up.  This is normally unnecessary, 

but can be specified if it is necessary to override the builtin 

setting.  The "-Hport" option can also be specified to override 

the default port to which HTAR connects.  

verify=option[,option...] - specifies one or more verification 

options that should be performed following successful creation 

of the archive (-c), or for the "verify" (-K) command.  

Multiple options can be specified by separating them with a 

comma, with no whitespace. Options are processed from left to 

right, and, in the case of conflicting options, the last one 

encountered is used without comment.



Options are as follows:

        

info        - compares tar header info with the corresponding 

values in the index

crc/nocrc   - enables/disables CRC checking of archive files for which 

a CRC was generated when the file was added to the archive



compare/nocompare - enables/disables a byte-by-byte comparison 

of archive member files and their local file counterparts. 

If -Hrelpaths is not specified, then absolute paths for member 

files in the archive will also be treated as absolute local

paths.



0           - enables "info" verification

1           - enables level 0 + "crc" verification

2           - enables level 1 + "compare" verification

all         - enables all comparison options (currently, 

tar hdr checking, CRC checking, and local file comparisons).
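
The left-to-right processing of verify options can be pictured
with the following shell sketch. It is not htar's code: the
function name expand_verify is hypothetical, and it assumes that
the numeric levels and "all" only switch checks on and never
switch an earlier check off, which the text above does not state.

```shell
#!/bin/sh
# Expand a comma-separated -Hverify= value into the three checks.
# Later options win when they conflict with earlier ones.
expand_verify() {
    info=0; crc=0; compare=0
    old_ifs=$IFS
    IFS=,
    for opt in $1; do
        case $opt in
            info)      info=1 ;;
            crc)       crc=1 ;;
            nocrc)     crc=0 ;;
            compare)   compare=1 ;;
            nocompare) compare=0 ;;
            0)         info=1 ;;
            1)         info=1; crc=1 ;;
            2|all)     info=1; crc=1; compare=1 ;;
            *)         IFS=$old_ifs
                       echo "unknown verify option: $opt" >&2
                       return 1 ;;
        esac
    done
    IFS=$old_ifs
    echo "info=$info crc=$crc compare=$compare"
}
```

Under these assumptions, "expand_verify all,nocompare" gives the
same result as "expand_verify 1".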



relpaths  - specifies that HTAR should use relative paths 

instead of absolute paths when comparing the archive and 

local member files.  This option was intended to provide a 

way to compare files with absolute paths on an archive 

with member file(s) that were created with relative paths 

by a previous "extract" (-x) action.  



-h      Forces the htar command to follow symbolic links as 

if they were normal files or directories. Normally, the htar 

command does not follow symbolic links.



-I index_name   Specifies the index file name or suffix.  

If the first character of the index_name is a period, 

then index_name is appended to the Archive name, e.g. 

"-f the_htar -I .xndx" would create an index file called 

"the_htar.xndx".  If the first character is not a period, 

then index_name is treated as a relative pathname for the 

index file (relative to the Archive file directory) if the 

pathname does not start with "/", or an absolute pathname 

otherwise.



The default directory for the Index file is the same as for 

the Archive file.  If a relative Index file pathname is 

specified, then it is appended to the directory path for 

the Archive file.  For example, if the Archive file resides 

in HPSS in the directory "projects/prj/" and is called 

files.tar, then an Index file specification of 

"-I projects/prj/files.old.idx" would fail, because htar 

would look for the file in the directory 

"projects/prj/projects/prj".  The correct specification 

in this case is "-I files.old.idx".
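
The naming rules for the Index file can be summarized in a short
shell sketch. This restates the rules above and is not taken from
htar itself; the function name index_path is hypothetical.

```shell
#!/bin/sh
# Usage: index_path ARCHIVE [INDEX_ARG]
# Prints the Index file pathname implied by "-f ARCHIVE -I INDEX_ARG".
index_path() {
    archive=$1
    index_arg=${2-}
    case $index_arg in
        "")  printf '%s\n' "$archive.idx" ;;        # default ".idx" suffix
        .*)  printf '%s\n' "$archive$index_arg" ;;  # leading period: a suffix
        /*)  printf '%s\n' "$index_arg" ;;          # absolute pathname
        *)                                          # relative to the Archive directory
            case $archive in
                */*) printf '%s\n' "${archive%/*}/$index_arg" ;;
                *)   printf '%s\n' "$index_arg" ;;
            esac ;;
    esac
}
```

For the failing case described above,
"index_path projects/prj/files.tar projects/prj/files.old.idx"
prints "projects/prj/projects/prj/files.old.idx", while passing
"files.old.idx" prints "projects/prj/files.old.idx".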



-L InputList    Writes the files and directories listed in 

the "InputList" file to the archive. Directories named in 

the InputList file are treated recursively (Note: this was 

not the case for earlier versions of HTAR). Note that 

"home directory" notation ("~") is not expanded for pathnames 

contained in the InputList file, nor are wildcard characters, 

such as "*" and "?".



-M maxfiles      Sets the maximum number of member files that 

can be contained in the archive when it is initially created. 

The default maximum number of member files, and an absolute 

maximum number of files, are defined when HTAR is built. 

No limit will be enforced if: 

- The default maximum number of files was set to a negative 

value when HTAR was built, and the -M option is NOT specified, 

or - A value less than 0 is specified for the -M option, and 

the absolute maximum number of files was also set to a negative 

value when HTAR was built.  

If the value specified for the -M option exceeds the absolute 

maximum value that was defined when HTAR was built, HTAR will 

issue a warning message, and use the absolute maximum value.



-m      Uses the time of extraction as the modification time. 

The default is to preserve the modification time of the files. 

Note that the modification time of directories is not 

guaranteed to be preserved, since the operating system may 

change the timestamp as the directory contents are changed by 

extracting other files and/or directories.  htar will 

explicitly set the timestamp on directories that it extracts 

from the Archive, but not on intermediate directories that are 

created during the process of extracting files. 



-n time   Meaningful only for <create> (-c) action.  If 

specified, only files that have been created or modified 

within the specified time will be included in the archive.  

This option is intended to simplify the creation of incremental 

backups. "time" is specified in one of the following forms:

days

:hours

days:hours
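
The three forms can be reduced to a single number of hours, as in
the following shell sketch. It only illustrates the syntax; the
function name n_time_to_hours is hypothetical, and the rejection
of leading zeros is a choice made here to keep the shell
arithmetic decimal, not a documented htar rule.

```shell
#!/bin/sh
# Convert "days", ":hours", or "days:hours" to a total number of hours.
n_time_to_hours() {
    spec=$1
    case $spec in
        *:*) days=${spec%%:*}; hours=${spec#*:} ;;
        *)   days=$spec;       hours= ;;
    esac
    # Reject non-digits and leading zeros in either part.
    for part in "$days" "$hours"; do
        case $part in
            0?*|*[!0-9]*) echo "invalid time: $spec" >&2; return 1 ;;
        esac
    done
    if [ -z "$days$hours" ]; then
        echo "invalid time: $spec" >&2
        return 1
    fi
    echo $(( ${days:-0} * 24 + ${hours:-0} ))
}
```

For example, "2" yields 48, ":6" yields 6, and "1:12" yields 36.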



-o      Provides backwards compatibility with older 

versions (non-AIX) of the tar command. When this flag is 

used for reading, it causes the extracted file to take on 

the User and Group ID (UID and GID) of the user running 

the program, rather than those on the archive.  This is 

the default behavior for the ordinary user. If htar is 

being run as root, use of this option causes files to be 

owned by root rather than the original user.



-O      If specified, files that are extracted using 

the -x option will be written to standard output.  

This is normally only useful when extracting a single 

file, as there is nothing in the output stream to mark 

the end of file.



-p      Restores files to their original modes, 

ignoring the present umask. The setuid, setgid, and 

sticky bit permissions are also restored for a user with 

root user authority.


-P      This option is only meaningful for the "create" 

action.  It causes intermediate subdirectories for the 

Archive file pathname to be created if they do not already 

exist.  NOTE: This option is currently implemented for 

HPSS-resident and local-file-resident (-E option) archives.  

It has not yet been implemented for FTP-resident archives

(-F option); for FTP-resident archives the option is 

accepted, but ignored.  



-q      "quiet mode" flag.  If this option is specified, 

htar will not display extraneous messages, such as the 

interactive progress messages as it scans directories during 

a "create" operation.  



-S bufsize      Specifies the buffer size to use when 

reading or writing the HPSS tar file.  The buffer size 

can be specified as a value, or as kilobytes by appending 

any of  "k","K","kb", or "KB" to the value.  It can also be 

specified as megabytes by appending any of  "m" or "M" or 

"mb" or "MB" to the value, for example, 23mb.  

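The suffix forms accepted by -S can be illustrated with a small
shell sketch. This is not htar's parser: the function name
bufsize_to_bytes is hypothetical, and the sketch assumes that a
kilobyte is 1024 bytes and a megabyte is 1024 * 1024 bytes, which
the text above does not specify.

```shell
#!/bin/sh
# Convert a -S style value such as "23mb" or "512k" to bytes.
# ASSUMPTION: binary multipliers (1024 and 1048576).
bufsize_to_bytes() {
    spec=$1
    case $spec in
        *kb|*KB) num=${spec%??}; mult=1024 ;;
        *k|*K)   num=${spec%?};  mult=1024 ;;
        *mb|*MB) num=${spec%??}; mult=1048576 ;;
        *m|*M)   num=${spec%?};  mult=1048576 ;;
        *)       num=$spec;      mult=1 ;;
    esac
    # Reject empty values, non-digits, and leading zeros.
    case $num in
        ''|0?*|*[!0-9]*) echo "invalid buffer size: $spec" >&2; return 1 ;;
    esac
    echo $(( num * mult ))
}
```

Under these assumptions, "23mb" corresponds to 24117248 bytes.
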


-T Max Threads      Specifies the maximum number of threads 

to use when copying local member files to the Archive file.  

The default is defined when htar is built; the release value 

is 15.  The maximum number of threads actually used is 

dependent upon the local file sizes, and the size of the I/O 

buffers.  A good approximation is usually buffer size/average 

file size. If the -v or -V option is specified, then the maximum 

number of local file threads  used while writing the Archive 

file to HPSS is displayed when the transfer is complete.



-V      "Slightly verbose" mode. If selected, file transfer 

progress will be displayed in interactive mode. This option 

should normally not be selected if verbose (-v) mode is 

enabled, as the outputs for the two different options are 

generated by separate threads, and may be intermixed on the 

output.



-v      "Verbose" mode. For each file processed, displays a 

one-character operation flag, and lists the name of each file. 

The flag values displayed are:

"a"  - file was added to the archive 

"x"  - file was extracted from the archive

"i"  - index file entry was created (Build Index operation)



-w      Displays the action to be taken, followed by the 

file name, and then waits for user confirmation. If the 

response is affirmative, the action is performed. If the 

response is not affirmative, the file is ignored.



-Y auto | [Archive CosID][:IndexCosID] Specifies the HPSS 

Class of Service ID to use when creating a new Archive and/or 

Index file.  If the keyword "auto" is specified, then the 

HPSS "hints" mechanism is used to select the archive COS, 

based upon file size.  If "-Y cosID"  is specified, then 

"cosID" is the numeric COS ID to be used for the Archive File.  

If  "-Y :IndexCosID" is specified, then "IndexCosID" is the 

numeric COS ID to be  used for the Index File. The default 

COS ID (or "auto") is a site-specific option that is defined 

when HTAR is built. If both COS IDs are specified, the entire 

parameter must be specified as a single string with no 

embedded spaces, e.g. "-Y 40:30". This option may also be 

specified by the "HTAR_COS" environment variable. The 

environment variable is overridden by the -Y command line 

option, if both are used.
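
The forms of the -Y value, and its precedence over HTAR_COS, can
be sketched in shell as follows. The function name parse_cos is
hypothetical, and the word "default" in its output simply marks a
COS ID that was not given; the sketch does not model how htar
picks its built-in default.

```shell
#!/bin/sh
# Usage: parse_cos [Y_VALUE]
# Falls back to the HTAR_COS environment variable when no -Y value
# is given, mirroring the precedence described above.
parse_cos() {
    spec=${1:-${HTAR_COS:-}}
    archive_cos=
    index_cos=
    case $spec in
        auto) archive_cos=auto ;;
        *:*)  archive_cos=${spec%%:*}     # part before the colon
              index_cos=${spec#*:} ;;     # part after the colon
        *)    archive_cos=$spec ;;
    esac
    printf 'archive=%s index=%s\n' "${archive_cos:-default}" "${index_cos:-default}"
}
```

For example, "parse_cos 40:30" reports 40 for the Archive file and
30 for the Index file, and "parse_cos :30" sets only the Index
file COS ID.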



HTAR Memory Restrictions

-------------------------

When writing to an HPSS archive, the htar command uses a 

temporary file (normally in /tmp) and maintains in memory 

a table of files; you receive an error message if htar cannot 

create the temporary file, or if there is not enough memory 

available to hold the internal tables.



Authentication

---------------

HTAR uses Kerberos authentication in order to grant access to 

HPSS.  For most LLNL systems on which htar is supported, you 

obtain the necessary credentials automatically when you login 

to the system.  If Kerberos credentials are not available when 

HTAR is started, it will run the kinit program to obtain them, 

and these credentials may be used for subsequent HTAR 

invocations, until such time as they expire.



Supported Platforms at LLNL

----------------------------

HTAR is supported on all LC production platforms, and can be 

found at /usr/local/bin/htar.

HTAR Execution Environment

---------------------------

At LLNL, HTAR is actually a wrapper script, which sets the

proper environment variables and then execs the htar executable.



HTAR makes use of the following HPSS environment variables,

if they are available:



HTAR_COS - set to the default COS ID for the archive file,

or the string "auto" to force automatic COS selection based

upon file size hints.  This environment variable is overridden

by the -Y command line option.



HPSS_SERVER_HOST - contains the server hostname and optional

  port number of the HTAR server.



HPSS_HOSTNAME - contains the hostname or IP address of the

  network interface to which HPSS mover(s) should connect 

  when transferring data.  This is overridden by the file

  specified in the PFTP_CONFIG_FILENAME environment variable.

  The default interface is the one specified by the "hostname" 

  command.  Note that this is often a slow interface, such as

  the control ethernet on an IBM SP2.



HPSS_PATH_ETC - pathname of a local directory containing

  the HPSS network options file



PFTP_CONFIG_FILENAME - pathname of a file containing the list

  of HPSS network interfaces to be used 



HTAR also references the following non-HPSS environment 

variables:



TMPDIR - used when creating temporary files 

HOME   - used when searching for the network options file 

         (normally only used by HPSS system administrators).


Notes: 

-------

1. The maximum size of a single Member file within the Archive 

is approximately 8 GB, due to restrictions in the format of the

tar header.  HTAR does not impose any restriction on the size

of the Archive File when it is written to HPSS; however, space

quotas or other system restrictions may limit the size of the 

Archive File when it is written to a local file (-E option).



2.  HTAR will optionally write to a local file; however, it 

will not write to any file type except "regular files".  In 

particular, it is not suitable for writing to magnetic tape.  

To write to a magnetic tape device, use the "tar" or "cpio" 

utility.


Exit Status



This command returns the following exit values:



0       Successful completion.



>0      An error occurred.
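
In scripts, the exit value can be tested directly. The following
is a minimal shell sketch; run_checked is a hypothetical helper
that works with any command, and the htar invocation in the
closing note reuses the first example from the Examples section.

```shell
#!/bin/sh
# Run a command and report success or failure from its exit status.
run_checked() {
    if "$@"; then
        echo "ok: $*"
    else
        status=$?                         # exit value of the failed command
        echo "failed (exit $status): $*" >&2
        return "$status"
    fi
}
```

On a system where htar is installed, this could be used as
"run_checked htar -cf files.tar file1 file2" so that a nonzero
exit value stops a larger backup script.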


Examples



1.      To write the file1 and file2 files to a new archive 

called "files.tar" in the current HPSS home directory, enter:



htar -cf files.tar file1 file2



2.      To write the file1 and file2 files to a new archive 

called "files.tar" on a remote FTP server called 

"blue.pacific.llnl.gov", creating the tar file in the user's 

remote FTP home directory, enter:


htar -cf files.tar -F blue.pacific.llnl.gov file1 file2



3.      To extract all files from the project1/src directory 

in the Archive file called proj1.tar, and use the time of 

extraction as the modification time, enter:



htar -xm -f proj1.tar project1/src



4.      To display the names of the files in the out.tar 

archive file within the HPSS home directory, enter:



htar -vtf out.tar



Files



/usr/local/bin/htar       Specifies the name of the htar 
                          wrapper script.



/usr/local/bin/htar.exe   Contains the htar executable.



/tmp/tar*       Specifies a temporary file.


Related Information



For file archivers: the cat command, dd command, pax command.

For HPSS file transfer programs: pftp, nft, hsi





File Systems Overview for System Management in AIX Version 4 

System Management Guide: Operating System and Devices explains 

file system types, management, structure, and maintenance.



Directory Overview in AIX Version 4 Files Reference explains 

working with directories and path names.



Files Overview in AIX Version 4 System User's Guide: Operating 

System and Devices provides information on working with files.


Bugs and Limitations: 

------------

- There is no way to specify relative Index file pathnames 

that are not rooted in the Archive file directory without 

specifying an absolute path.



- HTAR does not provide the ability to append, update or 

remove files.
