HTAR
Submitted by jeromel on Thu, 2005-12-22 12:36
To use htar within the HPSS environment, users must have
valid Kerberos credentials.
The htar man page follows.
NAME
htar - HPSS tar utility
PURPOSE
Manipulates HPSS-resident tar-format archives.
SYNOPSIS
htar -{c|t|x|X} -f Archive [-?] [-B] [-E] [-L inputlist] [-h] [-m] [-o]
[-d debuglevel] [-p] [-v] [-V] [-w]
[-I {IndexFile | .suffix}] [-Y [Archive COS ID][:Index File COS ID]]
[-S Bufsize] [-T Max Threads] [Filespec | Directory ...]
DESCRIPTION
htar is a utility which manipulates HPSS-resident archives
by writing files to, or retrieving files from, the High
Performance Storage System (HPSS). Files written to HPSS
are in the POSIX 1003.1 "tar" format, and may be retrieved
from HPSS, or read by native tar programs.
For those unfamiliar with HPSS, an introduction can be found
on the web at
http://www.sdsc.edu/hpss
The local files used by the htar command are represented by
the Filespec parameter. If the Filespec parameter refers to
a directory, then that directory, and, recursively, all
files and directories within it, are referenced as well.
Unlike the standard Unix "tar" command, there is no default
archive device; the "-f Archive" flag is required.
Archive and Member files
Throughout the htar documentation, the term "archive file"
is used to refer to the tar-format file, which is named by
the "-f filename" command line option. The term "member
file" is used to refer to individual files contained within
the archive file.
WHY USE HTAR
htar has been optimized for creation of archive files
directly in HPSS, without having to go through the
intermediate step of first creating the archive file on
local disk storage, and then copying the archive file to
HPSS via some other process such as ftp or hsi. The program
uses multiple threads and a sophisticated buffering scheme
in order to package member files into in-memory buffers,
while making use of the high-speed network striping
capabilities of HPSS.
In most cases, it will be significantly faster to use htar
to create a tar file in HPSS than to either create a local
tar file and then copy it to HPSS, or to use tar piped into
ftp (or hsi) to create the tar file directly in HPSS.
In addition, htar creates a separate index file (see next
section), which contains the names and locations of all of
the member files in the archive (tar) file. Individual
files and directories in the archive can be randomly
retrieved without having to read through the archive file.
Because the index file is usually smaller than the archive
file, it is possible that the index file may reside in HPSS
disk cache even though the archive file has been moved
offline to tape; since htar uses the index file for listing
operations, it may be possible to list the contents of the
archive file without having to incur the time delays of
reading the archive file back onto disk cache from tape.
It is also possible to create an index file for a tar file
that was not originally created by htar.
HTAR Index File
As part of the process of creating an archive file on HPSS,
htar also creates an index file, which is a directory of the
files contained in the archive. The Index File includes the
position of member files within the archive, so that files
and/or directories can be randomly retrieved from the
archive without having to read through it sequentially. The
index file is usually significantly smaller in size than the
archive file, and may often reside in HPSS disk cache even
though the archive file resides on tape. All htar operations
make use of an index file.
It is also possible to create an index file for an archive
file that was not created by htar, by using the "Build
Index" [-X] function (see below).
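The idea behind an index can be illustrated with a minimal sketch that uses Python's standard tarfile module on an in-memory archive. This is not htar's actual index format, which this page does not describe; the function names and the dictionary layout are hypothetical. It only shows how recording each member's position allows a member to be read with a single seek instead of a sequential scan.

```python
import io
import tarfile


def make_demo_archive():
    """Build a small ustar archive in memory (stand-in for any tar file)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w",
                      format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in (("a.txt", b"alpha"), ("dir/b.txt", b"bravo")):
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def build_index(archive_bytes):
    """Scan the whole archive once and map name -> (data offset, size)."""
    index = {}
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:") as tar:
        for member in tar:
            if member.isfile():
                # A later copy of the same name replaces the earlier entry.
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(archive_bytes, index, name):
    """Fetch one member by seeking straight to its recorded position."""
    offset, size = index[name]
    stream = io.BytesIO(archive_bytes)
    stream.seek(offset)
    return stream.read(size)
```

Building the index requires one full pass over the archive, which is why the Build Index operation has to read the entire tar file; every later lookup only needs the small index.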
By default, the index filename is created by adding ".idx"
as a suffix to the Archive name specified by the -f
parameter. A different suffix or index filename may be
specified by the "-I" option, as described below.
By default, the Index File is assumed to reside in the same
directory as the Archive File. This can be changed by
specifying a relative or absolute pathname via the -I
option. The Index file's relative pathname is relative to
the Archive File directory unless an absolute pathname is
specified.
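The naming rules above, together with the leading-period rule given under the -I option, can be sketched as a small function. This is an illustration of the documented rules only, not htar's source code; the function name resolve_index_path is hypothetical.

```python
import posixpath

DEFAULT_SUFFIX = ".idx"


def resolve_index_path(archive, index_name=None):
    """Return the HPSS path where the index file would be expected."""
    if index_name is None:
        # No -I option: append the default suffix to the archive name.
        return archive + DEFAULT_SUFFIX
    if index_name.startswith("."):
        # Leading period: treat the value as a suffix.
        return archive + index_name
    if index_name.startswith("/"):
        # Absolute pathname: used as given.
        return index_name
    # Relative pathname: relative to the Archive file's directory.
    return posixpath.join(posixpath.dirname(archive), index_name)
```

For an archive named "projects/prj/files.tar", passing "files.old.idx" yields "projects/prj/files.old.idx", while passing "projects/prj/files.old.idx" yields a path under "projects/prj/projects/prj", which is the pitfall described under the -I option.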
HTAR Consistency File
HTAR writes an extra file as the last member file of each
Archive, with a name similar to:
/tmp/HTAR_CF_CHK_64474_982644481
This file is used to verify the consistency of the Archive
File and the Index File. Unless the file is explicitly
specified, HTAR does not extract this file from the Archive
when the -x action is selected. The file is listed,
however, when the -t action is selected.
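A script that post-processes a list of member names taken from a listing may want to skip the consistency file. The sketch below assumes that the file's base name always begins with "HTAR_CF_CHK_", as in the example name above; the text only says the name is "similar to" that example, so the prefix check is an assumption, and both function names are hypothetical.

```python
import posixpath

# Assumed from the example name /tmp/HTAR_CF_CHK_64474_982644481.
CONSISTENCY_PREFIX = "HTAR_CF_CHK_"


def is_consistency_member(name):
    """True if the member name looks like htar's consistency file."""
    return posixpath.basename(name).startswith(CONSISTENCY_PREFIX)


def user_members(names):
    """Drop consistency-file entries from a list of member names."""
    return [name for name in names if not is_consistency_member(name)]
```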
Tar File Restrictions
When specifying path names longer than 100 characters in
the POSIX 1003.1 USTAR format, remember that the path name
is composed of a prefix buffer, a / (slash), and a name
buffer.
The prefix buffer can be a maximum of 155 bytes and the name
buffer can hold a maximum of 100 bytes. Since some
implementations of TAR require the prefix and name buffers
to terminate with a null ('\0') character, htar enforces the
restriction that the effective prefix buffer length is 154
characters (+ trailing zero byte), and the name buffer
length is 99 bytes (+ trailing zero byte). If the path name
cannot be split into these two parts by a slash, it cannot
be archived. This limitation is due to the structure of the
tar archive headers, and must be maintained for compliance
with standards and backwards compatibility. In addition, the
length of a destination for a hard or symbolic link (the
'link name') cannot exceed 100 bytes (99 characters +
zero-byte terminator).
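The limits above can be checked before an archive is created. The following is a minimal sketch, not htar's own algorithm: the function names are hypothetical, lengths are counted in UTF-8 bytes (an assumption), and the first slash that satisfies both limits is chosen, whereas htar may pick a different split point.

```python
NAME_MAX = 99     # name buffer: 100 bytes minus the trailing zero byte
PREFIX_MAX = 154  # prefix buffer: 155 bytes minus the trailing zero byte
LINK_MAX = 99     # link name: 100 bytes minus the trailing zero byte


def split_ustar_path(path):
    """Return (prefix, name) for a path, or raise ValueError if it cannot fit."""
    raw = path.encode("utf-8")
    if len(raw) <= NAME_MAX:
        return "", path  # fits in the name buffer alone
    for position, byte in enumerate(raw):
        if byte != ord("/"):
            continue
        prefix, name = raw[:position], raw[position + 1:]
        if 0 < len(prefix) <= PREFIX_MAX and 0 < len(name) <= NAME_MAX:
            return prefix.decode("utf-8"), name.decode("utf-8")
    raise ValueError("path cannot be split into prefix and name: " + path)


def link_name_fits(target):
    """True if a hard or symbolic link destination fits in the header."""
    return len(target.encode("utf-8")) <= LINK_MAX
```

A 120-character file name with no slash in it fails, because neither buffer alone can hold it and there is nowhere to split.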
HPSS Default Directories
The default directory for the Archive file is the HPSS home
directory for the DCE user. An absolute or relative HPSS
path can optionally be specified for either the Archive file
or the Index file. By default, the Index file is created in
the same HPSS directory as the Archive file.
Use of Absolute Pathnames
Although htar does not restrict the use of absolute
pathnames (pathnames that begin with a leading "/") when the
archive is created, it will remove the leading / when files
are extracted from the archive. All extracted files use
pathnames that are relative to the current working
directory.
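As a small illustration of this rule, the sketch below computes where a member would land on extraction. The function name is hypothetical, and stripping more than one leading slash is an assumption; the text above only mentions "the leading /".

```python
import posixpath


def extraction_target(member_name, working_directory):
    """Path a member is extracted to, relative to the working directory."""
    relative = member_name.lstrip("/")  # assumption: all leading slashes go
    return posixpath.join(working_directory, relative)
```

A member stored as "/home/user/data.txt" and extracted from "/scratch/work" would therefore be written to "/scratch/work/home/user/data.txt", not back to its original location.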
HTAR USAGE
Two groups of flags exist for the htar command: "action"
flags and "optional" flags. Action flags specify the
operation to be performed by the htar command, and are
specified by one of the following:
-c, -t, -x, -X
One action flag must be selected in order for the htar
command to perform any useful function.
File specification (Filespec)
A file specification has one of the following forms:
WildcardPath
or
Pathname
or
Filename
WildcardPath is a path specification that includes standard
filename pattern-matching characters, as specified for the
shell that is being used to invoke htar. The pattern-
matching characters are expanded by the shell and passed to
htar as command line arguments.
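Because pattern matching is done by the shell and not by htar, a program that starts htar without going through a shell has to expand patterns itself. The sketch below shows one way to do that with Python's standard glob module; the function name is hypothetical, and passing an unmatched pattern through unchanged is an assumption modeled on common shell behavior.

```python
import glob
import os


def expand_filespecs(patterns):
    """Expand ~ and wildcard patterns roughly the way a shell would."""
    expanded = []
    for pattern in patterns:
        matches = sorted(glob.glob(os.path.expanduser(pattern)))
        # Assumption: like many shells, keep a pattern that matches nothing.
        expanded.extend(matches if matches else [pattern])
    return expanded
```

The resulting list can be appended to the htar argument list in place of the original patterns.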
Action Flags
Action flags defined for htar are as follows:
-c Creates a new HPSS-resident archive, and writes the
local files specified by one or more File parameters
into the archive. Warning: any pre-existing archive file
will be overwritten without prompting. This behavior
mimics that of the AIX tar utility.
-t Lists the files in the order in which they appear in
the HPSS-resident archive. Listable output is
written to standard output; all other output is written
to standard error.
-x Extracts the files specified by one or more File
parameters from the HPSS-resident archive. If the File
parameter refers to a directory, the htar command
recursively extracts that directory and all of its
subdirectories from the archive.
If the File parameter is not specified, htar extracts
all of the files from the archive. If an archive
contains multiple copies of the same file, the last
copy extracted overwrites all previously extracted
copies. If the file being extracted does not already
exist on the system, it is created. If you have the
proper permissions, then the htar command restores all
files and directories with the same owner and group IDs
as they have on the HPSS tar file. If you do not have
the proper permissions, then files and directories are
restored with your owner and group IDs.
-X Builds a new index file by reading the entire tar file.
This operation is used either to reconstruct an index
for tar files whose Index File is unavailable (e.g.,
accidentally deleted), or for tar files that were not
originally created by htar.
Options
-? Displays htar's verbose help.
-B Displays block numbers as part of the listing (-t
option). This is normally used only for debugging.
-d debuglevel
Sets the debug level for htar. 0 disables debug output;
1 through 5 enable progressively more debug output.
5 is the highest level; anything > 5 is silently mapped
to 5. The default debug level is 0.
-E If present, specifies that a local file should be used
for the file specified by the "-f Archive" option. If
not specified, then the archive file will reside in
HPSS.
-f Archive
Uses Archive as the name of archive to be read or
written. Note: This is a required parameter for htar,
unlike the standard tar utility, which uses a built-in
default name.
If the Archive variable specified is - (minus sign),
the htar command writes to standard output or reads from
standard input. If you write to standard output, the -I
option is mandatory, in order to specify an Index File,
which is copied to HPSS if the Archive file is
successfully written to standard output. [Note: this
behavior is deferred - reading from or writing to pipes
is not supported in the initial version of htar].
-h Forces the htar command to follow symbolic links as if
they were normal files or directories. Normally, the
htar command does not follow symbolic links.
-I index_name
Specifies the index file name or suffix. If the first
character of the index_name is a period, then
index_name is appended to the Archive name, e.g. "-f
the_htar -I .xndx" would create an index file called
"the_htar.xndx". If the first character is not a
period, then index_name is treated as a relative
pathname for the index file (relative to the Archive
file directory) if the pathname does not start with
"/", or an absolute pathname otherwise.
The default directory for the Index file is the same as
for the Archive file. If a relative Index file
pathname is specified, then it is appended to the
directory path for the Archive file. For example, if
the Archive file resides in HPSS at
"projects/prj/files.tar", then an Index file
specification of "-I projects/prj/files.old.idx" would
fail, because htar would look for the file in the
directory "projects/prj/projects/prj". The correct
specification in this case is "-I files.old.idx".
-L InputList
Writes the files and directories listed in the
"InputList" file to the archive. Directories named in
the InputList file are not treated recursively. For
directory names contained in the InputList file, the
htar command writes only the directory entry to the
archive, not the files and subdirectories rooted in the
directory. Note that "home directory" notation ("~")
is not expanded for pathnames contained in the
InputList file, nor are wildcard characters, such as
"*" and "?".
-m Uses the time of extraction as the modification time.
The default is to preserve the modification time of the
files. Note that the modification time of directories
is not guaranteed to be preserved, since the operating
system may change the timestamp as the directory
contents are changed by extracting other files and/or
directories. htar will explicitly set the timestamp on
directories that it extracts from the Archive, but not
on intermediate directories that are created during the
process of extracting files.
-o Provides backwards compatibility with older versions
(non-AIX) of the tar command. When this flag is used
for reading, it causes the extracted file to take on
the User and Group ID (UID and GID) of the user running
the program, rather than those on the archive. This is
the default behavior for the ordinary user. If htar is
being run as root, use of this option causes files to
be owned by root rather than the original user.
-p Restores files to their original modes, ignoring the
present umask. The setuid, setgid, and sticky bit
permissions are also restored if the user has root
authority.
-S bufsize
Specifies the buffer size to use when reading or
writing the HPSS tar file. The buffer size can be
specified as a value, or as kilobytes by appending any
of "k","K","kb", or "KB" to the value. It can also be
specified as megabytes by appending any of "m" or "M"
or "mb" or "MB" to the value, for example, 23mb.
-T max_threads
Specifies the maximum number of threads to use when
copying local member files to the Archive file. The
default is defined when htar is built; the release
value is 20. The maximum number of threads actually
used is dependent upon the local file sizes, and the
size of the I/O buffers. A good approximation is
usually
buffer size/average file size
If the -v or -V option is specified, then the maximum
number of local file threads used while writing the
Archive file to HPSS is displayed when the transfer is
complete.
-V "Slightly verbose" mode. If selected, file transfer
progress will be displayed in interactive mode. This
option should normally not be selected if verbose (-v)
mode is enabled, as the outputs for the two different
options are generated by separate threads, and may be
intermixed on the output.
-v "Verbose" mode. For each file processed, displays a
one-character operation flag, and lists the name of
each file. The flag values displayed are:
"a" - file was added to the archive
"x" - file was extracted from the archive
"i" - index file entry was created (Build Index
operation)
-w Displays the action to be taken, followed by the file
name, and then waits for user confirmation. If the
response is affirmative, the action is performed. If
the response is not affirmative, the file is ignored.
-Y auto | [Archive CosID][:IndexCosID]
Specifies the HPSS Class of Service ID to use when
creating a new Archive and/or Index file. If the
keyword auto is specified, then the HPSS hints
mechanism is used to select the archive COS, based upon
the file size. If -Y cosID is specified, then cosID
is the numeric COS ID to be used for the Archive File.
If -Y :IndexCosID is specified, then IndexCosID is the
numeric COS ID to be used for the Index File. If both
COS IDs are specified, the entire parameter must be
specified as a single string with no embedded spaces,
e.g. "-Y 40:30".
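The -S suffix rules and the -T rule of thumb can be sketched as follows. This is an illustration only, with hypothetical function names. Several details are assumptions not stated above: a value without a suffix is taken to be bytes, a kilobyte is taken as 1024 bytes and a megabyte as 1024 * 1024 bytes, and the estimate is clamped to at least 1 and at most the -T limit.

```python
import re

# Assumption: binary multiples; the text above does not define the units.
_MULTIPLIERS = {"": 1, "k": 1024, "kb": 1024,
                "m": 1024 * 1024, "mb": 1024 * 1024}
_BUFSIZE = re.compile(r"(\d+)(k|K|kb|KB|m|M|mb|MB)?")


def parse_bufsize(text):
    """Convert a -S style value such as '23mb' into a byte count."""
    match = _BUFSIZE.fullmatch(text.strip())
    if match is None:
        raise ValueError("unrecognized buffer size: " + repr(text))
    number, suffix = match.groups()
    return int(number) * _MULTIPLIERS[(suffix or "").lower()]


def estimate_threads(buffer_bytes, file_sizes, max_threads=20):
    """Rule of thumb from -T: buffer size / average file size."""
    if not file_sizes:
        raise ValueError("at least one file size is required")
    average = max(1.0, sum(file_sizes) / len(file_sizes))
    estimate = int(buffer_bytes / average)
    # Assumption: never fewer than one thread, never more than the -T limit.
    return max(1, min(max_threads, estimate))
```

Under these assumptions, many small files reach the thread limit quickly, while files larger than the buffer are handled by a single thread.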
HTAR Memory Restrictions
When writing to an HPSS archive, the htar command uses a
temporary file (normally in /tmp) and maintains in memory a
table of files; you receive an error message if htar cannot
create the temporary file, or if there is not enough memory
available to hold the internal tables.
HTAR Environment
HTAR should be compiled and run within a non-DCE HPSS environment.
Miscellaneous Notes:
1. The maximum size of a single Member file within the
Archive is approximately 8 GB, due to restrictions in the
format of the tar header. HTAR does not impose any
restriction on the total size of the Archive File when it is
written to HPSS; however, space quotas or other system
restrictions may limit the size of the Archive File when it
is written to a local file (-E option).
2. HTAR will optionally write to a local file; however, it
will not write to any file type except "regular files". In
particular, it is not suitable for writing to magnetic tape.
To write to a magnetic tape device, use the "tar" or "cpio"
utility.
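The 8 GB figure in note 1 follows from the classic tar header layout, in which the size field is 12 bytes holding 11 octal digits plus a terminator. A short sketch of that arithmetic is given below; the function names are hypothetical, and the check only mirrors the header limit, not any additional checks htar may perform.

```python
SIZE_FIELD_BYTES = 12                 # width of the size field in the header
OCTAL_DIGITS = SIZE_FIELD_BYTES - 1   # one byte is used by the terminator
MAX_MEMBER_BYTES = 8 ** OCTAL_DIGITS - 1


def fits_in_tar_header(size_bytes):
    """True if a member of this size can be described by the header."""
    return 0 <= size_bytes <= MAX_MEMBER_BYTES


def oversized_members(sizes_by_name):
    """Names of files too large to be stored as a single member."""
    return sorted(name for name, size in sizes_by_name.items()
                  if not fits_in_tar_header(size))
```

The largest representable size is one byte short of 8 * 1024 * 1024 * 1024 bytes, which is why the limit is stated as approximately 8 GB.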
Exit Status
This command returns the following exit values:
0 Successful completion.
>0 An error occurred.
Examples
1. To write the file1 and file2 files to a new archive
called "files.tar" in the current HPSS home directory,
enter:
htar -cf files.tar file1 file2
2. To extract all files from the project1/src directory in
the Archive file called proj1.tar, and use the time of
extraction as the modification time, enter:
htar -xm -f proj1.tar project1/src
3. To display the names of the files in the out.tar
archive file within the HPSS home directory, enter:
htar -tvf out.tar
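When htar is driven from a script, the same command lines can be assembled as argument lists. The helper below is a hypothetical sketch that uses only flags documented on this page and follows the order shown in the SYNOPSIS; it builds the list and does not run htar.

```python
ACTIONS = ("c", "t", "x", "X")


def build_htar_command(action, archive, members=(), verbose=False,
                       extraction_mtime=False, index_name=None):
    """Assemble an htar argument list suitable for subprocess.run()."""
    if action not in ACTIONS:
        raise ValueError("action must be one of c, t, x, X")
    command = ["htar", "-" + action, "-f", archive]
    if verbose:
        command.append("-v")
    if extraction_mtime:
        command.append("-m")
    if index_name is not None:
        command.extend(["-I", index_name])
    command.extend(members)
    return command
```

For instance, build_htar_command("c", "files.tar", ["file1", "file2"]) produces the separated-flag equivalent of example 1.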
Related Information
For file archivers: the cat command, dd command, pax
command. For HPSS file transfer programs: pftp, nft, hsi.
File Systems Overview for System Management in AIX Version 4
System Management Guide: Operating System and Devices
explains file system types, management, structure, and
maintenance.
Directory Overview in AIX Version 4 Files Reference explains
working with directories and path names.
Files Overview in AIX Version 4 System User's Guide:
Operating System and Devices provides information on working
with files.
HPSS web site at http://www.sdsc.edu/hpss
Bugs and Limitations:
- There is no way to specify relative Index file pathnames
that are not rooted in the Archive file directory without
specifying an absolute path.
- The initial implementation of HTAR does not provide the
ability to append, update or remove files. These features,
and others, are planned enhancements for future versions.
