(Other Tars): New node describing how to extract
GNU-specific member formats using third-party tars.
This commit is contained in:
375
doc/tar.texi
375
doc/tar.texi
@@ -109,8 +109,8 @@ Appendices
|
||||
|
||||
* Changes::
|
||||
* Configuring Help Summary::
|
||||
* Genfile::
|
||||
* Tar Internals::
|
||||
* Genfile::
|
||||
* Free Software Needs Free Documentation::
|
||||
* Copying This Manual::
|
||||
* Index of Command Line Options::
|
||||
@@ -330,11 +330,18 @@ Making @command{tar} Archives More Portable
|
||||
* posix:: @acronym{POSIX} archives
|
||||
* Checksumming:: Checksumming Problems
|
||||
* Large or Negative Values:: Large files, negative time stamps, etc.
|
||||
* Other Tars:: How to Extract GNU-Specific Data Using
|
||||
Other @command{tar} Implementations
|
||||
|
||||
@GNUTAR{} and @acronym{POSIX} @command{tar}
|
||||
|
||||
* PAX keywords:: Controlling Extended Header Keywords.
|
||||
|
||||
How to Extract GNU-Specific Data Using Other @command{tar} Implementations
|
||||
|
||||
* Split Recovery:: Members Split Between Volumes
|
||||
* Sparse Recovery:: Sparse Members
|
||||
|
||||
Using Less Space through Compression
|
||||
|
||||
* gzip:: Creating and Reading Compressed Archives
|
||||
@@ -369,12 +376,6 @@ Using Multiple Tapes
|
||||
* Tarcat:: Concatenate Volumes into a Single Archive
|
||||
|
||||
|
||||
Genfile
|
||||
|
||||
* Generate Mode:: File Generation Mode.
|
||||
* Status Mode:: File Status Mode.
|
||||
* Exec Mode:: Synchronous Execution mode.
|
||||
|
||||
Tar Internals
|
||||
|
||||
* Standard:: Basic Tar Format
|
||||
@@ -389,6 +390,12 @@ Storing Sparse Files
|
||||
* PAX 0:: PAX Format, Versions 0.0 and 0.1
|
||||
* PAX 1:: PAX Format, Version 1.0
|
||||
|
||||
Genfile
|
||||
|
||||
* Generate Mode:: File Generation Mode.
|
||||
* Status Mode:: File Status Mode.
|
||||
* Exec Mode:: Synchronous Execution mode.
|
||||
|
||||
Copying This Manual
|
||||
|
||||
* GNU Free Documentation License:: License for copying this manual
|
||||
@@ -7753,6 +7760,8 @@ archives and archive labels) in GNU and PAX formats.}
|
||||
* posix:: @acronym{POSIX} archives
|
||||
* Checksumming:: Checksumming Problems
|
||||
* Large or Negative Values:: Large files, negative time stamps, etc.
|
||||
* Other Tars:: How to Extract GNU-Specific Data Using
|
||||
Other @command{tar} Implementations
|
||||
@end menu
|
||||
|
||||
@node Portable Names
|
||||
@@ -8070,6 +8079,350 @@ be extracted by any tar implementation that understands older
|
||||
@FIXME{Describe how @acronym{POSIX} archives are extracted by non
|
||||
POSIX-aware tars.}
|
||||
|
||||
@node Other Tars
|
||||
@subsection How to Extract GNU-Specific Data Using Other @command{tar} Implementations
|
||||
|
||||
In previous sections you became acquainted with various quircks
|
||||
necessary to make your archives portable. Sometimes you may need to
|
||||
extract archives containing GNU-specific members using some
|
||||
third-party @command{tar} implementation or an older version of
|
||||
@GNUTAR{}. Of course your best bet is to have @GNUTAR{} installed,
|
||||
but if it is for some reason impossible, this section will explain
|
||||
how to cope without it.
|
||||
|
||||
When we speak about @dfn{GNU-specific} members we mean two classes of
|
||||
them: members split between the volumes of a multi-volume archive and
|
||||
sparse members. You will be able to always recover such members if
|
||||
the archive is in PAX format. In addition split members can be
|
||||
recovered from archives in old GNU format. The following subsections
|
||||
describe the required procedures in detail.
|
||||
|
||||
@menu
|
||||
* Split Recovery:: Members Split Between Volumes
|
||||
* Sparse Recovery:: Sparse Members
|
||||
@end menu
|
||||
|
||||
@node Split Recovery
|
||||
@subsubsection Extracting Members Split Between Volumes
|
||||
|
||||
If a member is split between several volumes of an old GNU format archive
|
||||
most third party @command{tar} implementation will fail to extract
|
||||
it. To extract it, use @command{tarcat} program (@pxref{Tarcat}).
|
||||
This program is available from
|
||||
@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/tarcat, @GNUTAR{}
|
||||
home page}. It concatenates several archive volumes into a single
|
||||
valid archive. For example, if you have three volumes named from
|
||||
@file{vol-1.tar} to @file{vol-2.tar}, you can do the following to
|
||||
extract them using a third-party @command{tar}:
|
||||
|
||||
@smallexample
|
||||
$ @kbd{tarcat vol-1.tar vol-2.tar vol-3.tar | tar xf -}
|
||||
@end smallexample
|
||||
|
||||
You could use this approach for many (although not all) PAX
|
||||
format archives as well. However, extracting split members from a PAX
|
||||
archive is a much easier task, because PAX volumes are constructed in
|
||||
such a way that each part of a split member is extracted as a
|
||||
different file by @command{tar} implementations that are not aware of
|
||||
GNU extensions. More specifically, the very first part retains its
|
||||
original name, and all subsequent parts are named using the pattern:
|
||||
|
||||
@smallexample
|
||||
%d/GNUFileParts.%p/%f.%n
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
where symbols preceeded by @samp{%} are @dfn{macro characters} that
|
||||
have the following meaning:
|
||||
|
||||
@multitable @columnfractions .25 .55
|
||||
@headitem Meta-character @tab Replaced By
|
||||
@item %d @tab The directory name of the file, equivalent to the
|
||||
result of the @command{dirname} utility on its full name.
|
||||
@item %f @tab The file name of the file, equivalent to the result
|
||||
of the @command{basename} utility on its full name.
|
||||
@item %p @tab The process ID of the @command{tar} process that
|
||||
created the archive.
|
||||
@item %n @tab Ordinal number of this particular part.
|
||||
@end multitable
|
||||
|
||||
For example, if, a file @file{var/longfile} was split during archive
|
||||
creation between three volumes, and the creator @command{tar} process
|
||||
had process ID @samp{27962}, then the member names will be:
|
||||
|
||||
@smallexample
|
||||
var/longfile
|
||||
var/GNUFileParts.27962/longfile.1
|
||||
var/GNUFileParts.27962/longfile.2
|
||||
@end smallexample
|
||||
|
||||
When you extract your archive using a third-party @command{tar}, these
|
||||
files will be created on your disk, and the only thing you will need
|
||||
to do to restore your file in its original form is concatenate them in
|
||||
the proper order, for example:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{cd var}
|
||||
$ @kbd{cat GNUFileParts.27962/longfile.1 \
|
||||
GNUFileParts.27962/longfile.2 >> longfile}
|
||||
$ rm -f GNUFileParts.27962
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
Notice, that if the @command{tar} implementation you use supports PAX
|
||||
format archives, it will probably emit warnings about unknown keywords
|
||||
during extraction. They will lool like this:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
Tar file too small
|
||||
Unknown extended header keyword 'GNU.volume.filename' ignored.
|
||||
Unknown extended header keyword 'GNU.volume.size' ignored.
|
||||
Unknown extended header keyword 'GNU.volume.offset' ignored.
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
You can safely ignore these warnings.
|
||||
|
||||
If your @command{tar} implementation is not PAX-aware, you will get
|
||||
more warnigns and more files generated on your disk, e.g.:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{tar xf vol-1.tar}
|
||||
var/PaxHeaders.27962/longfile: Unknown file type 'x', extracted as
|
||||
normal file
|
||||
Unexpected EOF in archive
|
||||
$ @kbd{tar xf vol-2.tar}
|
||||
tmp/GlobalHead.27962.1: Unknown file type 'g', extracted as normal file
|
||||
GNUFileParts.27962/PaxHeaders.27962/sparsefile.1: Unknown file type
|
||||
'x', extracted as normal file
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
Ignore these warnings. The @file{PaxHeaders.*} directories created
|
||||
will contain files with @dfn{extended header keywords} describing the
|
||||
extracted files. You can delete them, unless they describe sparse
|
||||
members. Read further to learn more about them.
|
||||
|
||||
@node Sparse Recovery
|
||||
@subsubsection Extracting Sparse Members
|
||||
|
||||
Any @command{tar} implementation will be able to extract sparse members from a
|
||||
PAX archive. However, the extracted files will be @dfn{condensed},
|
||||
i.e. any zero blocks will be removed from them. When we restore such
|
||||
a condensed file to its original form, by adding zero bloks (or
|
||||
@dfn{holes}) back to their original locations, we call this process
|
||||
@dfn{expanding} a compressed sparse file.
|
||||
|
||||
To expand a file, you will need a simple auxiliary program called
|
||||
@command{xsparse}. It is available in source form from
|
||||
@uref{http://www.gnu.org/@/software/@/tar/@/utils/@/xsparse, @GNUTAR{}
|
||||
home page}.
|
||||
|
||||
Let's begin with archive members in @dfn{sparse format
|
||||
version 1.0}@footnote{@xref{PAX 1}.}, which are the easiest to expand.
|
||||
The condensed file will contain both file map and file data, so no
|
||||
additional data will be needed to restore it. If the original file
|
||||
name was @file{@var{dir}/@var{name}}, then the condensed file will be
|
||||
named @file{@var{dir}/@/GNUSparseFile.@var{n}/@/@var{name}}, where
|
||||
@var{n} is a decimal number@footnote{technically speaking, @var{n} is a
|
||||
@dfn{process ID} of the @command{tar} process which created the
|
||||
archive (@pxref{PAX keywords}).}.
|
||||
|
||||
To expand a version 1.0 file, run @command{xsparse} as follows:
|
||||
|
||||
@smallexample
|
||||
$ @kbd{xsparse @file{cond-file}}
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
where @file{cond-file} is the name of the condensed file. The utility
|
||||
will deduce the name for the resulting expanded file using the
|
||||
following algorithm:
|
||||
|
||||
@enumerate 1
|
||||
@item If @file{cond-file} does not contain any directories,
|
||||
@file{../cond-file} will be used;
|
||||
|
||||
@item If @file{cond-file} has the form
|
||||
@file{@var{dir}/@var{t}/@var{name}}, where both @var{t} and @var{name}
|
||||
are simple names, with no @samp{/} characters in them, the output file
|
||||
name will be @file{@var{dir}/@var{name}}.
|
||||
|
||||
@item Otherwise, if @file{cond-file} has the form
|
||||
@file{@var{dir}/@var{name}}, the output file name will be
|
||||
@file{@var{name}}.
|
||||
@end enumerate
|
||||
|
||||
In the unlikely case when this algorithm does not suite your needs,
|
||||
you can explicitely specify output file name as a second argument to
|
||||
the command:
|
||||
|
||||
@smallexample
|
||||
$ @kbd{xsparse @file{cond-file}}
|
||||
@end smallexample
|
||||
|
||||
It is often a good idea to run @command{xsparse} in @dfn{dry run} mode
|
||||
first. In this mode, the command does not actually expand the file,
|
||||
but verbosely lists all actions it would be taking to do so. The dry
|
||||
run mode is enabled by @option{-n} command line argument:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{xsparse -n /home/gray/GNUSparseFile.6058/sparsefile}
|
||||
Reading v.1.0 sparse map
|
||||
Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
||||
`/home/gray/sparsefile'
|
||||
Finished dry run
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
To actually expand the file, you would run:
|
||||
|
||||
@smallexample
|
||||
$ @kbd{xsparse /home/gray/GNUSparseFile.6058/sparsefile}
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
The program behaves the same way all UNIX utilities do: it will keep
|
||||
quiet unless it has simething important to tell you (e.g. an error
|
||||
condition or something). If you wish it to produce verbose output,
|
||||
similar to that from the dry run mode, give it @option{-v} option:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{xsparse -v /home/gray/GNUSparseFile.6058/sparsefile}
|
||||
Reading v.1.0 sparse map
|
||||
Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
||||
`/home/gray/sparsefile'
|
||||
Done
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
Additionally, if your @command{tar} implementation has extracted the
|
||||
@dfn{extended headers} for this file, you can instruct @command{xstar}
|
||||
to use them in order to verify the integrity of the expanded file.
|
||||
The option @option{-x} sets the name of the extended header file to
|
||||
use. Continuing our example:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{xsparse -v -x /home/gray/PaxHeaders.6058/sparsefile \
|
||||
/home/gray/GNUSparseFile.6058/sparsefile}
|
||||
Reading extended header file
|
||||
Found variable GNU.sparse.major = 1
|
||||
Found variable GNU.sparse.minor = 0
|
||||
Found variable GNU.sparse.name = sparsefile
|
||||
Found variable GNU.sparse.realsize = 217481216
|
||||
Reading v.1.0 sparse map
|
||||
Expanding file `/home/gray/GNUSparseFile.6058/sparsefile' to
|
||||
`/home/gray/sparsefile'
|
||||
Done
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
An @dfn{extended header} is a special @command{tar} archive header
|
||||
that precedes an archive member and contains a set of
|
||||
@dfn{variables}, describing the member properties that cannot be
|
||||
stored in the standard @code{ustar} header. While optional for
|
||||
expanding sparse version 1.0 members, use of extended headers is
|
||||
mandatory when expanding sparse members in older sparse formats: v.0.0
|
||||
and v.0.1 (The sparse formats are described in detail in @pxref{Sparse
|
||||
Formats}). So, for this format, the question is: how to obtain
|
||||
extended headers from the archive?
|
||||
|
||||
If you use a @command{tar} implementation that does not support PAX
|
||||
format, extended headers for each member will be extracted as a
|
||||
separate file. If we represent the member name as
|
||||
@file{@var{dir}/@var{name}}, then the extended header file will be
|
||||
named @file{@var{dir}/@/PaxHeaders.@var{n}/@/@var{name}}, where
|
||||
@var{n} is an integer number.
|
||||
|
||||
Things become more difficult if your @command{tar} implementation
|
||||
does support PAX headers, because in this case you will have to
|
||||
manually extract the headers. We recommend the following algorithm:
|
||||
|
||||
@enumerate 1
|
||||
@item
|
||||
Consult the documentation for your @command{tar} implementation for an
|
||||
option that will print @dfn{block numbers} along with the archive
|
||||
listing (analogous to @GNUTAR{}'s @option{-R} option). For example,
|
||||
@command{star} has @option{-block-number}.
|
||||
|
||||
@item
|
||||
Obtain the verbose listing using the @samp{block number} option, and
|
||||
find the position of the sparse member in question and the member
|
||||
immediately following it. For example, running @command{star} on our
|
||||
archive we obtain:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{star -t -v -block-number -f arc.tar}
|
||||
@dots{}
|
||||
star: Unknown extended header keyword 'GNU.sparse.size' ignored.
|
||||
star: Unknown extended header keyword 'GNU.sparse.numblocks' ignored.
|
||||
star: Unknown extended header keyword 'GNU.sparse.name' ignored.
|
||||
star: Unknown extended header keyword 'GNU.sparse.map' ignored.
|
||||
block 56: 425984 -rw-r--r-- gray/users Jun 25 14:46 2006 GNUSparseFile.28124/sparsefile
|
||||
block 897: 65391 -rw-r--r-- gray/users Jun 24 20:06 2006 README
|
||||
@dots{}
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
(as usual, ignore the warnings about unknown keywords.)
|
||||
|
||||
@item
|
||||
Let the size of the sparse member be @var{size}, its block number be
|
||||
@var{Bs} and the block number of the next member be @var{Bn}.
|
||||
Compute:
|
||||
|
||||
@smallexample
|
||||
@var{N} = @var{Bs} - @var{Bn} - @var{size}/512 - 2
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
This number gives the size of the extended header part in tar @dfn{blocks}.
|
||||
In our example, this formula gives: @code{897 - 56 - 425984 / 512 - 2
|
||||
= 7}.
|
||||
|
||||
@item
|
||||
Use @command{dd} to extract the headers:
|
||||
|
||||
@smallexample
|
||||
@kbd{dd if=@var{archive} of=@var{hname} bs=512 skip=@var{Bs} count=@var{N}}
|
||||
@end smallexample
|
||||
|
||||
@noindent
|
||||
where @var{archive} is the archive name, @var{hname} is a name of the
|
||||
file to store the extended header in, @var{Bs} and @var{N} are
|
||||
computed in previous steps.
|
||||
|
||||
In our example, this command will be
|
||||
|
||||
@smallexample
|
||||
$ @kbd{dd if=arc.tar of=xhdr bs=512 skip=56 count=7}
|
||||
@end smallexample
|
||||
@end enumerate
|
||||
|
||||
Finally, you can expand the condensed file, using the obtained header:
|
||||
|
||||
@smallexample
|
||||
@group
|
||||
$ @kbd{xsparse -v -x xhdr GNUSparseFile.6058/sparsefile}
|
||||
Reading extended header file
|
||||
Found variable GNU.sparse.size = 217481216
|
||||
Found variable GNU.sparse.numblocks = 208
|
||||
Found variable GNU.sparse.name = sparsefile
|
||||
Found variable GNU.sparse.map = 0,2048,1050624,2048,@dots{}
|
||||
Expanding file `GNUSparseFile.28124/sparsefile' to `sparsefile'
|
||||
Done
|
||||
@end group
|
||||
@end smallexample
|
||||
|
||||
@node Compression
|
||||
@section Using Less Space through Compression
|
||||
|
||||
@@ -10432,14 +10785,14 @@ output. Default is 12.
|
||||
Right margin of the text output. Used for wrapping.
|
||||
@end deftypevr
|
||||
|
||||
@node Genfile
|
||||
@appendix Genfile
|
||||
@include genfile.texi
|
||||
|
||||
@node Tar Internals
|
||||
@appendix Tar Internals
|
||||
@include intern.texi
|
||||
|
||||
@node Genfile
|
||||
@appendix Genfile
|
||||
@include genfile.texi
|
||||
|
||||
@node Free Software Needs Free Documentation
|
||||
@appendix Free Software Needs Free Documentation
|
||||
@include freemanuals.texi
|
||||
|
||||
Reference in New Issue
Block a user