tar: improve documentation of reliability and security issues
* doc/tar.texi (Reliability and security, Reliability): (Permissions problems, Data corruption and repair, Race conditions): (Security, Privacy, Integrity, Live untrusted data): (Security rules of thumb): New nodes.
276
doc/tar.texi
@@ -107,6 +107,7 @@ document. The rest of the menu lists all the lower level nodes.
|
||||
* Date input formats::
|
||||
* Formats::
|
||||
* Media::
|
||||
* Reliability and security::
|
||||
|
||||
Appendices
|
||||
|
||||
@@ -8556,6 +8557,9 @@ For example:
|
||||
$ @kbd{tar -c -f archive.tar -C / home}
|
||||
@end smallexample
|
||||
|
||||
@xref{Integrity}, for some of the security-related implications
|
||||
of using this option.
|
||||
|
||||
@include getdate.texi
|
||||
|
||||
@node Formats
|
||||
@@ -9337,6 +9341,9 @@ and use @option{--dereference} (@option{-h}): many systems do not support
|
||||
symbolic links, and moreover, your distribution might be unusable if
|
||||
it contains unresolved symbolic links.
|
||||
|
||||
The @option{--dereference} option is not secure if an untrusted user
|
||||
can modify files during creation or extraction. @xref{Security}.
|
||||
|
||||
@node hard links
|
||||
@subsection Hard Links
|
||||
@cindex File names, using hard links
|
||||
@@ -11721,6 +11728,275 @@ disabled) switch, a notch which can be popped out or covered, a ring
|
||||
which can be removed from the center of a tape reel, or some other
|
||||
changeable feature.
|
||||
|
||||
@node Reliability and security
|
||||
@chapter Reliability and Security
|
||||
|
||||
The @command{tar} command reads and writes files as any other
|
||||
application does, and is subject to the usual caveats about
|
||||
reliability and security. This chapter contains some commonsense
|
||||
advice on the topic.
|
||||
|
||||
@menu
|
||||
* Reliability::
|
||||
* Security::
|
||||
@end menu
|
||||
|
||||
@node Reliability
|
||||
@section Reliability
|
||||
|
||||
Ideally, when @command{tar} is creating an archive, it reads from a
|
||||
file system that is not being modified, and encounters no errors or
|
||||
inconsistencies while reading and writing. If this is the case, the
|
||||
archive should faithfully reflect what was read. Similarly, when
|
||||
extracting from an archive, @command{tar} ideally encounters
|
||||
no errors and the extracted files faithfully reflect what was in the
|
||||
archive.
|
||||
|
||||
However, when reading or writing real-world file systems, several
|
||||
things can go wrong; these include permissions problems, corruption of
|
||||
data, and race conditions.
|
||||
|
||||
@menu
|
||||
* Permissions problems::
|
||||
* Data corruption and repair::
|
||||
* Race conditions::
|
||||
@end menu
|
||||
|
||||
@node Permissions problems
|
||||
@subsection Permissions Problems
|
||||
|
||||
If @command{tar} encounters errors while reading or writing files, it
|
||||
normally reports an error and exits with nonzero status. The work it
|
||||
does may therefore be incomplete. For example, when creating an
|
||||
archive, if @command{tar} cannot read a file then it cannot copy the
|
||||
file into the archive.
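A minimal sketch of checking for an incomplete archive follows. The file names are hypothetical, and a file that does not exist stands in for one that @command{tar} cannot read; the script relies only on the exit status being nonzero, not on any particular value.

```shell
# Work in a scratch directory so the sketch is self-contained.
work=$(mktemp -d)
cd "$work"
echo 'data' > readable.txt

# missing.txt does not exist, so tar cannot read it and reports an error.
rc=0
tar -cf archive.tar readable.txt missing.txt 2>errors.log || rc=$?

if [ "$rc" -ne 0 ]; then
  echo "tar exited with status $rc; the archive may be incomplete"
fi

# Only the file that could be read ends up in the archive.
listing=$(tar -tf archive.tar)
echo "$listing"
```

A backup script would typically stop, or at least log the diagnostics, when the status is nonzero instead of treating the archive as complete.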
|
||||
|
||||
@node Data corruption and repair
|
||||
@subsection Data Corruption and Repair
|
||||
|
||||
If an archive becomes corrupted by an I/O error, this may corrupt the
|
||||
data in an extracted file. Worse, it may corrupt the file's metadata,
|
||||
which may cause later parts of the archive to become misinterpreted.
|
||||
A tar-format archive contains a checksum that most likely will detect
|
||||
errors in the metadata, but it will not detect errors in the data.
|
||||
|
||||
If data corruption is a concern, you can compute and check your own
|
||||
checksums of an archive by using other programs, such as
|
||||
@command{cksum}.
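As a rough illustration, the checksum can be recorded when the archive is created and compared again before extraction. The paths below are hypothetical, and in practice the recorded value would be stored separately from the archive.

```shell
# Build a small archive in a scratch directory.
work=$(mktemp -d)
mkdir "$work/src"
echo 'hello' > "$work/src/file.txt"
tar -cf "$work/archive.tar" -C "$work" src

# Record the checksum and size at creation time ...
recorded=$(cksum < "$work/archive.tar")

# ... and recompute them later, before trusting the archive.
current=$(cksum < "$work/archive.tar")
if [ "$recorded" = "$current" ]; then
  status=unchanged
else
  status=corrupted
fi
echo "archive is $status"
```

Note that @command{cksum} guards against accidental corruption only; it is not designed to resist deliberate tampering.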
|
||||
|
||||
When attempting to recover from a read error or data corruption in an
|
||||
archive, you may need to skip past the questionable data and read the
|
||||
rest of the archive. This requires some expertise in the archive
|
||||
format and in other software tools.
|
||||
|
||||
@node Race conditions
|
||||
@subsection Race Conditions
|
||||
|
||||
If some other process is modifying the file system while @command{tar}
|
||||
is reading or writing files, the result may well be inconsistent due
|
||||
to race conditions. For example, if another process creates some
|
||||
files in a directory while @command{tar} is creating an archive
|
||||
containing the directory's files, @command{tar} may see some of the
|
||||
files but not others, or it may see a file that is in the process of
|
||||
being created. The resulting archive may not be a snapshot of the
|
||||
file system at any point in time. If an application such as a
|
||||
database system depends on an accurate snapshot, restoring from the
|
||||
@command{tar} archive of a live file system may therefore break that
|
||||
consistency and may break the application. The simplest way to avoid
|
||||
the consistency issues is to avoid making other changes to the file
|
||||
system while @command{tar} is reading it or writing it.
|
||||
|
||||
When creating an archive, several options are available to avoid race
|
||||
conditions. Some hosts have a way of snapshotting a file system, or
|
||||
of temporarily suspending all changes to a file system, by (say)
|
||||
suspending the only virtual machine that can modify a file system; if
|
||||
you use these facilities and have @command{tar -c} read from a
|
||||
snapshot when creating an archive, you can avoid inconsistency
|
||||
problems. More drastically, before starting @command{tar} you could
|
||||
suspend or shut down all processes other than @command{tar} that have
|
||||
access to the file system, or you could unmount the file system and
|
||||
then mount it read-only.
|
||||
|
||||
When extracting from an archive, one approach to avoid race conditions
|
||||
is to create a directory that no other process can write to, and
|
||||
extract into that.
|
||||
|
||||
@node Security
|
||||
@section Security
|
||||
|
||||
In some cases @command{tar} may be used in an adversarial situation,
|
||||
where an untrusted user is attempting to gain information about or
|
||||
modify otherwise-inaccessible files. Dealing with untrusted data
|
||||
(that is, data generated by an untrusted user) typically requires
|
||||
extra care, because even the smallest mistake in the use of
|
||||
@command{tar} is more likely to be exploited by an adversary than by a
|
||||
race condition.
|
||||
|
||||
@menu
|
||||
* Privacy::
|
||||
* Integrity::
|
||||
* Live untrusted data::
|
||||
* Security rules of thumb::
|
||||
@end menu
|
||||
|
||||
@node Privacy
|
||||
@subsection Privacy
|
||||
|
||||
Standard privacy concerns apply when using @command{tar}. For
|
||||
example, suppose you are archiving your home directory into a file
|
||||
@file{/archive/myhome.tar}. Any secret information in your home
|
||||
directory, such as your SSH secret keys, is copied faithfully into
|
||||
the archive. Therefore, if your home directory contains any file that
|
||||
should not be read by some other user, the archive itself should
|
||||
not be readable by that user. And even if the archive's data are
|
||||
inaccessible to untrusted users, its metadata (such as size or
|
||||
last-modified date) may reveal some information about your home
|
||||
directory; if the metadata are intended to be private, the archive's
|
||||
parent directory should also be inaccessible to untrusted users.
|
||||
|
||||
One precaution is to create @file{/archive} so that it is not
|
||||
accessible to any user, unless that user also has permission to access
|
||||
all the files in your home directory.
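One way to set this up is sketched below. A scratch directory stands in for the file system root, so @file{$work/archive} plays the role of @file{/archive}; all names are hypothetical.

```shell
# Scratch area standing in for "/" in this sketch.
work=$(mktemp -d)
mkdir "$work/home"
echo 'secret' > "$work/home/key"

# Create the archive directory with access for the owner only.
mkdir -m 700 "$work/archive"

# A restrictive umask keeps the archive file itself private as well.
(umask 077 && tar -cf "$work/archive/myhome.tar" -C "$work" home)

dir_mode=$(ls -ld "$work/archive" | cut -c1-10)
file_mode=$(ls -l "$work/archive/myhome.tar" | cut -c1-10)
echo "$dir_mode $file_mode"
```

Restricting the directory hides the archive's metadata from other users, while the umask protects the archive's contents.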
|
||||
|
||||
Similarly, when extracting from an archive, take care that the
|
||||
permissions of the extracted files are not more generous than what you
|
||||
want. Even if the archive itself is readable only to you, files
|
||||
extracted from it have their own permissions that may differ.
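The following sketch shows one way to tighten permissions at extraction time, assuming a @command{tar} that supports @option{--no-same-permissions}, as GNU @command{tar} does. The file names are hypothetical.

```shell
# Create an archive holding a file that is readable by everyone.
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"
echo 'draft' > "$work/src/report.txt"
chmod 644 "$work/src/report.txt"
tar -cf "$work/a.tar" -C "$work/src" report.txt

# Extract under a restrictive umask.  --no-same-permissions asks tar to
# apply the umask even when it is run by the superuser.
(umask 077 && tar -x --no-same-permissions -f "$work/a.tar" -C "$work/dest")

mode=$(ls -l "$work/dest/report.txt" | cut -c1-10)
echo "$mode"
```

The extracted file ends up with the archived mode masked by the umask, so group and other access is removed here.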
|
||||
|
||||
@node Integrity
|
||||
@subsection Integrity
|
||||
|
||||
When creating archives, take care that they are not writable by an
|
||||
untrusted user; otherwise, that user could modify the archive, and
|
||||
when you later extract from the archive you will get incorrect data.
|
||||
|
||||
When @command{tar} extracts from an archive, by default it writes into
|
||||
files relative to the working directory. If the archive was generated
|
||||
by an untrusted user, that user therefore can write into any file
|
||||
under the working directory. If the working directory contains a
|
||||
symbolic link to another directory, the untrusted user can also write
|
||||
into any file under the referenced directory. When extracting from an
|
||||
untrusted archive, it is therefore good practice to create an empty
|
||||
directory and run @command{tar} in that directory.
|
||||
|
||||
When extracting from two or more untrusted archives, each one should
|
||||
be extracted independently, into different empty directories.
|
||||
Otherwise, the first archive could create a symbolic link into an area
|
||||
outside the working directory, and the second one could follow the
|
||||
link and overwrite data that is not under the working directory. For
|
||||
example, when restoring from a series of incremental dumps, the
|
||||
archives should have been created by a trusted process, as otherwise
|
||||
the incremental restores might alter data outside the working
|
||||
directory.
|
||||
|
||||
If you use the @option{--absolute-names} (@option{-P}) option when
|
||||
extracting, @command{tar} respects any file names in the archive, even
|
||||
file names that begin with @file{/} or contain @file{..}. As this
|
||||
lets the archive overwrite any file in your system that you can write,
|
||||
the @option{--absolute-names} (@option{-P}) option should be used only
|
||||
for trusted archives.
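Before extracting, it can help to list an archive and look for member names of this kind. The filter below is only a sketch: the function name is hypothetical, it assumes member names contain no newlines, and it does not detect symbolic links, so it complements rather than replaces extraction into an empty directory.

```shell
# Read member names on standard input and print those that are
# absolute or that contain a ".." component.
unsafe_names() {
  grep -E '^/|(^|/)\.\.(/|$)'
}

# Demonstration on sample names; with a real archive the input would
# come from something like:  tar -tf untrusted.tar | unsafe_names
flagged=$(printf '%s\n' 'a/b' '/etc/passwd' '../x' 'a/../b' 'a..b/c' | unsafe_names)
printf '%s\n' "$flagged"
```

Names such as @file{a..b/c} are not flagged, because @file{..} is dangerous only as a complete file name component.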
|
||||
|
||||
Conversely, with the @option{--keep-old-files} (@option{-k}) option,
|
||||
@command{tar} refuses to replace existing files when extracting; and
|
||||
with the @option{--no-overwrite-dir} option, @command{tar} refuses to
|
||||
replace the permissions or ownership of already-existing directories.
|
||||
These options may help when extracting from untrusted archives.
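The effect of @option{--keep-old-files} can be seen in a small experiment such as the following, in which all file names are hypothetical. The exit status is captured but not interpreted, since how @command{tar} reports the refusal may vary between versions.

```shell
work=$(mktemp -d)
cd "$work"
mkdir src dest

# The archive holds one version of notes.txt ...
echo 'from archive' > src/notes.txt
tar -cf a.tar -C src notes.txt

# ... while the destination already holds another.
echo 'local copy' > dest/notes.txt

# With -k, tar refuses to replace the existing file.
rc=0
tar -xkf a.tar -C dest 2>/dev/null || rc=$?

kept=$(cat dest/notes.txt)
echo "$kept"
```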
|
||||
|
||||
@node Live untrusted data
|
||||
@subsection Dealing with Live Untrusted Data
|
||||
|
||||
Extra care is required when creating from or extracting into a file
|
||||
system that is accessible to untrusted users. For example, superusers
|
||||
who invoke @command{tar} must be wary about its actions being hijacked
|
||||
by an adversary who is reading or writing the file system at the same
|
||||
time that @command{tar} is operating.
|
||||
|
||||
When creating an archive from a live file system, @command{tar} is
|
||||
vulnerable to denial-of-service attacks. For example, an adversarial
|
||||
user could create the illusion of an indefinitely-deep directory
|
||||
hierarchy @file{d/e/f/g/...} by creating directories one step ahead of
|
||||
@command{tar}, or the illusion of an indefinitely-long file by
|
||||
creating a sparse file but arranging for blocks to be allocated just
|
||||
before @command{tar} reads them. There is no easy way for
|
||||
@command{tar} to distinguish these scenarios from legitimate uses, so
|
||||
you may need to monitor @command{tar}, just as you'd need to monitor
|
||||
any other system service, to detect such attacks.
|
||||
|
||||
While a superuser is extracting from an archive into a live file
|
||||
system, an untrusted user might replace a directory with a symbolic
|
||||
link, in hopes that @command{tar} will follow the symbolic link and
|
||||
extract data into files that the untrusted user does not have access
|
||||
to. Even if the archive was generated by the superuser, it may
|
||||
contain a file such as @file{d/etc/passwd} that the untrusted user
|
||||
earlier created in order to break in; if the untrusted user replaces
|
||||
the directory @file{d/etc} with a symbolic link to @file{/etc} while
|
||||
@command{tar} is running, @command{tar} will overwrite
|
||||
@file{/etc/passwd}. This attack can be prevented by extracting into a
|
||||
directory that is inaccessible to untrusted users.
|
||||
|
||||
Similar attacks via symbolic links are also possible when creating an
|
||||
archive, if the untrusted user can modify an ancestor of a top-level
|
||||
argument of @command{tar}. For example, an untrusted user that can
|
||||
modify @file{/home/eve} can hijack a running instance of @samp{tar -cf
|
||||
- /home/eve/Documents/yesterday} by replacing
|
||||
@file{/home/eve/Documents} with a symbolic link to some other
|
||||
location. Attacks like these can be prevented by making sure that
|
||||
untrusted users cannot modify any files that are top-level arguments
|
||||
to @command{tar}, or any ancestor directories of these files.
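A rough way to audit this before running @command{tar} is to walk from each top-level argument up to the root and report directories that are writable by group or others. The helper name below is hypothetical, and the check is deliberately simple: it looks only at permission bits, so it ignores ownership and access control lists, and it also reports sticky directories such as @file{/tmp}.

```shell
# Print "$1" and each of its ancestors that is writable by group or others.
# -H makes find examine the target when a path component is a symbolic link.
writable_ancestors() {
  p=$1
  while :; do
    find -H "$p" -prune \( -perm -020 -o -perm -002 \) -print
    if [ "$p" = / ] || [ "$p" = . ]; then
      break
    fi
    p=$(dirname "$p")
  done
}

# Demonstration in a scratch tree: "open" is world-writable, "inner" is not.
work=$(mktemp -d)
mkdir -p "$work/open/inner"
chmod 777 "$work/open"
chmod 755 "$work/open/inner"
report=$(writable_ancestors "$work/open/inner")
printf '%s\n' "$report"
```

Any directory reported for a top-level argument deserves a closer look before the archive is created.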
|
||||
|
||||
@node Security rules of thumb
|
||||
@subsection Security Rules of Thumb
|
||||
|
||||
This section briefly summarizes rules of thumb for avoiding security
|
||||
pitfalls.
|
||||
|
||||
@itemize @bullet
|
||||
|
||||
@item
|
||||
Protect archives at least as much as you protect any of the files
|
||||
being archived.
|
||||
|
||||
@item
|
||||
Extract from an untrusted archive only into an otherwise-empty
|
||||
directory. This directory and its parent should be accessible only to
|
||||
trusted users. For example:
|
||||
|
||||
@example
|
||||
@group
|
||||
$ @kbd{chmod go-rwx .}
|
||||
$ @kbd{mkdir -m go-rwx dir}
|
||||
$ @kbd{cd dir}
|
||||
$ @kbd{tar -xvf /archives/got-it-off-the-net.tar.gz}
|
||||
@end group
|
||||
@end example
|
||||
|
||||
As a corollary, do not do an incremental restore from an untrusted archive.
|
||||
|
||||
@item
|
||||
Do not let untrusted users access files extracted from untrusted
|
||||
archives without checking first for problems such as setuid programs.
|
||||
|
||||
@item
|
||||
Do not let untrusted users modify directories that are ancestors of
|
||||
top-level arguments of @command{tar}. For example, while you are
|
||||
executing @samp{tar -cf /archive/u-home.tar /u/home}, do not let an
|
||||
untrusted user modify @file{/}, @file{/archive}, or @file{/u}.
|
||||
|
||||
@item
|
||||
Pay attention to the diagnostics and exit status of @command{tar}.
|
||||
|
||||
@item
|
||||
When archiving live file systems, monitor running instances of
|
||||
@command{tar} to detect denial-of-service attacks.
|
||||
|
||||
@item
|
||||
Avoid unusual options such as @option{--absolute-names} (@option{-P}),
|
||||
@option{--dereference} (@option{-h}), @option{--overwrite},
|
||||
@option{--recursive-unlink}, and @option{--remove-files} unless you
|
||||
understand their security implications.
|
||||
|
||||
@end itemize
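As an illustration of the rule about checking extracted files, the sketch below searches an extraction directory for set-user-ID and set-group-ID files. The directory and file names are hypothetical, and a setuid file is created by hand here to stand in for one that came from an untrusted archive.

```shell
work=$(mktemp -d)
cd "$work"
mkdir extracted

# Stand-ins for extracted files: one setuid program and one plain file.
echo '#!/bin/sh' > extracted/tool
chmod 4755 extracted/tool
echo 'plain' > extracted/readme

# Report regular files with the set-user-ID or set-group-ID bit set.
flagged=$(cd extracted && find . -type f \( -perm -4000 -o -perm -2000 \) -print)
printf '%s\n' "$flagged"
```

Setuid programs are only one kind of problem; a fuller check would also consider device files, symbolic links, and overly generous permissions.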
|
||||
|
||||
@node Changes
|
||||
@appendix Changes
|
||||
|
||||
|
||||