Problem reported by Boris Gjenero in:
https://lists.gnu.org/r/bug-tar/2022-11/msg00001.html
* src/update.c (append_file): Don’t assume that FILE_NAME is a
regular file whose size can be determined before reading.
Instead, simply read from the file until its end is reached.
This bug was introduced by the recent lseek-related changes.
* src/delete.c (delete_archive_members):
* src/update.c (update_archive):
Copy the member if acting as a filter, rather than lseeking over
it, which is possible if stdin is a regular file.
* src/list.c (skim_file, skim_member):
* src/sparse.c (sparse_skim_file):
New functions, for copying when a filter.
* src/list.c (skip_file): Remove; replaced with skim_file.
All callers changed.
(skip_member): Reimplement in terms of skim_member.
* src/sparse.c (sparse_skip_file):
Remove; replaced with sparse_skim_file. All callers changed.
* src/update.c (acting_as_filter): New static var.
(update_archive): Set it; this is like delete.c.
* tests/delete01.at (deleting a member after a big one):
* tests/delete02.at (deleting a member from stdin archive):
Also test filter case.
Updating global headers in update mode is not possible, because:
1) If the original archive was not in PAX format, writing the
global header would overwrite first member header (and eventually
data blocks) in the archive.
2) Otherwise, using the --pax-option can make the updated header
occupy more blocks than the original one, which would lead to the
same effect as in 1.
This also fixes
http://lists.gnu.org/archive/html/bug-tar/2018-12/msg00007.html
* src/xheader.c (xheader_forbid_global): New function.
* src/common.h (xheader_forbid_global): New prototype.
* src/update.c (update_archive): Use xheader_forbid_global, instead
of trying to write global extended header record.
* src/common.h (FALLTHROUGH): New macro, for use with gcc
-Wimplicit-fallthrough=5, which is now the default when used with
Gnulib after commit 2017-05-16T16:23:52!eggert@cs.ucla.edu
and with --enable-gcc-warnings
This lint was found by GCC 4.6.2 on Fedora 15 x86-64.
* src/buffer.c (buffer_write_global_xheader, mv_end, set_start_time)
(compute_duration, print_total_stats, flush_read, flush_write):
* src/checkpoint.c (checkpoint_finish_compile):
* src/list.c (test_archive_label):
* src/misc.c (chdir_count):
* src/names.c (const):
* src/unlink.c (finish_deferred_unlinks):
Define with (void) instead of with (), for slightly-better C type
checking and to avoid a GCC warning.
* src/compare.c (diff_dumpdir):
* src/tar.c (parse_owner_group): Remove unused local.
* src/misc.c (chdir_do):
* src/tar.c (add_exclude_array): Rename local to avoid shadowing.
(LOW_DENSITY_NUM, MID_DENSITY_NUM, HIGH_DENSITY_NUM):
Define only if needed.
* src/update.c (update_archive): Initialize a local; this fixes
what appears to be a real bug.
This closes another race condition, that occurs when overwriting a
symlink with a regular file.
* NEWS (--dereference consistency): New section.
* doc/tar.texi (Option Summary): Describe new --deference behavior.
(dereference): Likewise. Remove discussion that I didn't follow,
even before --dereference was changed.
* src/common.h (deref_stat, set_file_atime): Adjust signatures.
* src/compare.c (diff_file, diff_multivol): Respect open_read_flags
instead of rolling our own flags. This implements the new behavior
for --dereference.
(diff_file, diff_dumpdir): Likewise, for fstatat_flags.
* src/create.c: Adjust to set_file_atime signature change.
* src/extract.c (mark_after_links, file_newer_p, extract_dir):
Likewise.
* src/incremen.c (try_purge_directory): Likewise.
* src/misc.c (maybe_backup_file): Likewise.
* src/extract.c (file_newer_p): New arg STP. All callers changed.
(maybe_recoverable): New arg REGULAR. All callers changed.
Handle the case of overwriting a symlink with a regular file,
when --overwrite is specified but --dereference is not.
(open_output_file): Add O_CLOEXEC, O_NOCTTY, O_NONBLOCK for
consistency with file creation. Add O_NOFOLLOW if
overwriting_old_files && ! dereference_option.
* src/incremen.c (update_parent_directory): Use fstat, not fstatat;
there's less to go wrong.
* src/misc.c (deref_stat): Remove DEREF arg. All callers changed.
Instead, use fstatat_flags.
(set_file_atime): Remove ATFLAG arg. All callers changed.
Instead, use fstatat_flags.
* src/names.c, src/update.c: Adjust to deref_stat signature change.
* src/tar.c (get_date_or_file): Use stat, not deref_stat, as this
is not a file to be archived.
* tests/Makefile.am (TESTSUITE_AT): Add extrac13.at.
* tests/extrac13.at: New file.
* tests/testsuite.at: Include it.
This change replaces traditional functions like 'open' with the
POSIX.1-2008 functions like 'openat'. Mostly this is an internal
refactoring change, in preparation for further changes to close
some races.
* gnulib.modules: Add faccessat, linkat, mkfifoat, renameat, symlinkat.
Remove save-cwd.
* src/Makefile.am (tar_LDADD): Add $(LIB_EACCESS).
* tests/Makefile.am (LDADD): Likewise.
* src/common.h (chdir_fd): New extern var.
* src/compare.c (diff_file, diff_multivol): Use openat instead of open.
* src/create.c (create_archive, restore_parent_fd): Likewise.
* src/extract.c (create_placeholder_file): Likewise.
* src/names.c (collect_and_sort_names): Likewise.
* src/update.c (append_file): Likewise.
* src/compare.c (diff_symlink): Use readlinkat instead of readlink.
* src/compare.c (diff_file): Use chdir_fd instead of AT_FDCWD.
* src/create.c (subfile_open, dump_file0): Likewise.
* src/extract.c (fd_chmod, fd_chown, fd_stat, set_stat):
(repair_delayed_set_stat, apply_nonancestor_delayed_set_stat):
Likewise.
* src/extract.c (mark_after_links, file_newer_p, extract_dir):
(extract_link, apply_delayed_links):
Use fstatat rather than stat or lstat.
* src/misc.c (maybe_backup_file, deref_stat): Likewise.
* src/extract.c (make_directories): Use mkdirat rather than mkdir.
Use faccessat rather than access. This fixes a minor permissions
bug when tar is running setuid (who would want to do that?!).
(open_output_file): Use openat rather than open.
In the process, this removes support for Masscomp's O_CTG files,
which aren't compatible with openat's signature. Masscomp! Wow!
That's a blast from the past. As far as I know, that operating
system hasn't been supported for more than 20 years.
(extract_link, apply_delayed_links):
Use linkat rather than link.
(extract_symlink, apply_delayed_links):
Use symlinkat rather than symlink.
(extract_node): Use mknodat rather than mknod.
(extract_fifo): Use mkfifoat rather than mkfifo.
(apply_delayed_links): Use unlinkat rather than unlink or rmdir.
* src/misc.c (safer_rmdir, remove_any_file): Likewise.
* src/unlink.c (flush_deferred_unlinks): Likewise.
* src/extract.c (rename_directory): Use renameat rather than rename.
* src/misc.c (maybe_backup_file, undo_last_backup): Likewise.
* src/misc.c: Don't include <save-cwd.h>; no longer needed now
that we're using openat etc.
(struct wd): Add member fd. Remove members err and fd. All uses
changed.
(CHDIR_CACHE_SIZE): New constant.
(wdcache, wdcache_count, chdir_fd): New vars.
(chdir_do): Use openat rather than save_cwd. Keep the cache up
to date. This code won't scale well, but is good enough for now.
* src/update.c (update_archive): Use openat + fdopendir +
streamsavedir rather than savedir.
This file is a placeholder. It will be replaced with the actual ChangeLog
by make dist. Run make ChangeLog if you wish to create it earlier.
* NEWS: Document this.
* gnulib.modules: Add openat, readlinkat.
* src/common.h (open_read_flags, fstatat_flags): New global variables.
(cachedir_file_p, dump_file, check_exclusion_tags, scan_directory):
Adjust to new signatures, described below.
(name_fill_directory): Remove.
* src/compare.c (diff_file, diff_multivol): Use open_read_flags.
* src/create.c (struct exclusion_tag): Exclusion predicates now take
a file descriptor, not a file name.
(add_exclusion_tag): Likewise. All uses changed.
(cachedir_file_p): Likewise.
(check_exclusion_tags): The directory is now a file descriptor,
not a file name. All uses changed. Use openat for better traversal.
(file_dumpable_p): Arg is now a struct stat, not a struct
tar_stat_info. All uses changed. Check the arg's file types too.
(dump_dir0, dump_dir, dump_file0, dump_file): Omit top_level and
parent_device args, since st->parent tells us that now. All uses
changed.
(dump_dir): Likewise. Also, omit fd arg for similar reasons.
Apply fdsavedir to a dup of the file descriptor, since we need a
file descriptor for openat etc. as well, and fdsavedir (perhaps
unwisely) consumes its file descriptor when successful.
Do not consume st->fd when successful; this simplifies the caller.
(create_archive): Allocate a file descriptor when retraversing
a directory, during incremental dumps.
(dump_file0): Use fstatat, openat, and readlinkat for better traversal.
When opening a file, use the result of fstat on the file descriptor
rather than the fstatat on the directory entry, to avoid some race
conditions. No need to reopen the directory since we now no longer
close it. Change "did we open the file?" test from 0 <= fd to
0 < fd since fd == 0 now represents uninitialized.
(dump_file): Now accepts struct tar_stat_info describing parent,
not parent_device. Also, accept basename and fullname of entry.
All uses changed.
* src/incremen.c (update_parent_directory): Accept struct
tar_stat_info for parent, not name. All callers changed.
Use fstatat for safer directory traversal.
(procdir): Accept struct tar_stat_info, not struct stat and
dev_t, for info about directory. All callers changed.
(scan_directory): Accept struct tar_stat_info, not name,
device, and cmdline, for info about directory. All callers
changed. Do not consume the file descriptor, since caller
might need it. Use fstatat and openat for safer directory
traversal; also, use fstat after opening to double-check.
(name_fill_directory): Remove.
* src/names.c (add_hierarchy_to_namelist): Accept struct
tar_stat_info instead of device and cmdline. All callers changed.
When descending into a subdirectory, use openat and fstat for
safer directory traversal.
(collect_and_sort_names): Use open and fstat for safer directory
traversal. Set up struct tar_stat_info for callee's new API.
* src/tar.c (decode_options): Initialize open_read_flags
and fstatat_flags.
(tar_stat_destroy): Close st->fd if it is positive (not zero!).
* src/tar.h (struct tar_stat_info): New members parent, fd.
* src/update.c (update_archive): Adjust to dump_file's API change.
* tests/filerem02.at: Ignore stderr since its contents now depend
on the file system implementation.
* src/common.h (read_header_mode): New enum.
(read_header): Change type of the 3rd argument.
* src/list.c (read_header): Change type of the 3rd argument.
All callers updated.
* src/buffer.c (try_new_volume): Allow for volumes split at the
extended/ustar header boundary. This is against POSIX specs, but
we must be able to read such archives anyway.
* tests/multiv07.at: New test case.
* tests/Makefile.am: Add multiv07.at
* tests/testsuite.at: Likewise.
* src/compare.c: Update calls to read_header.
* src/delete.c: Likewise.
* src/update.c: Likewise.
* src/buffer.c (match_volume_label): Call set_volume_label.
(check_label_pattern): Get label string
as argument.
(match_volume_label): Handle volume labels stored in
global PAX headers.
* src/common.c (print_header,read_header): Change signature.
(read_header_primitive): Remove prototype.
* src/list.c (recent_global_header): New static.
(list_archive): Always print volume labels.
(read_header_primitive): Remove.
(read_header): Change the signature (all callers updated)
Save the recent global header.
(volume_label_printed): New static.
(simple_print_header): New function (ex-print_header).
(print_header): Change the signature (all callers updated).
For POSIX formats, print first volume header (if set).
* src/xheader.c (xheader_write_global): Write the data
accumulated in xhdr->stk even if keyword_global_override_list
is empty.
(xheader_read): On unexpected EOF, report error instead of
coredumping.
(XHDR_PROTECTED, XHDR_GLOBAL): New defines.
(struct xhdr_tab): Remove `protected' with `flags'. All uses updated.
(decg): If XHDR_GLOBAL bit is set, call the keyword's decode
method instead of adding it to `kwl'.
* src/compare.c: Update calls to read_header.
* src/create.c: Likewise.
* src/delete.c: Likewise.
* src/update.c: Likewise.
* src/extract.c: Likewise.
(extract_volhdr): Do not print "Reading <label>" statement, because
it is inconsistent: it is not printed if the volume begins with a
member continued from the previous volume.
* tests/label01.at: New testcase.
* tests/label02.at: New testcase.
* tests/Makefile.am, tests/testsuite.at: Add new testcases.
* src/common.h (namebuf_t): New typedef.
(namebuf_create, namebuf_free)
(namebuf_name): New prototypes.
(remname): New prototype.
* src/misc.c (struct namebuf): New structure.
(namebuf_create, namebuf_free)
(namebuf_name): New functions.
* src/create.c (dup_dir0): Remove is_avoided_name
checks. This is taken care of in update_archive.
* src/incremen.c (scan_directory): Use namebuf
to produce full file names.
* src/names.c (nametail): Remove extra level of
indirection. All uses updated.
(avoided_name_table, add_avoided_name)
(is_avoided_name): Remove.
* src/update.c (update_archive): Change algorithm.
Instead of adding unmodified files to the avoided_name
table, create namelist so that it contains only
modified files.
* tests/Makefile.am: Add update01.at, update02.at
* tests/testsuite.at: Likewise.
* tests/update.at (AT_KEYWORDS): Add update00.
Changes to src/create.c and src/incremen.c are partially
based on patch from Alexander Peslyak <solar at openwall.com>.
The new testcases require paxutils commit f653a2b or later.
* src/common.h (struct name): New member `cmdline'.
(dump_file): Change type of the 2nd argument to bool.
(file_removed_diag, dir_removed_diag): New prototypes.
(addname): New argument `cmdline'.
(name_from_list): Change return value.
* src/create.c (dump_dir0, dump_dir): top_level is bool.
(create_archive): Update calls to name_from_list.
Take advantage of the name->cmdline to set top_level argument
during incremental backups.
(dump_file0): top_level is bool.
Do not bail out if a no-top-level file disappears during incremental
backup, use file_removed_diag instead.
(dump_filed): top_level is bool.
* src/incremen.c (update_parent_directory): Silently ignore
ENOENT. It should have already been reported elsewhere.
(scan_directory): Use dir_removed_diag to report missing directories.
* src/misc.c (file_removed_diag, dir_removed_diag): New functions.
* src/names.c (name_gather): Set ->cmdname.
(addname): Likewise. All uses updated.
(name_from_list): Return struct name const *. All uses updated.
* tests/filerem01.at: New testcase.
* tests/filerem02.at: New testcase.
* tests/Makefile.am, tests/testsuite.at: Add filerem01.at, filerem02.at
* tests/grow.at, test/truncate.at: Use new syntax for genfile --run.
* NEWS: Update.
* doc/tar.texi: Minor fix.
(tar_timespec_cmp): New function. Wrapper over
timespec_cmp using the timespec precision provided by the
current archive format.
* src/common.h (tar_timespec_cmp): New declaration.
* src/compare.c (diff_file): Use tar_timespec_cmp.
* src/extract.c (file_newer_p): Likewise.
* src/update.c (update_archive): Likewise.
* tests/truncate.at: Reverted changes
* tests/update.at: Reverted changes
by struct stat; keep them to full nanosecond resolution.
This affects behavior only on older hosts or file systems
that have lower-resolution time stamps.
* src/common.h (OLDER_STAT_TIME): Parenthesize arg.
(OLDER_TAR_STAT_TIME): New macro.
(code_timespec): New function.
(BILLION, LOG10_BILLION, TIMESPEC_STRSIZE_BOUND): New constants.
* src/compare.c (diff_file): Use full time stamp resolution.
* src/create.c (start_header, dump_file0): Likewise.
(start_header, dump_file0): Adjust to new structure layout.
(dump_regular_finish): Simplify by using timespec_cmp.
* src/extract.c (struct delayed_set_stat): Don't store stat info
that we don't need, to save space. All uses changed.
(struct delayed_set_stat, struct delayed_link, file_newer_p):
(create_placeholder_file, extract_link, apply_delayed_links):
Use full time stamp resolution.
(check_time): Use code_timespec rather than rolling our own code.
(set_stat, delay_set_stat): Arg now points to tar_stat_info to
avoid losing time information. All callers changed.
* src/list.c (read_and, decode_header, print_heaeder):
Use full time stamp resolution.
* src/misc.c (code_timespec): New function.
* src/tar.h (struct tar_stat_info): Record atime, mtime, ctime
separately, for benefit of hosts with lower resolution.
* src/update.c (update_archive): Use full time stamp resolution.
* src/xheader.c (code_time): Use new code_timespec function
to simplify code.
(atime_coder, atime_decoder, ctime_coder, ctime_decoder):
(mtime_coder, mtime_decoder): Use full time stamp resolution.
Report time stamps to full resolution in environment.
Report memory allocation failures rather than ignoring them.
* src/system.c (time_to_env): New function.
(oct_to_env, str_to_env, chr_to_env): Report memory allocation failures.
(stat_to_env): Report full resolution in time stamps.