* src/buffer.c (flush_write_ptr, flush_bufmap, bufmap_locate):
(struct zip_magic, available_space_after, _flush_write)
(short_read, flush_archive, try_new_volume)
(gnu_add_multi_volume_header, simple_flush_read)
(simple_flush_write, _gnu_flush_read, _gnu_flush_write)
(gnu_flush_write): Prefer idx_t to size_t when either will do, as
signed types are typically safer. For a tiny value in memory,
just use ‘char’.
In common.h, replace macros with constants or functions when that
is easy. This makes code a bit more reliable (functions evaluate
their args exactly once) and easier to debug (many debugging
environments cannot access macros).
* src/common.h (CHKBLANKS): Remove. All uses removed.
(NAME_FIELD_SIZE, PREFIX_FIELD_SIZE, UNAME_FIELD_SIZE)
(GNAME_FIELD_SIZE, TAREXIT_SUCCESS, TAREXIT_DIFFERS)
(TAREXIT_FAILURE, LG_8, LG_256, DEFAULT_CHECKPOINT)
(MAX_OLD_FILES, TF_READ, TF_WRITE, TF_DELETED, XFORM_REGFILE)
(XFORM_LINK, XFORM_SYMLINK, XFORM_ALL, WARN_ALONE_ZERO_BLOCK)
(WARN_BAD_DUMPDIR, WARN_CACHEDIR, WARN_CONTIGUOUS_CAST)
(WARN_FILE_CHANGED, WARN_FILE_IGNORED, WARN_FILE_REMOVED)
(WARN_FILE_SHRANK, WARN_FILE_UNCHANGED, WARN_FILENAME_WITH_NULS)
(WARN_IGNORE_ARCHIVE, WARN_IGNORE_NEWER, WARN_NEW_DIRECTORY)
(WARN_RENAME_DIRECTORY, WARN_SYMLINK_CAST, WARN_TIMESTAMP)
(WARN_UNKNOWN_CAST, WARN_UNKNOWN_KEYWORD, WARN_XDEV)
(WARN_DECOMPRESS_PROGRAM, WARN_EXISTING_FILE, WARN_XATTR_WRITE)
(WARN_RECORD_SIZE, WARN_FAILED_READ, WARN_MISSING_ZERO_BLOCKS)
(WARN_VERBOSE_WARNINGS, WARN_ALL, EXCL_DEFAULT, EXCL_RECURSIVE)
(EXCL_NON_RECURSIVE): Now enum constants rather than macros.
(time_option_initialized, isfound, wasfound, warning_enabled):
Now functions rather than macros TIME_OPTION_INITIALIZED, ISFOUND,
WASFOUND, WARNING_ENABLED. All uses changed.
(OLDER_STAT_TIME, OLDER_TAR_STAT_TIME, EXTRACT_OVER_PIPE)
(TAR_ARGS_INITIALIZER): Remove. All uses replaced with their
definiens or equivalent.
This works better on platforms where SIZE_MAX < OFF_MAX.
* src/common.h (struct common locus):
* src/names.c (struct name_elt):
Use intmax_t for line numbers. All uses changed.
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and it requires strtoul to support "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num)
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long. All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int. All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow. All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1. All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul. Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long. Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number. All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t. All callers changed.
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h. All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function. Prefer it when
printing time_t values.
Also, fix some rounding errors while we’re in the neighborhood.
* src/buffer.c (duration_ns, compute_duration_ns): Rename from
‘duration’ and ‘compute_duration’, and count ns rather than s, to
lessen rounding error. All uses changed.
(compute_duration_ns): Work even if the clock moves backward
and time_t is unsigned.
(print_stats): Don’t worry about null or empty TEXT, as that
cannot happen. Compare double to UINTMAX_MAX + 1.0, not
to UINTMAX_MAX, so that the comparison is exact.
Handle the unlikely case that numbytes >= UINTMAX_MAX.
* src/tar.c (parse_opt): Treat -L hugenumber as effectively
infinity rather than erroring out.
Prefer ckd_add to checking overflow by hand.
* gnulib.modules: Add assert-h, for static_assert.
* src/common.h, src/list.c, src/misc.c:
Prefer static_assert to #if + #error. This doesn’t fix any bugs; it’s
just that in general it’s better to avoid the preprocessor.
* src/common.h (GLOBAL): Remove this macro, and all its uses.
It collides with GCC 14 and -Wmissing-variable-declarations.
Change all uses of GLOBAL to use extern instead,
and declare the variables in their respective .c files.
Move .c file’s extern declarations here, so that they
appear only once and are checked against definitions.
* src/names.c (unconsumed_option_tail): Now static.
* src/suffix.c (find_compression_suffix): Always return stripped
archive name length in the last argument. Return 0 if there is no
suffix.
(find_compression_program): Remove.
(set_compression_program_by_suffix): Take third argument, controlling
whether to issue a warning if no suitable compression program is found
for the suffix.
* src/common.h (set_compression_program_by_suffix): Change prototype.
* src/buffer.c, src/tar.c: All uses of set_compression_program_by_suffix
changed.
Problem reported by Marc Espie in:
https://lists.gnu.org/r/bug-tar/2023-09/msg00003.html
* src/buffer.c (drop_volume_label_suffix):
Redo to not compute a pointer before the start of a buffer,
as this is not portable.
* src/common.h (name): New field: is_wildcard.
(name_scan): Change protoype.
* src/delete.c: Update calls to name_scan.
* src/names.c (addname, add_starting_file): Initialize is_wildcard.
(namelist_match): Take two arguments. If second one is true, return
only exact matches.
(name_scan): Likewise. All callers updated.
(name_from_list): Skip patterns.
* src/update.c (remove_exact_name): New function.
(update_archive): Do not remove matching name, if it is a pattern.
Instead, add a new entry with the matching file name.
* tests/update04.at: New test.
* tests/Makefile.am: Add new test.
* tests/testsuite.at: Include new test.
* NEWS: Update.
This also ports to C23 [[maybe_unused]].
* configure.ac (WARN_CFLAGS): Do not add -Wno-unused-parameter.
Add MAYBE_UNUSED where needed in source code.
Also, put it at the front where C23 requires it.
(In response to savannah bug #63574)
* doc/intern.texi: Document actual tar behaviour in regard to
missing end-of-file marker.
* doc/tar.texi: Rewrite the "warnings" section. Document
--warning=missing-zero-blocks
* src/common.h (WARN_MISSING_ZERO_BLOCKS): New constant.
(WARN_ALL): Include all warning bits.
* src/list.c (read_and): If EOF is reached without seeing end-of-file
blocks and the "missing-zero-blocks" warning is requested, warn about
the fact.
* src/warning.c: New warnings: "missing-zero-blocks", "verbose".
(warning_option): Change definition to reflect changes in common.h
* src/common.h: Include <inttostr.h> since paxutils no longer does.
(STRINGIFY_BIGINT): New macro, copied from older paxutils.
(UINTMAX_STRSIZE_BOUND): New constant, also from older paxutils.
This bug was introduced by the recent lseek-related changes.
* src/delete.c (delete_archive_members):
* src/update.c (update_archive):
Copy the member if acting as a filter, rather than lseeking over
it, which is possible if stdin is a regular file.
* src/list.c (skim_file, skim_member):
* src/sparse.c (sparse_skim_file):
New functions, for copying when a filter.
* src/list.c (skip_file): Remove; replaced with skim_file.
All callers changed.
(skip_member): Reimplement in terms of skim_member.
* src/sparse.c (sparse_skip_file):
Remove; replaced with sparse_skim_file. All callers changed.
* src/update.c (acting_as_filter): New static var.
(update_archive): Set it; this is like delete.c.
* tests/delete01.at (deleting a member after a big one):
* tests/delete02.at (deleting a member from stdin archive):
Also test filter case.
* src/buffer.c, src/delete.c: Do not include system-ioctl.h.
* src/buffer.c (guess_seekable_archive): Remove. This is now done
by get_archive_status, in a different way.
(get_archive_status): New function that gets archive_stat
unless remote, and sets seekable_archive etc.
(_open_archive): Prefer bool for boolean.
(_open_archive, new_volume): Get archive status consistently
by calling get_archive_status in both places.
* src/buffer.c (backspace_output):
* src/compare.c (verify_volume):
* src/delete.c (move_archive):
Let mtioseek worry about mtio.
* src/common.h (archive_stat): New global, replacing ar_dev and
ar_ino. All uses changed.
* src/delete.c (move_archive): Check for integer overflow.
Also report overflow if the archive position would go negative.
* src/system.c: Include system-ioctl.h, for MTIOCTOP etc.
(mtioseek): New function, which also checks for integer overflow.
(sys_save_archive_dev_ino): Remove.
(archive_stat): Now
(sys_get_archive_stat): Also initialize mtioseekable_archive.
(sys_file_is_archive): Don’t return true if the archive is /dev/null
since it’s not a problem in that case.
(sys_detect_dev_null_output): Cache dev_null_stat.
doc: omit MS-DOS mentions in doc
It’s really FAT32 we’re worried about now, not MS-DOS.
And doschk is no longer a GNU program.
* src/common.h (get_directory_entries):
Add _GL_ATTRIBUTE_MALLOC _GL_ATTRIBUTE_DEALLOC_FREE.
Problem found by gcc -Wsuggest-attribute=malloc and
current Gnulib.
* src/common.h (xheader_xattr_free,xheader_xattr_copy): Remove protos.
(xattr_map_init,xattr_map_copy)
(xattr_map_add,xattr_map_free): New protos.
* src/tar.h (xattr_map): New struct.
(tar_stat_info): Replace xattr_map_size and xattr_map with one
field: xattr_map.
* src/xattrs.c (XATTRS_PREFIX,XATTRS_PREFIX_LEN): New defines.
(xheader_xattr_init,xattr_map_init)
(xattr_map_free,xattr_map_add)
(xheader_xattr_add,xattr_map_copy): New functions.
All uses changed.
* src/create.c (start_header): Update to use struct xattr_map.
* src/extract.c: Update to use struct xattr_map.
* src/tar.c: Likewise.
* src/xheader.c (xheader_xattr_init,xheader_xattr_free)
(xheader_xattr_add,xheader_xattr_copy): Remove.
(xattr_coder,xattr_decoder): Use xattr_map_ functions.
With today’s compilers ‘inline’ is typically not needed for
performance (at least the way GNU Tar uses it) and it gets in the
way of portability.
* configure.ac: Omit AC_C_INLINE; no longer needed here.
* lib/attr-xattr.in.h (setxattr, lsetxattr, fsetxattr, getxattr)
(lgetxattr, fgetxattr, listxattr, llistxattr, flistxattr):
* lib/wordsplit.c (skip_delim_internal, skip_delim)
(skip_delim_real, exptab_matches):
* src/delete.c (flush_file):
* src/extract.c (safe_dir_mode):
* src/misc.c (ptr_align):
Now just static, not static inline.
* lib/wordsplit.h (wordsplit_getwords): Remove; no longer used.
* src/common.h (name_more_files): Now COMMON_INLINE, not
extern inline - which is not portable according to C99,
the way we were using it.
This can be helpful in porting to compilers like Oracle Developer
Studio that support some but not all GCC attributes.
* lib/wordsplit.c (FALLTHROUGH): Remove; now done by attribute.h.
* lib/wordsplit.h (__WORDSPLIT_ATTRIBUTE_FORMAT): Remove;
all uses replaced by ATTRIBUTE_FORMAT.
* lib/wordsplit.h, src/buffer.c, src/common.h, src/compare.c:
* src/sparse.c, src/system.c, src/xheader.c:
Prefer ATTRIBUTE_FORMAT, MAYBE_UNUSED, _Noreturn, etc. to
__attribute__.
Fedora 33 uses GCC 10.2.1, which is a bit pickier.
* configure.ac: Do not use -Wsystem-headers, as this
runs afoul of netdb.h on Fedora 33.
* gnulib.modules: Add ‘attribute’.
* lib/wordsplit.c (wsnode_new): Return the newly allocated
pointer instead of a boolean, to pacify GCC 10.2.1 which otherwise
complains about use of possibly-null pointers. All uses changed.
* src/buffer.c (try_new_volume): Don’t assume find_next_block succeeds.
(_write_volume_label): Pacify GCC 10.2.1 with an ‘assume’, since
LABEL must be nonnull here.
* src/common.h (FALLTHROUGH): Remove; now in attribute.h.
Include attribute.h, for ATTRIBUTE_NONNULL.
* src/misc.c (assign_string_or_null): New function,
taking over the old role of assign_string.
(assign_string): Assume VALUE is non-null.
(assign_null): New function, taking over the old
role of assign_string when its VALUE was nonnull.
All callers of assign_string changed to use these functions.
(assign_string_n): Clear *STRING if VALUE is null,
to fix a potential double-free.
Using such options as -f, -z, etc. is senseless in the file list file
and bypasses the option consistency checks in decode_options. Therefore,
only options related to file selection (a.k.a position-sensitive options)
are allowed in files.
* doc/tar.texi: Document changes.
* src/common.h (tar_args): Move from tar.c
(TAR_ARGS_INITIALIZER): New macro.
* src/names.c: Declare option group identifiers as an enum.
(names_parse_opt): Special handling for ARGP_KEY_ERROR.
(names_argp): Remove static qualifier.
(names_argp_children): Remove.
* src/tar.c: Declare option group identifiers as an enum.
(parse_opt): Special handling for ARGP_KEY_INIT.
(argp_children): New static variable.
(args): Remove static variable.
(more_options): Allow only options from names_argp.
(parse_default_options): Take a pointer to struct tar_args as argument.
Replace the loc member during the call to argp_parse and restore it
afterwards.
(decode_options): Use automatic variable for args.
* src/common.h (name_count): Remove extern.
(files_count): New enum.
(filename_args): New extern.
* src/names.c (name_count): Remove.
(files_count): New variable.
(name_add_name,name_add_file): Update filename_args.
* src/create.c (create_archive): Set trivial_link_count depending on
the filename_args.
The intent is to make binary-equivalent PAX archives easy to create. If
POSIXLY_CORRECT is set, the POSIX standard default is used, which embeds
the pid.
* src/common.h (posixly_correct): New global.
* src/tar.c (decode_options): Detect the POSIXLY_CORRECT environment
variable.
* src/buffer.c (add_chunk_header): Change filenames of multipart files to
omit the pid.
* src/xheader.c (HEADER_TEMPLATE): New macro.
(xheader_xhdr_name, xheader_ghdr_name): Use HEADER_TEMPLATE to select
the template for the POSIX extended header name.
* doc/tar.texi: Document the change.
Signed-off-by: Zachary Vance <za3k@za3k.com>