* src/misc.c (quote_copy_string, tar_savedir):
Use bool for booleans. All uses changed.
(quote_copy_string): Use char for chars.
(unquote_string): Return void, since nobody uses return value.
(unquote_string): Check for overflow in escapes like \777.
(wdcache): Now array of idx_t not int, since in theory it
might contain values greater than INT_MAX. All uses changed.
* src/list.c (decode_xform): Last arg is now int, not a void *
pointer to that int. All uses changed.
(enforce_one_top_level): Don’t assume string length fits in int.
(transform_stat_info): Prefer char to int for typeflag.
All uses changed.
(decode_header): Prefer bool for booleans. All uses changed.
(ugswidth): Now idx_t, not int, since in theory it could
exceed INT_MAX. All uses changed.
(simple_print_header, print_for_mkdir): Don’t assume printf length
fits in int, and similarly for length of user or group name.
* src/transform.c (transform_name_fp): Last arg is now int, not void *.
All uses changed.
* src/create.c (max_octal_val, to_octal, tar_copy_str)
(tar_name_copy_str, to_base256, to_chars_subst, to_chars)
(gid_to_chars, major_to_chars, minor_to_chars, mode_to_chars)
(off_to_chars, time_to_chars, uid_to_chars, string_to_chars)
(split_long_name, write_ustar_long_name, simple_finish_header):
* src/list.c (from_header, gid_from_header, major_from_header)
(minor_from_header, mode_from_header, off_from_header)
(time_from_header, uid_from_header):
Prefer int to idx_t where either will do because the buffer sizes
are known to be small, as this can be a performance win on 32-bit
platforms. Also, in a few cases the values were negative, whereas
idx_t is supposed to be nonnegative.
This increases the volume number maximum from 2**31 - 1 to 2**63 - 1.
* src/buffer.c (record_index, inhibit_map, new_volume):
Prefer bool to int for booleans.
* src/buffer.c (volno, global_volno):
* src/system.c (sys_exec_info_script):
Prefer intmax_t to int.
* src/buffer.c (increase_volume_number): Omit by-hand check for
overflow that relied on undefined behavior.
(new_volume): Check for that overflow here instead, without
relying on undefined behavior.
(print_stats): Avoid undefined behavior if printf sums overflow,
and reliably treat printf error like overflow.
* src/common.h (add_printf): New inline function.
Use types that are more specific than ‘int’, if that is easy.
* src/tar.c (after_date_option, xattrs_option, check_links_option)
(confirm, confirm_file_EOF, set_xattr_option, optloc_eq)
(get_date_or_file):
Prefer bool to int.
(tar_list_quoting_styles, tar_set_quoting_style, parse_opt):
Prefer idx_t to int.
(optloc_lookup, option_set_in_cl): Prefer enum option_class to int.
(decode_signal): Avoid some pointer reallocation.
(sort_mode_flag, hole_detection_types, set_old_files_option)
(is_subcommand_class): Prefer enum to int.
(parse_opt) [DEVICE_PREFIX]: Remove unused var.
Simplify creation of device name.
(find_argp_option_key, find_argp_option): Prefer char to int.
(enum subcommand_class): Now named.
(subcommand_class): Now char, not int.
(decode_options): Check for unlikely int overflow.
* src/common.h (INTMAX_STRSIZE_BOUND): New constant.
(SYSINT_BUFSIZE): Use it.
* src/xheader.c (global_header_count, xheader_format_name):
Prefer intmax_t to size_t, as the values are not sizes.
* src/tar.c (strip_name_components, archive_names)
(allocated_archive_names, tar_list_quoting_styles)
(expand_pax_option, parse_opt):
Prefer idx_t to size_t.
(decode_options): Use a static word rather than going
to to the bother of dynamically allocating an array.
(main): Do not preallocate array. Do not call ‘free’
on a pointer that now might be to static storage.
The 2024-08-09 Gnulib changes that caused some modules prefer
signed types to size_t means that Tar should follow suit.
* src/buffer.c (short_read):
* src/system.c (sys_child_open_for_compress)
(sys_child_open_for_uncompress):
rmtread and safe_read return ptrdiff_t not idx_t;
don’t rely on implementation defined conversion.
* src/misc.c (blocking_read): Never return a negative number.
Return idx_t, not ptrdiff_t, with the same convention for EOF
and error as the new full_read. All callers changed.
* src/sparse.c (sparse_dump_region, check_sparse_region)
(check_data_region):
* src/update.c (append_file):
full_read no longer returns SAFE_READ_ERROR for I/O error; instead it
returns the number of bytes successfully read, and sets errno.
Adjust to this.
* src/system.c (sys_child_open_for_uncompress):
Rewrite to avoid need for goto and label.
* src/buffer.c (flush_write_ptr, flush_bufmap, bufmap_locate):
(struct zip_magic, available_space_after, _flush_write)
(short_read, flush_archive, try_new_volume)
(gnu_add_multi_volume_header, simple_flush_read)
(simple_flush_write, _gnu_flush_read, _gnu_flush_write)
(gnu_flush_write): Prefer idx_t to size_t when either will do, as
signed types are typically safer. For a tiny value in memory,
just use ‘char’.
In common.h, replace macros with constants or functions when that
is easy. This makes code a bit more reliable (functions evaluate
their args exactly once) and easier to debug (many debugging
environments cannot access macros).
* src/common.h (CHKBLANKS): Remove. All uses removed.
(NAME_FIELD_SIZE, PREFIX_FIELD_SIZE, UNAME_FIELD_SIZE)
(GNAME_FIELD_SIZE, TAREXIT_SUCCESS, TAREXIT_DIFFERS)
(TAREXIT_FAILURE, LG_8, LG_256, DEFAULT_CHECKPOINT)
(MAX_OLD_FILES, TF_READ, TF_WRITE, TF_DELETED, XFORM_REGFILE)
(XFORM_LINK, XFORM_SYMLINK, XFORM_ALL, WARN_ALONE_ZERO_BLOCK)
(WARN_BAD_DUMPDIR, WARN_CACHEDIR, WARN_CONTIGUOUS_CAST)
(WARN_FILE_CHANGED, WARN_FILE_IGNORED, WARN_FILE_REMOVED)
(WARN_FILE_SHRANK, WARN_FILE_UNCHANGED, WARN_FILENAME_WITH_NULS)
(WARN_IGNORE_ARCHIVE, WARN_IGNORE_NEWER, WARN_NEW_DIRECTORY)
(WARN_RENAME_DIRECTORY, WARN_SYMLINK_CAST, WARN_TIMESTAMP)
(WARN_UNKNOWN_CAST, WARN_UNKNOWN_KEYWORD, WARN_XDEV)
(WARN_DECOMPRESS_PROGRAM, WARN_EXISTING_FILE, WARN_XATTR_WRITE)
(WARN_RECORD_SIZE, WARN_FAILED_READ, WARN_MISSING_ZERO_BLOCKS)
(WARN_VERBOSE_WARNINGS, WARN_ALL, EXCL_DEFAULT, EXCL_RECURSIVE)
(EXCL_NON_RECURSIVE): Now enum constants rather than macros.
(time_option_initialized, isfound, wasfound, warning_enabled):
Now functions rather than macros TIME_OPTION_INITIALIZED, ISFOUND,
WASFOUND, WARNING_ENABLED. All uses changed.
(OLDER_STAT_TIME, OLDER_TAR_STAT_TIME, EXTRACT_OVER_PIPE)
(TAR_ARGS_INITIALIZER): Remove. All uses replaced with their
definiens or equivalent.
This works better on platforms where SIZE_MAX < OFF_MAX.
* src/common.h (struct common locus):
* src/names.c (struct name_elt):
Use intmax_t for line numbers. All uses changed.
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and it requires strtoul to support "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num)
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long. All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int. All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow. All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1. All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul. Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long. Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number. All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t. All callers changed.
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h. All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function. Prefer it when
printing time_t values.
Also, fix some rounding errors while we’re in the neighborhood.
* src/buffer.c (duration_ns, compute_duration_ns): Rename from
‘duration’ and ‘compute_duration’, and count ns rather than s, to
lessen rounding error. All uses changed.
(compute_duration_ns): Work even if the clock moves backward
and time_t is unsigned.
(print_stats): Don’t worry about null or empty TEXT, as that
cannot happen. Compare double to UINTMAX_MAX + 1.0, not
to UINTMAX_MAX, so that the comparison is exact.
Handle the unlikely case that numbytes >= UINTMAX_MAX.
* src/tar.c (parse_opt): Treat -L hugenumber as effectively
infinity rather than erroring out.
Prefer ckd_add to checking overflow by hand.
* gnulib.modules: Add assert-h, for static_assert.
* src/common.h, src/list.c, src/misc.c:
Prefer static_assert to #if + #error. This doesn’t fix any bugs; it’s
just that in general it’s better to avoid the preprocessor.
* src/common.h (GLOBAL): Remove this macro, and all its uses.
It collides with GCC 14 and -Wmissing-variable-declarations.
Change all uses of GLOBAL to use extern instead,
and declare the variables in their respective .c files.
Move .c file’s extern declarations here, so that they
appear only once and are checked against definitions.
* src/names.c (unconsumed_option_tail): Now static.
* src/suffix.c (find_compression_suffix): Always return stripped
archive name length in the last argument. Return 0 if there is no
suffix.
(find_compression_program): Remove.
(set_compression_program_by_suffix): Take third argument, controlling
whether to issue a warning if no suitable compression program is found
for the suffix.
* src/common.h (set_compression_program_by_suffix): Change prototype.
* src/buffer.c, src/tar.c: All uses of set_compression_program_by_suffix
changed.
Problem reported by Marc Espie in:
https://lists.gnu.org/r/bug-tar/2023-09/msg00003.html
* src/buffer.c (drop_volume_label_suffix):
Redo to not compute a pointer before the start of a buffer,
as this is not portable.
* src/common.h (name): New field: is_wildcard.
(name_scan): Change protoype.
* src/delete.c: Update calls to name_scan.
* src/names.c (addname, add_starting_file): Initialize is_wildcard.
(namelist_match): Take two arguments. If second one is true, return
only exact matches.
(name_scan): Likewise. All callers updated.
(name_from_list): Skip patterns.
* src/update.c (remove_exact_name): New function.
(update_archive): Do not remove matching name, if it is a pattern.
Instead, add a new entry with the matching file name.
* tests/update04.at: New test.
* tests/Makefile.am: Add new test.
* tests/testsuite.at: Include new test.
* NEWS: Update.