Commit Graph

265 Commits

Author SHA1 Message Date
Paul Eggert
75b03fdff4 Use openat2 to jailify the extraction directory
This addresses CVE-2025-45582.
* gnulib.modules: Add openat2.
* src/misc.c (open_subdir): New static function.
(fdbase_opendir): Use it.
* src/tar.c (open_searchdir_how): New var, replacing and
augmenting open_searchdir_flags.  All uses changed.
* tests/extrac31.at: New file.
* tests/Makefile (TESTSUITE_AT), tests/testsuite.at: Add it.
2025-11-15 15:10:48 -08:00
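To illustrate the idea behind the commit above, here is a minimal sketch of confining path resolution to a directory tree with Linux's openat2 and RESOLVE_BENEATH. It is not tar's code: open_beneath is a hypothetical helper, and it calls the raw Linux system call directly, whereas the commit uses Gnulib's openat2 module. It assumes Linux 5.6 or later with matching kernel headers.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/openat2.h>   /* struct open_how, RESOLVE_BENEATH */
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper (not tar's open_subdir): open NAME relative to
   DIRFD, failing if resolution would leave the tree rooted at DIRFD,
   for example via "..", an absolute path, or an escaping symlink.
   FLAGS must not include O_CREAT here, since no mode is supplied.
   Return a file descriptor, or -1 with errno set.  */
static int
open_beneath (int dirfd, char const *name, int flags)
{
  struct open_how how = {
    .flags = (unsigned int) flags,
    .resolve = RESOLVE_BENEATH
  };
  return (int) syscall (SYS_openat2, dirfd, name, &how, sizeof how);
}
```

With this helper, a call such as open_beneath (dirfd, "../outside", O_RDONLY) fails rather than opening a file outside the directory. On kernels without openat2 it also fails, with errno set to ENOSYS, so a caller would need a fallback.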
Paul Eggert
bdd773d028 Cache parent directories
Although this might help (or hurt) performance, the main
motivation is to make it easier in future commits
to prevent tarballs from escaping the extraction directory.
* src/common.h (BADFD): New constant.
(struct fdbase): New type.
* src/create.c (dump_file0): Use parent->fd instead of caching
it into a local, as the latter approach is now awkward.
* src/extract.c (extract_link): Don’t save errno unless needed.
* src/misc.c (safer_rmdir): New arg F.  All callers changed.
(maybe_backup_file): Construct full after_backup_name, now
that find_backup_file_name no longer does that for us.
(chdir_fd): Now static not extern, as other modules now use fdbase.
(fdbase_cache): New static var.
(fdbase_clear): New function.  Call it whenever removing
or renaming directories or symlinks to directories.
(fdbase_opendir): New static function.
(fdbase, fdbase1): New functions.  Call them whenever the
code formerly passed chdir_fd to a syscall.
2025-11-15 15:10:48 -08:00
Paul Eggert
a109947a78 Make xclose static
* src/buffer.c (xclose): Move from here ...
* src/system.c: ... to here, and make it static.
2025-11-15 15:10:48 -08:00
Paul Eggert
7c241126f1 Refactor to avoid duplication in "./" scanning
* src/exclist.c (excluded_name):
* src/misc.c (normalize_filename_x, must_be_dot_or_slash)
(chdir_arg):
Use dotslash or dotslashlen instead of doing things by hand.
* src/misc.c (slashlen, dotslashlen): New functions.
(safer_rmdir): Do not worry about unlinkat with AT_REMOVEDIR
succeeding on ".", as POSIX prohibits it, and it does not succeed
on any known platform. This simplifies the file name test.
Continue to worry about "/" though, as POSIX does allow
it to be removed.
2025-11-15 15:10:48 -08:00
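The commit above names the new helpers slashlen and dotslashlen but does not show their bodies. The sketch below is only a guess at their intent, which is to measure a leading run of slashes and a leading "./" prefix in one place rather than by hand at each call site. The sketch_ names are hypothetical, and the real functions may differ in signature and behavior.

```c
#include <stddef.h>

/* Guess at the intent of slashlen: number of leading '/' bytes in F.  */
static size_t
sketch_slashlen (char const *f)
{
  size_t n = 0;
  while (f[n] == '/')
    n++;
  return n;
}

/* Guess at the intent of dotslashlen: length of a leading "./" prefix
   of F, counting any extra slashes that follow, or 0 if F does not
   start with "./".  */
static size_t
sketch_dotslashlen (char const *f)
{
  if (f[0] != '.' || f[1] != '/')
    return 0;
  return 1 + sketch_slashlen (f + 1);
}
```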
Paul Eggert
56fb4a96ca chdir_id refactoring
This prepares for future changes that need directory IDs.
* src/common.h (struct chdir_id): New struct.
* src/extract.c (extract_dir): Use chdir_id to avoid duplicate stats.
* src/misc.c (struct wd): New member ID.
(grow_wd): New function, extracted from chdir_arg and that
also initializes id.err.
(chdir_arg): Use it.  Initialize id.err.
(chdir_id): New function.
2025-11-15 15:10:48 -08:00
Paul Eggert
58b471f14a Omit duplicate declaration of ‘usage’
2025-11-15 15:10:48 -08:00
Paul Eggert
ca02de4050 Avoid overrun when converting ns-resolution timestamps to text
Caught by gcc -fsanitize=address.
Inspired by Matthias Andree’s bug report in:
https://lists.gnu.org/r/bug-tar/2025-08/msg00019.html
though I found this bug via a simple "make check"
with sanitization enabled.
* src/common.h (TIMESPEC_STRSIZE_BOUND):
Make room for leading '-', needed in addition to the '-' room
supplied by SYSINT_BUFSIZE due to the way code_timespec works.
2025-08-18 17:14:49 -07:00
Paul Eggert
bdc442bd5c Use Gnulib’s same-inode module
This is more portable to non-POSIX systems.
However, don’t bother trying to port to systems
where st_ino is not a scalar of type ino_t,
as these systems no longer seem to be active targets
and it’s not worth the maintenance hassle.
* gnulib.modules: Add same-inode, now that we use it
explicitly rather than indirectly.
* src/compare.c (diff_link):
* src/create.c (compare_links, restore_parent_fd):
* src/incremen.c (compare_directory_meta, procdir):
* src/extract.c (dl_compare, repair_delayed_set_stat)
(apply_nonancestor_delayed_set_stat, extract_link)
(apply_delayed_link):
* src/names.c (add_file_id):
* src/system.c (sys_file_is_archive, sys_detect_dev_null_output):
Include same-inode.h, and prefer its macros and functions
to doing things by hand.
* src/create.c (struct link):
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/incremen.c (struct directory):
* src/names.c (struct file_id_list):
Rename members to st_dev and st_ino so that SAME_INODE and
PSAME_INODE can be used on the type.  All uses changed.
* src/system.c (sys_compare_links): Remove.
All uses replaced by psame_inode.
2025-08-14 10:27:28 -07:00
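As a rough illustration of the comparison that the commit above delegates to Gnulib, two stat results refer to the same file when both the device and the inode number match. The sketch below is a plain POSIX stand-in, not Gnulib's same-inode.h; same_file and paths_same_file are hypothetical names.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical stand-in for Gnulib's psame_inode: true if A and B
   describe the same file, judged by device and inode number.  */
static bool
same_file (struct stat const *a, struct stat const *b)
{
  return a->st_dev == b->st_dev && a->st_ino == b->st_ino;
}

/* Return 1 if paths P and Q name the same file, 0 if they name
   different files, and -1 if either cannot be examined.  */
static int
paths_same_file (char const *p, char const *q)
{
  struct stat sp, sq;
  if (stat (p, &sp) != 0 || stat (q, &sq) != 0)
    return -1;
  return same_file (&sp, &sq);
}
```

Naming the members st_dev and st_ino in tar's own structs, as the commit does, lets one comparison helper work on those structs as well as on struct stat.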
Paul Eggert
75735940f1 Port more code to UBSan, and fix alignment bug
Problem with extract_file reported by Kirill Furman in:
https://lists.gnu.org/r/bug-tar/2025-07/msg00003.html
Since the UBSan thing seems to be a recurring issue,
I fixed other instances of the problem that I found.
Also, I noticed that the same line of code had another failure to
conform to C23’s rules for pointers (an alignment issue not caught
by UBSan), so I fixed that too.  None of these issues matter on
practical production hosts.
* src/common.h (charptr): New function.
* src/buffer.c (available_space_after, short_read, flush_archive)
(backspace_output, try_new_volume, simple_flush_read)
(_gnu_flush_read, _gnu_flush_write):
* src/compare.c (read_and_process):
* src/create.c (write_eot, write_gnu_long_link)
(dump_regular_file, dump_dir0):
* src/extract.c (extract_file):
* src/incremen.c (get_gnu_dumpdir):
* src/list.c (read_header):
* src/sparse.c (sparse_dump_region, sparse_extract_region):
* src/system.c (sys_write_archive_buffer)
(sys_child_open_for_compress, sys_child_open_for_uncompress):
* src/update.c (append_file, update_archive):
Use it.
* src/buffer.c (set_next_block_after): Arg is now void *,
not union block *, since it need not be a valid union block * pointer
and this can matter on unusual or debugging implementations.
Turn a loop into an if so that the code is O(1) not O(N).
2025-07-26 02:20:53 -07:00
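The commit above introduces charptr without showing it. The following sketch is an assumption about its purpose: to do byte arithmetic on a char pointer instead of going through a member access such as p->buffer, which sanitizers can flag when p does not point at a valid, aligned union block. The union here is a simplified stand-in for tar's, and bytes_between is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified stand-in for tar's union block; the real union also has
   header views.  */
union block
{
  char buffer[512];
};

/* Guess at the intent of charptr: view a block pointer as a byte
   pointer without dereferencing it.  */
static inline char *
charptr (union block *p)
{
  return (char *) p;
}

/* Hypothetical helper: bytes from START up to END within one array,
   computed on char pointers so that no block member is accessed.  */
static ptrdiff_t
bytes_between (union block *start, union block *end)
{
  return charptr (end) - charptr (start);
}
```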
Sergey Poznyakoff
9324b472b0 Minor changes
2025-05-13 17:59:15 +03:00
Sergey Poznyakoff
6131dd2805 Skip file or archive member if its transformed name is empty.
* NEWS: Document changes.
* doc/tar.texi: Document changes.
* src/common.h (transform_stat_info): Change return value.
(transform_name_fp): Change signature.
(WARN_EMPTY_TRANSFORM): New constant.
* src/create.c: Check return from transform_name.  Skip file, if it
is false.
* src/list.c (transform_stat_info): Return bool.
(read_and): Skip member if transform_stat_info returns false.
* src/transform.c (_transform_name_to_obstack): Change return type.
Always allocate result in obstack.
(transform_name_fp): Change arguments.  Return true on
success (transformed string not empty).  Otherwise return false and
don't change the source string.
* src/warning.c: New warning class: empty-transform.
* tests/extrac17.at: Use --warning=empty-transform.
2025-05-06 15:32:17 +03:00
Paul Eggert
0aa991f386 Update copyright years
UPDATE_COPYRIGHT_USE_INTERVALS=1 \
$HOME/src/gnu/gnulib/build-aux/update-copyright \
  $(git ls-files | sed -e '/^gnulib$/d
			   /^paxutils$/d
			   /^COPYING$/d
			   /\/fdl.texi$/d')
sed -i '2000,${
    /^Copyright @copyright/d
    s/^[0-9]*--\(2025 Free Software Foundation, Inc.\)/Copyright (C) \1/
  }' doc/tar.texi
2025-01-01 18:33:10 -08:00
Paul Eggert
005f2916b6 Improve common.h comment
2024-11-02 13:43:05 -07:00
Paul Eggert
04c1b85872 Prefer other types to int in system.c
* src/system.c (is_regular_file, sys_exec_setmtime_script):
Prefer bool for boolean.
(sys_exec_command): Prefer char for char.
2024-11-01 23:47:23 -07:00
Paul Eggert
f96aff3ce9 Prefer other types to int in misc.c
* src/misc.c (quote_copy_string, tar_savedir):
Use bool for booleans.  All uses changed.
(quote_copy_string): Use char for chars.
(unquote_string): Return void, since nobody uses return value.
(unquote_string): Check for overflow in escapes like \777.
(wdcache): Now array of idx_t not int, since in theory it
might contain values greater than INT_MAX.  All uses changed.
2024-11-01 23:47:23 -07:00
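The overflow check mentioned above for escapes like \777 can be pictured as follows. An octal escape of three digits can encode values up to 511, which does not fit in one byte. This is a sketch only: decode_octal_escape is a hypothetical helper, and unquote_string's actual logic is not shown in the commit.

```c
#include <limits.h>

/* Hypothetical helper: decode up to three octal digits at S into *OUT.
   Return the number of digits consumed, or 0 if there are no octal
   digits or the value does not fit in an unsigned char.  */
static int
decode_octal_escape (char const *s, unsigned char *out)
{
  int value = 0;
  int n = 0;
  while (n < 3 && '0' <= s[n] && s[n] <= '7')
    {
      value = value * 8 + (s[n] - '0');
      n++;
    }
  if (n == 0 || UCHAR_MAX < value)
    return 0;
  *out = (unsigned char) value;
  return n;
}
```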
Paul Eggert
53a3691092 Prefer other types to int in map.c
* src/map.c (map_read): Prefer bool for booleans.
(owner_map_translate, group_map_translate):
Return void, not int, as nobody uses the return value.
2024-11-01 23:47:23 -07:00
Paul Eggert
91ad4ea343 Fix some uses of int in list.c
* src/list.c (decode_xform): Last arg is now int, not a void *
pointer to that int.  All uses changed.
(enforce_one_top_level): Don’t assume string length fits in int.
(transform_stat_info): Prefer char to int for typeflag.
All uses changed.
(decode_header): Prefer bool for booleans.  All uses changed.
(ugswidth): Now idx_t, not int, since in theory it could
exceed INT_MAX.  All uses changed.
(simple_print_header, print_for_mkdir): Don’t assume printf length
fits in int, and similarly for length of user or group name.
* src/transform.c (transform_name_fp): Last arg is now int, not void *.
All uses changed.
2024-11-01 23:47:23 -07:00
Paul Eggert
bde3e8d663 Prefer int to idx_t for some small sizes
* src/create.c (max_octal_val, to_octal, tar_copy_str)
(tar_name_copy_str, to_base256, to_chars_subst, to_chars)
(gid_to_chars, major_to_chars, minor_to_chars, mode_to_chars)
(off_to_chars, time_to_chars, uid_to_chars, string_to_chars)
(split_long_name, write_ustar_long_name, simple_finish_header):
* src/list.c (from_header, gid_from_header, major_from_header)
(minor_from_header, mode_from_header, off_from_header)
(time_from_header, uid_from_header):
Prefer int to idx_t where either will do because the buffer sizes
are known to be small, as this can be a performance win on 32-bit
platforms.  Also, in a few cases the values were negative, whereas
idx_t is supposed to be nonnegative.
2024-11-01 23:47:23 -07:00
Paul Eggert
a337cd35a0 Prefer other types to int in buffer.c
This increases the volume number maximum from 2**31 - 1 to 2**63 - 1.
* src/buffer.c (record_index, inhibit_map, new_volume):
Prefer bool to int for booleans.
* src/buffer.c (volno, global_volno):
* src/system.c (sys_exec_info_script):
Prefer intmax_t to int.
* src/buffer.c (increase_volume_number): Omit by-hand check for
overflow that relied on undefined behavior.
(new_volume): Check for that overflow here instead, without
relying on undefined behavior.
(print_stats): Avoid undefined behavior if printf sums overflow,
and reliably treat printf error like overflow.
* src/common.h (add_printf): New inline function.
2024-11-01 23:47:23 -07:00
Paul Eggert
5a7185ae31 Prefer other types to int in tar.c
Use types that are more specific than ‘int’, if that is easy.
* src/tar.c (after_date_option, xattrs_option, check_links_option)
(confirm, confirm_file_EOF, set_xattr_option, optloc_eq)
(get_date_or_file):
Prefer bool to int.
(tar_list_quoting_styles, tar_set_quoting_style, parse_opt):
Prefer idx_t to int.
(optloc_lookup, option_set_in_cl): Prefer enum option_class to int.
(decode_signal): Avoid some pointer reallocation.
(sort_mode_flag, hole_detection_types, set_old_files_option)
(is_subcommand_class): Prefer enum to int.
(parse_opt) [DEVICE_PREFIX]: Remove unused var.
Simplify creation of device name.
(find_argp_option_key, find_argp_option): Prefer char to int.
(enum subcommand_class): Now named.
(subcommand_class): Now char, not int.
(decode_options): Check for unlikely int overflow.
2024-11-01 23:47:23 -07:00
Paul Eggert
0aa69501d3 Remove major, minor signedness assumption
* src/common.h (uintmax): Remove; no longer used.
* src/list.c (simple_print_header): Don’t assume major and minor
agree in signedness.
2024-11-01 23:47:23 -07:00
Paul Eggert
d9da938963 Prefer intmax_t for occurrence counts
* src/common.h (struct name):
* src/tar.c (occurrence_option, parse_opt):
Use intmax_t, not uintmax_t, for occurrence counts.
2024-11-01 23:47:23 -07:00
Paul Eggert
d68c37b640 Prefer off_t to uintmax_t for continued_file_*
* src/buffer.c (continued_file_size, continued_file_offset):
Now off_t, not uintmax_t.  All uses changed.
* src/common.h (UINTMAX_FROM_HEADER):
* src/list.c (uintmax_from_header):
Remove; unused.
* src/list.c (simple_print_header):
* src/xheader.c (volume_size_decoder, volume_offset_decoder):
Treat offset as off_t, not uintmax_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
c0ef66da92 Prefer idx_t to size_t in common.h
* src/common.h (struct name): Prefer idx_t to size_t.
(volume_label_count): Remove; unused.
2024-11-01 23:47:23 -07:00
Paul Eggert
025f19e6bd Prefer intmax_t to size_t in xheader.c
* src/common.h (INTMAX_STRSIZE_BOUND): New constant.
(SYSINT_BUFSIZE): Use it.
* src/xheader.c (global_header_count, xheader_format_name):
Prefer intmax_t to size_t, as the values are not sizes.
2024-11-01 23:47:23 -07:00
Paul Eggert
303ac16ec0 Prefer idx_t to size_t in tar.c
* src/tar.c (strip_name_components, archive_names)
(allocated_archive_names, tar_list_quoting_styles)
(expand_pax_option, parse_opt):
Prefer idx_t to size_t.
(decode_options): Use a static word rather than going
to the bother of dynamically allocating an array.
(main): Do not preallocate array.  Do not call ‘free’
on a pointer that now might be to static storage.
2024-11-01 23:47:23 -07:00
Paul Eggert
6df7a72434 Prefer idx_t to size_t in system.c
* src/buffer.c (_flush_write): Return idx_t, not ssize_t,
to accommodate system.c changes.  All uses changed.
(_gnu_flush_write): Output correct errno value after write error.
Simplify multi-volume mode.
* src/system.c (sys_write_archive_buffer)
(sys_child_open_for_compress, sys_exec_setmtime_script):
Prefer idx_t to size_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
7c0feaefd0 Adjust better to Gnulib signed-int read changes
The 2024-08-09 Gnulib changes that caused some modules to prefer
signed types to size_t mean that Tar should follow suit.
* src/buffer.c (short_read):
* src/system.c (sys_child_open_for_compress)
(sys_child_open_for_uncompress):
rmtread and safe_read return ptrdiff_t not idx_t;
don’t rely on implementation defined conversion.
* src/misc.c (blocking_read): Never return a negative number.
Return idx_t, not ptrdiff_t, with the same convention for EOF
and error as the new full_read.  All callers changed.
* src/sparse.c (sparse_dump_region, check_sparse_region)
(check_data_region):
* src/update.c (append_file):
full_read no longer returns SAFE_READ_ERROR for I/O error; instead it
returns the number of bytes successfully read, and sets errno.
Adjust to this.
* src/system.c (sys_child_open_for_uncompress):
Rewrite to avoid need for goto and label.
2024-11-01 23:47:23 -07:00
Paul Eggert
61a978f6d4 Remove name_term
It’s never actually called.
* src/names.c (name_term): Remove.  All uses removed.
2024-11-01 23:47:23 -07:00
Paul Eggert
5704e5795a Fewer uses of size_t in names.c
* src/names.c (name_buffer_length, read_name_from_file)
(copy_name, all_names_found, add_hierarchy_to_namelist)
(rebase_child_list, make_file_name, stripped_prefix_len):
Prefer idx_t to size_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
b73127edc4 Fewer uses of size_t in misc.c
* src/misc.c (assign_string_n, quote_copy_string)
(normalize_filename, replace_prefix, remove_any_file)
(blocking_read, wd_alloc, wdcache_count, chdir_arg, chdir_do)
(read_diag_details, struct namebuf, namebuf_name):
Prefer idx_t to size_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
dd71d3796d Fewer uses of size_t in list.c
* src/list.c (recent_long_name_blocks, recent_long_link_blocks)
(read_header, from_header, gid_from_header, major_from_header)
(minor_from_header, mode_from_header, off_from_header)
(time_from_header, uid_from_header, uintmax_from_header)
(tartime): Prefer idx_t to size_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
f73c927a71 Fewer uses of size_t in incremen.c
* src/incremen.c (struct dumpdir, dumpdir_create0, struct dumpdir_iter)
(dumpdir_next, dumpdir_size, make_directory)
(dirlist_replace_prefix, rebase_directory, makedumpdir)
(maketagdumpdir, append_incremental_renames, read_obstack)
(read_incr_db_2, get_gnu_dumpdir, try_purge_directory)
(list_dumpdir): Prefer idx_t to size_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
849f244a0b Fewer uses of size_t in create.c
* src/create.c (struct exclusion_tag, to_octal, tar_copy_str)
(tar_name_copy_str, to_base256, to_chars_subst, to_chars)
(gid_to_chars, major_to_chars, minor_to_chars, mode_to_chars)
(off_to_chars, time_to_chars, uid_to_chars, uintmax_to_chars)
(string_to_chars, start_private_header, write_gnu_long_link)
(split_long_name, write_ustar_long_name, simple_finish_header)
(dump_dir0, ensure_slash, create_archive):
Prefer idx_t to size_t.
2024-11-01 23:47:23 -07:00
Paul Eggert
78dd7bf0bc Fewer uses of size_t in buffer.c
* src/buffer.c (flush_write_ptr, flush_bufmap, bufmap_locate)
(struct zip_magic, available_space_after, _flush_write)
(short_read, flush_archive, try_new_volume)
(gnu_add_multi_volume_header, simple_flush_read)
(simple_flush_write, _gnu_flush_read, _gnu_flush_write)
(gnu_flush_write): Prefer idx_t to size_t when either will do, as
signed types are typically safer.  For a tiny value in memory,
just use ‘char’.
2024-11-01 23:47:23 -07:00
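One reason signed types are typically safer for sizes, as the commit above says, is that a subtraction that goes below zero stays detectable instead of wrapping to a huge value. The sketch below assumes idx_t is a typedef for ptrdiff_t, as in Gnulib's idx.h; the two functions are hypothetical.

```c
#include <stddef.h>

/* Assumption: mirrors Gnulib's idx.h, where idx_t is ptrdiff_t.  */
typedef ptrdiff_t idx_t;

/* Space left after taking NEED from AVAIL; negative means a shortfall.  */
static idx_t
space_left_signed (idx_t avail, idx_t need)
{
  return avail - need;
}

/* The same computation with size_t silently wraps on a shortfall.  */
static size_t
space_left_unsigned (size_t avail, size_t need)
{
  return avail - need;
}
```

With avail 3 and need 5, the signed version returns -2, which a caller can test for, while the unsigned version returns SIZE_MAX - 1.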
Paul Eggert
4323e98683 Fewer macros in common.h
In common.h, replace macros with constants or functions when that
is easy.  This makes code a bit more reliable (functions evaluate
their args exactly once) and easier to debug (many debugging
environments cannot access macros).
* src/common.h (CHKBLANKS): Remove.  All uses removed.
(NAME_FIELD_SIZE, PREFIX_FIELD_SIZE, UNAME_FIELD_SIZE)
(GNAME_FIELD_SIZE, TAREXIT_SUCCESS, TAREXIT_DIFFERS)
(TAREXIT_FAILURE, LG_8, LG_256, DEFAULT_CHECKPOINT)
(MAX_OLD_FILES, TF_READ, TF_WRITE, TF_DELETED, XFORM_REGFILE)
(XFORM_LINK, XFORM_SYMLINK, XFORM_ALL, WARN_ALONE_ZERO_BLOCK)
(WARN_BAD_DUMPDIR, WARN_CACHEDIR, WARN_CONTIGUOUS_CAST)
(WARN_FILE_CHANGED, WARN_FILE_IGNORED, WARN_FILE_REMOVED)
(WARN_FILE_SHRANK, WARN_FILE_UNCHANGED, WARN_FILENAME_WITH_NULS)
(WARN_IGNORE_ARCHIVE, WARN_IGNORE_NEWER, WARN_NEW_DIRECTORY)
(WARN_RENAME_DIRECTORY, WARN_SYMLINK_CAST, WARN_TIMESTAMP)
(WARN_UNKNOWN_CAST, WARN_UNKNOWN_KEYWORD, WARN_XDEV)
(WARN_DECOMPRESS_PROGRAM, WARN_EXISTING_FILE, WARN_XATTR_WRITE)
(WARN_RECORD_SIZE, WARN_FAILED_READ, WARN_MISSING_ZERO_BLOCKS)
(WARN_VERBOSE_WARNINGS, WARN_ALL, EXCL_DEFAULT, EXCL_RECURSIVE)
(EXCL_NON_RECURSIVE): Now enum constants rather than macros.
(time_option_initialized, isfound, wasfound, warning_enabled):
Now functions rather than macros TIME_OPTION_INITIALIZED, ISFOUND,
WASFOUND, WARNING_ENABLED.  All uses changed.
(OLDER_STAT_TIME, OLDER_TAR_STAT_TIME, EXTRACT_OVER_PIPE)
(TAR_ARGS_INITIALIZER): Remove.  All uses replaced with their
definiens or equivalent.
2024-08-19 09:57:13 -07:00
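The remark above that functions evaluate their arguments exactly once can be shown with a generic example. MAX_MACRO and max_func are hypothetical and not taken from common.h.

```c
/* A function-like macro may evaluate an argument more than once.  */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* The equivalent function evaluates each argument exactly once.  */
static inline int
max_func (int a, int b)
{
  return a > b ? a : b;
}
```

Starting from i equal to 5, MAX_MACRO (i++, 0) increments i twice and yields 6, whereas max_func (i++, 0) increments i once and yields 5.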
Paul Eggert
f1e4947992 Fix string size bound calculation
* src/common.h (UINTMAX_STRSIZE_BOUND):
Fix typo that luckily didn’t break anything.
2024-08-19 09:57:13 -07:00
Paul Eggert
0dfcfa4aa4 maint: switch from ERROR to paxerror etc
Prefer functions like ‘paxerror’ to macros like ‘ERROR’.
The functions have cleaner semantics, and calls are
easier to read.
2024-08-19 09:57:13 -07:00
Paul Eggert
541f3bc374 Fix duplicate write_error_details decl
* src/common.h (write_error_details): Remove decl
that belongs to paxutils.
2024-08-14 23:25:46 -07:00
Paul Eggert
ef290cb171 Use idx_t, not size_t, for xattr value lengths.
* src/tar.h (struct xattr_map):
* src/xattrs.c (xattr_map_add): Prefer idx_t to size_t.  All uses
changed.
2024-08-14 23:25:46 -07:00
Paul Eggert
09aec02e32 Use intmax_t, not size_t, for input line numbers
This works better on platforms where SIZE_MAX < OFF_MAX.
* src/common.h (struct common locus):
* src/names.c (struct name_elt):
Use intmax_t for line numbers.  All uses changed.
2024-08-14 23:25:46 -07:00
Paul Eggert
b3992e4ef8 Prefer signed types in blocking_read etc
* src/compare.c (process_noop, process_rawdata):
Return bool, not int.
* src/compare.c (process_noop, process_rawdata):
* src/create.c (dump_regular_file):
* src/extract.c (extract_file):
* src/misc.c (blocking_read, blocking_write):
* src/sparse.c (sparse_scan_file_raw, sparse_extract_region):
Prefer signed types like idx_t to unsigned ones like size_t.
(sparse_scan_file_raw): Diagnose read errors.
2024-08-14 23:25:46 -07:00
Paul Eggert
d1e72a536f Prefer stoint to strtoul and variants
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and they require strtoul to support "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num):
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long.  All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int.  All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow.  All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1.  All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
2024-08-14 23:25:46 -07:00
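To make the strtoul quirk above concrete, here is a sketch of a strict parser that rejects a sign, an empty string, trailing junk, and overflow. It is not tar's stoint, whose interface the commit only partly describes; parse_unsigned_strict is a hypothetical name.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical strict parser (not tar's stoint): parse the decimal
   string S into *RESULT.  Return false, leaving *RESULT unchanged, if
   S is empty, has a sign or other non-digit, or overflows.  */
static bool
parse_unsigned_strict (char const *s, unsigned long *result)
{
  unsigned long v = 0;
  if (! ('0' <= *s && *s <= '9'))
    return false;
  for (; '0' <= *s && *s <= '9'; s++)
    {
      unsigned long digit = *s - '0';
      if ((ULONG_MAX - digit) / 10 < v)
        return false;            /* v * 10 + digit would overflow */
      v = v * 10 + digit;
    }
  if (*s)
    return false;                /* trailing junk */
  *result = v;
  return true;
}
```

By contrast, strtoul ("-1", NULL, 10) succeeds and returns ULONG_MAX, because the standard has it negate the value in the unsigned type.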
Paul Eggert
3ffe2eb073 Handle enormous record sizes better
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
2024-08-14 23:25:46 -07:00
Paul Eggert
4642cd04ed Avoid strtoul
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul.  Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long.  Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number.  All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
2024-08-14 23:25:45 -07:00
Paul Eggert
cc691f8272 Support >INT_MAX -C dirs
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
2024-08-04 01:41:43 -07:00
Paul Eggert
414f635d8b Use ckd_mul, ckd_add in from_header
* src/common.h (LG_64): Remove; no longer used.
* src/list.c (from_header):
Use ckd_mul, ckd_add rather than doing it by hand.
2024-08-04 01:41:43 -07:00
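The checked arithmetic mentioned above can be sketched for an octal header field. C23's ckd_mul and ckd_add live in <stdckdint.h>; this sketch instead uses the GCC and Clang builtins __builtin_mul_overflow and __builtin_add_overflow, which behave similarly, so that it builds with older toolchains. parse_octal_field is a hypothetical helper and is far simpler than tar's from_header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: parse leading octal digits from the LEN-byte
   FIELD into *RESULT.  Return false if the value overflows intmax_t.  */
static bool
parse_octal_field (char const *field, int len, intmax_t *result)
{
  intmax_t v = 0;
  for (int i = 0; i < len && '0' <= field[i] && field[i] <= '7'; i++)
    {
      /* Each builtin returns true on overflow, with no undefined
         behavior.  */
      if (__builtin_mul_overflow (v, 8, &v)
          || __builtin_add_overflow (v, field[i] - '0', &v))
        return false;
    }
  *result = v;
  return true;
}
```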
Paul Eggert
ba332e36d0 Use xalignalloc
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
2024-08-04 01:41:43 -07:00
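For comparison with the handwritten ptr_align approach that the commit above removes, here is a minimal sketch of getting an aligned buffer from the C library. It uses C11's aligned_alloc rather than Gnulib's xalignalloc, whose exact interface is not shown in the commit; aligned_buffer is a hypothetical helper that returns NULL on failure.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate at least SIZE bytes aligned to
   ALIGNMENT, which must be a power of two supported by the
   implementation.  Return NULL on failure; free the result with
   free.  */
static void *
aligned_buffer (size_t alignment, size_t size)
{
  if (alignment == 0 || SIZE_MAX - (alignment - 1) < size)
    return NULL;
  /* C11 requires the size to be a multiple of the alignment, so
     round SIZE up.  */
  size_t rounded = (size + alignment - 1) / alignment * alignment;
  return aligned_alloc (alignment, rounded);
}
```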
Paul Eggert
61656ef35b Make stripped_prefix_len signed
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t.  All callers changed.
2024-08-04 01:41:43 -07:00
Paul Eggert
c26111742a Prefer C99 formats like %jd to doing it by hand
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h.  All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function.  Prefer it when
printing time_t values.
2024-08-04 01:41:43 -07:00
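As a small illustration of the C99 formats preferred above, a value of a system integer type such as off_t can be printed portably by casting it to intmax_t and using %jd. format_offset is a hypothetical helper, not a function from tar.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write VALUE in decimal into BUF of size
   BUFSIZE.  Return what snprintf returns: the length the full text
   needs, or a negative value on error.  */
static int
format_offset (char *buf, size_t bufsize, off_t value)
{
  /* off_t has no printf conversion of its own, so widen to intmax_t.  */
  return snprintf (buf, bufsize, "%jd", (intmax_t) value);
}
```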