3353 Commits

Author SHA1 Message Date
Paul Eggert
88c2aa1616 Fix minor integer overflow in xsparse.c
* scripts/xsparse.c (read_xheader):
Don’t assume size_t fits in unsigned.
Make the version numbers off_t, not just unsigned.
2024-08-14 23:25:46 -07:00
Paul Eggert
d1e72a536f Prefer stoint to strtoul and variants
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and it requires strtoul to support "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num)
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long.  All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int.  All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow.  All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1.  All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
2024-08-14 23:25:46 -07:00
Paul Eggert
3ffe2eb073 Handle enormous record sizes better
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
2024-08-14 23:25:46 -07:00
Paul Eggert
eb9bb9bf80 Default to GNU/Linux dev_t etc
* configure.ac (dev_t, ino_t, major_t, minor_t):
Default to GNU/Linux types.  This shouldn’t affect behavior;
it’s just a cleanup.
2024-08-14 23:25:45 -07:00
Paul Eggert
4642cd04ed Avoid strtoul
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul.  Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long.  Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number.  All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
2024-08-14 23:25:45 -07:00
Paul Eggert
a80f364662 Avoid snprintf
* gnulib.modules: Remove snprintf.
* lib/wordsplit.c (wordsplit_pathexpand):
Do not arbitrarily truncate diagnostic.
(wordsplit_c_quote_copy): Rewrite to avoid the need to
invoke snprintf on a temporary buffer.
2024-08-04 01:41:43 -07:00
Paul Eggert
5316938142 Avoid wordsplit quadratic behavior
* lib/wordsplit.c (wsplt_assign_var):
Avoid unlikely overflow when adding wsp->ws_envidx + n.
Avoid quadratic	behavior when growing ws_envbuf.
2024-08-04 01:41:43 -07:00
Paul Eggert
83926613a4 Prefer ialloc for wordsplit
* lib/wordsplit.c (alloc_space, wsplt_assign_var, expvar)
(wordsplit_tildexpand, wordsplit_pathexpand)
(wordsplit_get_words): Use ialloc API on idx_t args.
2024-08-04 01:41:43 -07:00
Paul Eggert
9a2344b183 Omit wordsplit API that tar doesn’t need
* lib/wordsplit.c: Include <attribute.h> here, not in wordsplit.h.
(WRDSO_ESC_SET, WRDSO_ESC_TEST): Move here from wordsplit.h.
(WORDSPLIT_EXTRAS_extern): New macro.  Used by functions
that tar doesn’t need to be exposed.
(wordsplit_append, wordsplit_c_quoted_length, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy, wordsplit_get_words, wordsplit_perror):
Omit unless _WORDSPLIT_EXTRAS.
(WORDSPLIT_ENV_INIT): Move here from wordsplit.h, and
make it a constant rather than a macro.
(wordsplit_strerror): Arg is now pointer to const.
* lib/wordsplit.h: Do not include attribute.h, so that library
users need not worry about attribute.h.
(wordsplit_t): Declare only if _WORDSPLIT_EXTRAS.  Similarly for
functions that are not exported to tar.
2024-08-04 01:41:43 -07:00
Paul Eggert
5182462cf1 wordsplit_get_words need not fail
* lib/wordsplit.c (wordsplit_get_words):
Do not fail merely because realloc fails.
Return void, since failure is no longer possible.
2024-08-04 01:41:43 -07:00
Paul Eggert
0ab451a420 More wordsplit int cleanup
* lib/wordsplit.c: Include limits.h.
(_wsplt_subsplit, wordsplit_add_segm, wsnode_quoteremoval)
(wsnode_coalesce, wsnode_tail_coalesce, find_closing_paren)
(expvar, begin_var_p, node_expand, begin_cmd_p, expcmd)
(scan_qstring, scan_word, wordsplit_c_quoted_length)
(wordsplit_string_unquote_copy, wordsplit_c_quote_copy)
(exptab_matches, wordsplit_process_list):
Prefer bool to int.
(wordsplit_init, alloc_space, coalesce_segment)
(wsnode_quoteremoval, wordsplit_finish, wordsplit_append):
Use WRDSE_OK instead of 0 when the context is that of WRDSE_*.
(wsnode_flagstr, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, node_split_prefix, wsplt_assign_var, expvar)
(expcmd, wordsplit_tildexpand, wordsplit_pathexpand)
(wsplt_unquote_char, wsplt_quote_char)
(wordsplit_string_unquote_copy):
Prefer '\0' to 0 when it is a char.
(wsnode_insert): Omit last arg, which was always 0.
All callers changed.
(wordsplit_add_segm, node_split_prefix):
Use unsigned, not int, for flag, for consistency.
(wordsplit_finish, begin_var_p, begin_cmd_p, skip_sed_expr)
(xtonum, wsplt_unquote_char, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy):
Prefer char to int for chars.
(xtonum): Don’t treat "\400" as if it were "\000".
2024-08-04 01:41:43 -07:00
Paul Eggert
dab2830e38 Diagnose argp overflow
* src/names.c (handle_option):
* src/tar.c (parse_default_options):
Report an error if wordsplitting yields more than INT_MAX words,
rather than misbehaving.  argp_parse can’t handle more than
INT_MAX, unfortunately.
2024-08-04 01:41:43 -07:00
Paul Eggert
9cef4d5495 Fix unlikely buffer overrun when checkpointing
* src/checkpoint.c (format_checkpoint_string):
Don’t overrun buffer when word splitting.
2024-08-04 01:41:43 -07:00
Paul Eggert
7cda31b1e0 Prefer idx_t to size_t in wordsplit
* gnulib.modules: Add ialloc.
* lib/wordsplit.c: Include ialloc.h.
(PRINTMAX): New constant.
(printflen, printfdots): New functions.
(wordsplit_dump_nodes, expvar, wordsplit_process_list): Use them.
(_wsplt_subsplit, wordsplit_string_unquote_copy):
Don’t limit lengths to INT_MAX.
(wordsplit_run): Remove.  All callers changed to use wordsplit_len.
(wordsplit_perror): Don’t limit lengths to ULONG_MAX.
* lib/wordsplit.c (wordsplit_init, alloc_space, struct wordsplit_node)
(wsnode_len, wordsplit_add_segm, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, wordsplit_append, node_split_prefix)
(find_closing_paren, wordsplit_find_env, wsplt_assign_var)
(expvar, node_expand, expcmd, wordsplit_trimws)
(wordsplit_tildexpand, isglob, wordsplit_pathexpand)
(skip_sed_expr, skip_delim_internal, skip_delim)
(skip_delim_real, scan_qstring, scan_word)
(wordsplit_c_quoted_length, wordsplit_process_list)
(wordsplit_len, wordsplit_free_words, wordsplit_get_words):
* lib/wordsplit.h (struct wordsplit):
Prefer idx_t to size_t for indexes.
* lib/wordsplit.h: Include idx.h.
2024-08-04 01:41:43 -07:00
Paul Eggert
cc691f8272 Support >INT_MAX -C dirs
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
2024-08-04 01:41:43 -07:00
Paul Eggert
390950282d maint: fix some encodings and email addresses 2024-08-04 01:41:43 -07:00
Paul Eggert
f13f2d6815 Parse level options more reliably
* src/tar.c (parse_opt): Don’t mishandle out-of-range LEVEL_OPTION.
2024-08-04 01:41:43 -07:00
Paul Eggert
c26c2ea2e9 Minor utf8.c improvements
* src/utf8.c: Minor rephrases for -1.
2024-08-04 01:41:43 -07:00
Paul Eggert
51c841b927 Simplify ST_DEV_MSB
* src/incremen.c (ST_DEV_MSB):
Use TYPE_WIDTH rather than computing it by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
aca308a778 Use ckd_mul, ckd_add in to_octal, to_base256
* src/create.c (to_octal, to_base256): Simplify.
2024-08-04 01:41:43 -07:00
Paul Eggert
414f635d8b Use ckd_mul, ckd_add in from_header
* src/common.h (LG_64): Remove; no longer used.
* src/list.c (from_header):
Use ckd_mul, ckd_add rather than doing it by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
281e03ec6c Prefer < 0 to == -1 where either will do
Also, fix an unlikely read overflow in sys_exec_setmtime_script.
* src/buffer.c (open_compressed_archive):
* src/compare.c (verify_volume):
* src/exclist.c (info_attach_exclist):
* src/misc.c (xfork):
* src/sparse.c (sparse_scan_file_seek):
* src/system.c (sys_wait_for_child, sys_spawn_shell)
(wait_for_grandchild, sys_wait_command, sys_exec_info_script)
(sys_exec_checkpoint_script, sys_exec_setmtime_script):
* src/transform.c (_single_transform_name_to_obstack):
* src/xattrs.c (xattrs__acls_set, xattrs_acls_get)
(xattrs_xattrs_get, xattrs__fd_set, xattrs_selinux_get)
(xattrs_selinux_set):
* tests/checkseekhole.c (check_seek_hole, main):
Simplify failure tests by just looking at return value sign.
* src/system.c (sys_exec_setmtime_script):
Don’t assume ‘read’ result fits in int.
(sys_exec_setmtime_script): Don’t reject 1 second before Epoch.
2024-08-04 01:41:43 -07:00
Paul Eggert
9cb1293628 xsparse dry runs should not create output
* scripts/xsparse.c (expand_sparse, main): Check for syscall
failure.  Do not create output file if a dry run.
2024-08-04 01:41:43 -07:00
Paul Eggert
44196e198f Better xsparse outname guessing
* scripts/xsparse.c (guess_outname): Use simpler algorithm,
that doesn’t mishandle outnames like ‘/foo’.
2024-08-04 01:41:43 -07:00
Paul Eggert
ba332e36d0 Use xalignalloc
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
2024-08-04 01:41:43 -07:00
Paul Eggert
61656ef35b Make stripped_prefix_len signed
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t.  All callers changed.
2024-08-04 01:41:43 -07:00
Paul Eggert
fbc60c2334 from_header minor width cleanup
* src/list.c (from_header): Use UINTMAX_WIDTH rather than
computing it by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
a78af4b95e Don’t assume mode_t fits in unsigned long
* src/system.c (oct_to_env): Don’t assume mode_t fits in unsigned
long.  Do not output excess leading 1 bits.  When the mode is
zero, generate "0" rather than "00".  Use sprintf instead of
snprintf, since the output won’t be truncated; in general we don’t
use snprintf unless we want output to be truncated and truncation
is typically not GNU style.
2024-08-04 01:41:43 -07:00
Paul Eggert
c26111742a Prefer C99 formats like %jd to doing it by hand
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h.  All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function.  Prefer it when
printing time_t values.
2024-08-04 01:41:43 -07:00
Paul Eggert
6c91bd82e1 Fix unlikely problems with time overflow
Also, fix some rounding errors while we’re in the neighborhood.
* src/buffer.c (duration_ns, compute_duration_ns): Rename from
‘duration’ and ‘compute_duration’, and count ns rather than s, to
lessen rounding error.  All uses changed.
(compute_duration_ns): Work even if the clock moves backward
and time_t is unsigned.
(print_stats): Don’t worry about null or empty TEXT, as that
cannot happen.  Compare double to UINTMAX_MAX + 1.0, not
to UINTMAX_MAX, so that the comparison is exact.
Handle the unlikely case that numbytes >= UINTMAX_MAX.
* src/tar.c (parse_opt): Treat -L hugenumber as effectively
infinity rather than erroring out.
Prefer ckd_add to checking overflow by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
aae99e863d maint: omit space between "*" and "p" 2024-08-04 01:41:43 -07:00
Paul Eggert
39d315e8ea ptrdiff_t, not int
* src/delete.c (delete_archive_members): Use ptrdiff_t, not int,
to count memory blocks.
(write_recent_bytes): Simplify remainder calculation.
2024-08-04 01:41:43 -07:00
Paul Eggert
bf195d4ae4 ptrdiff_t, not ssize_t
* src/buffer.c (bufmap_reset, _flush_write):
Use ptrdiff_t, not ssize_t, to record pointer differences.
POSIX allows systems where size_t is 64 bits but ssize_t is only 32;
Ultrix used to do that, though no current systems do.
2024-08-04 01:41:43 -07:00
Paul Eggert
a9372cf08a Prefer stdckdint.h to intprops.h
Problem reported by Collin Funk in:
https://lists.gnu.org/r/bug-tar/2024-07/msg00000.html
though this patch is more general than Collin’s suggestion.
* src/compare.c (diff_multivol):
* src/delete.c (move_archive):
* src/sparse.c (oldgnu_add_sparse, pax_decode_header):
* src/system.c (mtioseek):
Prefer ckd_add and ckd_mul to the intprops.h equivalents,
since stdckdint.h is now standard.
2024-08-04 01:41:43 -07:00
Paul Eggert
be1aa32c6d Use ckd_add in page_aligned_alloc
* src/misc.c (page_aligned_alloc): Use ckd_add
instead of doing overflow checking by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
8a3fc52972 Simplify read_header overflow checking
* src/list.c (read_header): Use ckd_add instead of
doing overflow checking by hand.  Although the old code
was correct on all practical hosts, the new code is simpler
and works even on weird hosts where SIZE_MAX <= INT_MAX.
2024-08-04 01:41:43 -07:00
Paul Eggert
927d67855e Cleaner overflow checking in xheader_read
* src/xheader.c (xheader_read): Prefer ckd_add to
doing overflow checking by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
c6a5af16ba maint: use static_assert
* gnulib.modules: Add assert-h, for static_assert.
* src/common.h, src/list.c, src/misc.c:
Prefer static_assert to #if + #error.  This doesn’t fix any bugs; it’s
just that in general it’s better to avoid the preprocessor.
2024-08-04 01:41:43 -07:00
Paul Eggert
dcc90722ac Fix tests/ckmtime.c arithmetic
* tests/ckmtime.c (main): Don’t assume time_t is signed.
Avoid integer overflows (quite possible if time_t is 32 bit).
Do calculations precisely, without any rounding errors.
2024-08-04 01:41:43 -07:00
Paul Eggert
7557fdd4df Fix unlikely overflow in utf8_convert
* src/utf8.c (utf8_convert): Check for integer overflow.
2024-08-04 01:41:43 -07:00
Paul Eggert
91ee466c8a Fix unlikely overflow in transform.c
* src/transform.c (_single_transform_name_to_obstack):
Use xinmalloc to check for integer overflow.
2024-08-04 01:41:43 -07:00
Paul Eggert
7079fc369b Better overflow checking for blocking factor
* src/tar.c (parse_opt): Use ckd_add and ckd_mul instead of
less-obvious code that relies on implementation-defined
conversions.
2024-08-04 01:41:43 -07:00
Paul Eggert
b26e798a0f xsparse cleanup, including integer overflow
* scripts/xsparse.c: Include inttypes.h, for strtoimax.
Don’t include stdint.h, since inttypes.h includes it.
Sort include directives.
Make all extern functions and vars static, except for ‘main’.
(string_to_off): Use strtoimax instead of doing overflow
checking by hand, incorrectly (it relied on undefined behavior).
(string_to_size): New arg MAXSIZE.  All callers changed.
(get_var): Return bool not int.  Fix unlikely integer overflow.
Use strncmp instead of memcmp, to avoid unlikely pointer overflow.
(read_xheader, read_map, main): Avoid unlikely integer overflow.
Check for I/O errors more consistently.
(main): Prefer bool to int, and put vars near use.
2024-08-04 01:41:43 -07:00
Paul Eggert
f22b9fe3ce maint: fix some unlikely wordsplit overflows
* gnulib.modules: Add reallocarray.
* lib/wordsplit.c: Include stdckdint.h.
(ISDELIM, expvar, isglob, scan_word):
Defend against strchr (s, 0) always succeeding.
(alloc_space, wsplit_assign_vars):
Fix some unlikely integer overflows, partly by using reallocarray.
(alloc_space): Avoid quadratic worst-case behavior.
(isglob): Return bool, not int.  Accept size_t, not int.
(to_num): Remove; no longer used.
(xtonum): Clarify the code the bit.  Rely on portable
conversion to unsigned char rather than problematic pointer cast.
2024-08-04 01:41:42 -07:00
Paul Eggert
8f094605a8 maint: prefer C23 if available
* gnulib.modules: Add std-gnu23.
2024-07-29 00:41:34 -07:00
Paul Eggert
26d1e4ddbc Add some gnulib.modules
* gnulib.modules: Add errno, limits-h, safe-read, sys_stat.
Not sure about the relationship between gnulib.modules
and paxutils/gnulib.modules, but anyway tar itself uses
these so we should depend on them.  (Perhaps it would be
better if there was just one Gnulib module list for tar;
that would be less confusing.)
2024-07-29 00:41:34 -07:00
Paul Eggert
ec35690e91 build: update gnulib and paxutils submodules to latest 2024-07-29 00:41:34 -07:00
Paul Eggert
3fa1fd0751 Pacify gcc 14 -Wanalyzer-null-argument
* src/tar.c (optloc_eq): Add another ‘assume’.
2024-07-27 00:26:42 -07:00
Paul Eggert
fd33f25989 Pacify gcc 14 -Wanalyzer-infinite-loop
* gnulib.modules: Add stddef, for ‘unreachable’.
* src/compare.c (dumpdir_cmp): Tell GCC that the default case
is unreachable.  Make just one pass through the string,
instead of two passes (one via strcmp, another via strlen).
2024-07-27 00:26:42 -07:00
Paul Eggert
45a86d45b2 maint: make a few funcs and vars static
* src/buffer.c (last_stat_time, write_fatal_details):
* src/tar.c (name_more_files):
* src/xattrs.c (xheader_xattr_add):
Now static.
2024-07-26 23:44:03 -07:00