Commit Graph

3220 Commits

Author SHA1 Message Date
Paul Eggert
541f3bc374 Fix duplicate write_error_details decl
* src/common.h (write_error_details): Remove decl
that belongs to paxutils.
2024-08-14 23:25:46 -07:00
Paul Eggert
6bc4c4bf96 Fix minor diagnostic discrepancies in incrementals
* src/incremen.c (read_directory_file, get_gnu_dumpdir):
Be more consistent about fatal errors.
2024-08-14 23:25:46 -07:00
Paul Eggert
ab7a14bd92 Add verror module
* gnulib.modules: Add verror; paxlib will need this.
2024-08-14 23:25:46 -07:00
Paul Eggert
b596676c78 Use idx_t for write_fatal_details size
* src/buffer.c (write_fatal_details): Prefer idx_t to size_t
for object size.
2024-08-14 23:25:46 -07:00
Paul Eggert
15c6010c32 Use intmax_t for read_incr_db_01 line numbers
* src/incremen.c (read_incr_db_01): Don’t assume line numbers
are less than LONG_MAX.
2024-08-14 23:25:46 -07:00
Paul Eggert
43231ae554 Avoid need for base64_init and extra table
Simplify the code by assuming C99 initializers.
* src/list.c (base_64_digits): Remove.
(base64_map): Now a constant.  Now has its (old value + 1) % 65,
as that’s the only easy portable way to do it with a static
initializer (even on platforms where CHAR_BIT != 8); all uses changed.
(base64_init): Remove; only use removed.
(from_header): Adjust to new values in base64_map.

* src/list.c (base_64_digits): Remove; no longer needed.
(base64_map): Now const, initialized statically, and with
invalid entries being 0 not 64, and with valid entries
being 1 greater than before.
2024-08-14 23:25:46 -07:00
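The static-table idea in the commit above can be illustrated with a minimal sketch. It assumes the standard base-64 alphabet; the names demo_base64_map and demo_decode_base64 are hypothetical and are not tar's code. Each valid digit stores its value plus 1, so every entry omitted from the C99 designated initializer defaults to 0 and means "invalid".

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for a statically initialized base-64 table.
   Valid digits map to (digit value + 1); all other entries are
   implicitly 0, which marks them invalid.  */
static char const demo_base64_map[UCHAR_MAX + 1] = {
  ['A'] =  1, ['B'] =  2, ['C'] =  3, ['D'] =  4, ['E'] =  5, ['F'] =  6,
  ['G'] =  7, ['H'] =  8, ['I'] =  9, ['J'] = 10, ['K'] = 11, ['L'] = 12,
  ['M'] = 13, ['N'] = 14, ['O'] = 15, ['P'] = 16, ['Q'] = 17, ['R'] = 18,
  ['S'] = 19, ['T'] = 20, ['U'] = 21, ['V'] = 22, ['W'] = 23, ['X'] = 24,
  ['Y'] = 25, ['Z'] = 26,
  ['a'] = 27, ['b'] = 28, ['c'] = 29, ['d'] = 30, ['e'] = 31, ['f'] = 32,
  ['g'] = 33, ['h'] = 34, ['i'] = 35, ['j'] = 36, ['k'] = 37, ['l'] = 38,
  ['m'] = 39, ['n'] = 40, ['o'] = 41, ['p'] = 42, ['q'] = 43, ['r'] = 44,
  ['s'] = 45, ['t'] = 46, ['u'] = 47, ['v'] = 48, ['w'] = 49, ['x'] = 50,
  ['y'] = 51, ['z'] = 52,
  ['0'] = 53, ['1'] = 54, ['2'] = 55, ['3'] = 56, ['4'] = 57, ['5'] = 58,
  ['6'] = 59, ['7'] = 60, ['8'] = 61, ['9'] = 62,
  ['+'] = 63, ['/'] = 64
};

/* Decode the leading base-64 digits of S, stopping at the first byte
   that is not a digit.  Overflow checking is omitted for brevity.  */
static uintmax_t
demo_decode_base64 (char const *s)
{
  assert (s);
  uintmax_t value = 0;
  for (; *s; s++)
    {
      /* Subtracting 1 recovers the digit value; invalid bytes yield -1.  */
      int dig = demo_base64_map[(unsigned char) *s] - 1;
      if (dig < 0)
        break;
      value = (value << 6) | dig;
    }
  return value;
}
```

No run-time initialization function is needed with this layout, which appears to be the simplification the commit message describes.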
Paul Eggert
b201a37421 Remove cast from from_header
* src/list.c (from_header): Reword to avoid a cast
to unsigned char.
2024-08-14 23:25:46 -07:00
Paul Eggert
c9a3abcbe7 Prefer signed to unsigned when decoding options
* src/tar.c (assert_format, decode_options):
Prefer signed to unsigned integers.
(optloc_save): Prefer enum to unsigned integer.
Simplify allocation.
(decode_options): No need to call ngettext for a value known
to be plenty large.
2024-08-14 23:25:46 -07:00
Paul Eggert
18dadeffc0 Don’t assume pid fits in unsigned long
* src/system.c (sys_wait_command): Convert pid_t to intmax_t,
not to unsigned long.
2024-08-14 23:25:46 -07:00
Paul Eggert
1521d3dae0 Avoid casts in tar_checksum
* src/list.c (tar_checksum, from_header):
Recode to avoid casts.
2024-08-14 23:25:46 -07:00
Paul Eggert
5ab90d6c96 Support >UINT_MAX lines in map files
* src/map.c (parse_id, map_read): Prefer intmax_t to unsigned
for line numbers.
2024-08-14 23:25:46 -07:00
Paul Eggert
e137c14285 Prefer signed integer in struct directory
* src/incremen.c (struct directory):
Prefer int to unsigned where either will do.
2024-08-14 23:25:46 -07:00
Paul Eggert
95ebde4303 Simplify make_directory via xizalloc
* src/incremen.c (make_directory): Simplify by using
xizalloc instead of explicit initialization.
2024-08-14 23:25:46 -07:00
Paul Eggert
ef290cb171 Use idx_t, not size_t, for xattr value lengths.
* src/tar.h (struct xattr_map):
* src/xattrs.c (xattr_map_add): Prefer idx_t to size_t.  All uses
changed.
2024-08-14 23:25:46 -07:00
Paul Eggert
09aec02e32 Use intmax_t, not size_t, for input line numbers
This works better on platforms where SIZE_MAX < OFF_MAX.
* src/common.h (struct common locus):
* src/names.c (struct name_elt):
Use intmax_t for line numbers.  All uses changed.
2024-08-14 23:25:46 -07:00
Paul Eggert
9b69d17e24 In short_read, use %td not %lu
* src/buffer.c (short_read): Don’t assume sizes fit
in unsigned long.
2024-08-14 23:25:46 -07:00
Paul Eggert
b3992e4ef8 Prefer signed types in blocking_read etc
* src/compare.c (process_noop, process_rawdata):
Return bool, not int.
* src/compare.c (process_noop, process_rawdata):
* src/create.c (dump_regular_file):
* src/extract.c (extract_file):
* src/misc.c (blocking_read, blocking_write):
* src/sparse.c (sparse_scan_file_raw, sparse_extract_region):
Prefer signed types like idx_t to unsigned ones like size_t.
(sparse_scan_file_raw): Diagnose read errors.
2024-08-14 23:25:46 -07:00
Paul Eggert
88c2aa1616 Fix minor integer overflow in xsparse.c
* scripts/xsparse.c (read_xheader):
Don’t assume size_t fits in unsigned.
Make the version numbers off_t, not just unsigned.
2024-08-14 23:25:46 -07:00
Paul Eggert
d1e72a536f Prefer stoint to strtoul and variants
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and it requires strtoul to support "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num):
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long.  All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int.  All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow.  All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1.  All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
2024-08-14 23:25:46 -07:00
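One quirk named in the commit above is that strtoul must accept a leading "-": strtoul ("-1", NULL, 10) returns ULONG_MAX without reporting an error. The sketch below shows the general shape of a stricter parser that reports overflow through a bool * argument. It is a hypothetical helper named demo_parse_unsigned, not tar's stoint, whose real interface is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not tar's stoint): parse an unsigned decimal
   number at the start of S.  Set *END to the first unparsed byte, so
   that *END == S means no digits were found; a leading sign is never
   accepted.  Set *OVERFLOW to true and return UINTMAX_MAX if the
   value does not fit in uintmax_t.  */
static uintmax_t
demo_parse_unsigned (char const *s, char const **end, bool *overflow)
{
  assert (s && end && overflow);
  uintmax_t value = 0;
  *overflow = false;
  for (; '0' <= *s && *s <= '9'; s++)
    {
      unsigned int digit = *s - '0';
      /* value * 10 + digit fits exactly when this test fails.  */
      if (value > (UINTMAX_MAX - digit) / 10)
        *overflow = true;
      else
        value = value * 10 + digit;
    }
  *end = s;
  return *overflow ? UINTMAX_MAX : value;
}
```

A caller checks *END for an invalid suffix and *OVERFLOW for range errors, so it never needs to inspect errno.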
Paul Eggert
3ffe2eb073 Handle enormous record sizes better
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
2024-08-14 23:25:46 -07:00
Paul Eggert
eb9bb9bf80 Default to GNU/Linux dev_t etc
* configure.ac (dev_t, ino_t, major_t, minor_t):
Default to GNU/Linux types.  This shouldn’t affect behavior;
it’s just a cleanup.
2024-08-14 23:25:45 -07:00
Paul Eggert
4642cd04ed Avoid strtoul
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul.  Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long.  Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number.  All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
2024-08-14 23:25:45 -07:00
Paul Eggert
a80f364662 Avoid snprintf
* gnulib.modules: Remove snprintf.
* lib/wordsplit.c (wordsplit_pathexpand):
Do not arbitrarily truncate diagnostic.
(wordsplit_c_quote_copy): Rewrite to avoid the need to
invoke snprintf on a temporary buffer.
2024-08-04 01:41:43 -07:00
Paul Eggert
5316938142 Avoid wordsplit quadratic behavior
* lib/wordsplit.c (wsplt_assign_var):
Avoid unlikely overflow when adding wsp->ws_envidx + n.
Avoid quadratic behavior when growing ws_envbuf.
2024-08-04 01:41:43 -07:00
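Quadratic behavior typically arises when a buffer grows by a fixed amount each time, so that every reallocation may copy all earlier entries. A common remedy is geometric growth, sketched below. This is a generic illustration using realloc and size_t; the log does not show how the commit itself grows ws_envbuf, and struct demo_vec and demo_vec_push are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of strings.  */
struct demo_vec
{
  char **items;
  size_t used;
  size_t alloc;
};

/* Append ITEM to V, doubling the allocation when it is full so that N
   appends copy O(N) pointers in total.  Return false on overflow or
   allocation failure, leaving V unchanged.  */
static bool
demo_vec_push (struct demo_vec *v, char *item)
{
  assert (v && v->used <= v->alloc);
  if (v->used == v->alloc)
    {
      size_t max_items = SIZE_MAX / sizeof *v->items;
      if (max_items / 2 < v->alloc)
        return false;
      size_t new_alloc = v->alloc ? 2 * v->alloc : 16;
      char **p = realloc (v->items, new_alloc * sizeof *p);
      if (!p)
        return false;
      v->items = p;
      v->alloc = new_alloc;
    }
  v->items[v->used++] = item;
  return true;
}
```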
Paul Eggert
83926613a4 Prefer ialloc for wordsplit
* lib/wordsplit.c (alloc_space, wsplt_assign_var, expvar)
(wordsplit_tildexpand, wordsplit_pathexpand)
(wordsplit_get_words): Use ialloc API on idx_t args.
2024-08-04 01:41:43 -07:00
Paul Eggert
9a2344b183 Omit wordsplit API that tar doesn’t need
* lib/wordsplit.c: Include <attribute.h> here, not in wordsplit.h.
(WRDSO_ESC_SET, WRDSO_ESC_TEST): Move here from wordsplit.h.
(WORDSPLIT_EXTRAS_extern): New macro.  Used by functions
that tar doesn’t need to be exposed.
(wordsplit_append, wordsplit_c_quoted_length, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy, wordsplit_get_words, wordsplit_perror):
Omit unless _WORDSPLIT_EXTRAS.
(WORDSPLIT_ENV_INIT): Move here from wordsplit.h, and
make it a constant rather than a macro.
(wordsplit_strerror): Arg is now pointer to const.
* lib/wordsplit.h: Do not include attribute.h, so that library
users need not worry about attribute.h.
(wordsplit_t): Declare only if _WORDSPLIT_EXTRAS.  Similarly for
functions that are not exported to tar.
2024-08-04 01:41:43 -07:00
Paul Eggert
5182462cf1 wordsplit_get_words need not fail
* lib/wordsplit.c (wordsplit_get_words):
Do not fail merely because realloc fails.
Return void, since failure is no longer possible.
2024-08-04 01:41:43 -07:00
Paul Eggert
0ab451a420 More wordsplit int cleanup
* lib/wordsplit.c: Include limits.h.
(_wsplt_subsplit, wordsplit_add_segm, wsnode_quoteremoval)
(wsnode_coalesce, wsnode_tail_coalesce, find_closing_paren)
(expvar, begin_var_p, node_expand, begin_cmd_p, expcmd)
(scan_qstring, scan_word, wordsplit_c_quoted_length)
(wordsplit_string_unquote_copy, wordsplit_c_quote_copy)
(exptab_matches, wordsplit_process_list):
Prefer bool to int.
(wordsplit_init, alloc_space, coalesce_segment)
(wsnode_quoteremoval, wordsplit_finish, wordsplit_append):
Use WRDSE_OK instead of 0 when the context is that of WRDSE_*.
(wsnode_flagstr, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, node_split_prefix, wsplt_assign_var, expvar)
(expcmd, wordsplit_tildexpand, wordsplit_pathexpand)
(wsplt_unquote_char, wsplt_quote_char)
(wordsplit_string_unquote_copy):
Prefer '\0' to 0 when it is a char.
(wsnode_insert): Omit last arg, which was always 0.
All callers changed.
(wordsplit_add_segm, node_split_prefix):
Use unsigned, not int, for flag, for consistency.
(wordsplit_finish, begin_var_p, begin_cmd_p, skip_sed_expr)
(xtonum, wsplt_unquote_char, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy):
Prefer char to int for chars.
(xtonum): Don’t treat "\400" as if it were "\000".
2024-08-04 01:41:43 -07:00
Paul Eggert
dab2830e38 Diagnose argp overflow
* src/names.c (handle_option):
* src/tar.c (parse_default_options):
Report an error if wordsplitting yields more than INT_MAX words,
rather than misbehaving.  argp_parse can’t handle more than
INT_MAX, unfortunately.
2024-08-04 01:41:43 -07:00
Paul Eggert
9cef4d5495 Fix unlikely buffer overrun when checkpointing
* src/checkpoint.c (format_checkpoint_string):
Don’t overrun buffer when word splitting.
2024-08-04 01:41:43 -07:00
Paul Eggert
7cda31b1e0 Prefer idx_t to size_t in wordsplit
* gnulib.modules: Add ialloc.
* lib/wordsplit.c: Include ialloc.h.
(PRINTMAX): New constant.
(printflen, printfdots): New functions.
(wordsplit_dump_nodes, expvar, wordsplit_process_list): Use them.
(_wsplt_subsplit, wordsplit_string_unquote_copy):
Don’t limit lengths to INT_MAX.
(wordsplit_run): Remove.  All callers changed to use wordsplit_len.
(wordsplit_perror): Don’t limit lengths to ULONG_MAX.
* lib/wordsplit.c (wordsplit_init, alloc_space, struct wordsplit_node)
(wsnode_len, wordsplit_add_segm, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, wordsplit_append, node_split_prefix)
(find_closing_paren, wordsplit_find_env, wsplt_assign_var)
(expvar, node_expand, expcmd, wordsplit_trimws)
(wordsplit_tildexpand, isglob, wordsplit_pathexpand)
(skip_sed_expr, skip_delim_internal, skip_delim)
(skip_delim_real, scan_qstring, scan_word)
(wordsplit_c_quoted_length, wordsplit_process_list)
(wordsplit_len, wordsplit_free_words, wordsplit_get_words):
* lib/wordsplit.h (struct wordsplit):
Prefer idx_t to size_t for indexes.
* lib/wordsplit.h: Include idx.h.
2024-08-04 01:41:43 -07:00
Paul Eggert
cc691f8272 Support >INT_MAX -C dirs
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
2024-08-04 01:41:43 -07:00
Paul Eggert
390950282d maint: fix some encodings and email addresses
2024-08-04 01:41:43 -07:00
Paul Eggert
f13f2d6815 Parse level options more reliably
* src/tar.c (parse_opt): Don’t mishandle out-of-range LEVEL_OPTION.
2024-08-04 01:41:43 -07:00
Paul Eggert
c26c2ea2e9 Minor utf8.c improvements
* src/utf8.c: Minor rephrases for -1.
2024-08-04 01:41:43 -07:00
Paul Eggert
51c841b927 Simplify ST_DEV_MSB
* src/incremen.c (ST_DEV_MSB):
Use TYPE_WIDTH rather than computing it by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
aca308a778 Use ckd_mul, ckd_add in to_octal, to_base256
* src/create.c (to_octal, to_base256): Simplify.
2024-08-04 01:41:43 -07:00
Paul Eggert
414f635d8b Use ckd_mul, ckd_add in from_header
* src/common.h (LG_64): Remove; no longer used.
* src/list.c (from_header):
Use ckd_mul, ckd_add rather than doing it by hand.
2024-08-04 01:41:43 -07:00
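The ckd_mul and ckd_add macros named in the two commits above come from C23's <stdckdint.h>; each stores the result through its first argument and returns true if the mathematical result did not fit. The sketch below accumulates octal digits with them. It is an illustration, not the from_header code: demo_parse_octal is a hypothetical name, and the fallback to GCC/Clang builtins is an assumption for compilers that lack the header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined __has_include
# if __has_include (<stdckdint.h>)
#  include <stdckdint.h>
# endif
#endif
#ifndef ckd_add
/* Assumption: without <stdckdint.h>, use the GCC/Clang builtins,
   which take the result pointer last instead of first.  */
# define ckd_add(r, a, b) __builtin_add_overflow (a, b, r)
# define ckd_mul(r, a, b) __builtin_mul_overflow (a, b, r)
#endif

/* Hypothetical parser: convert the octal digits of S into *RESULT.
   Return false if S contains a non-octal byte or if the value
   overflows uintmax_t.  An empty string yields 0.  */
static bool
demo_parse_octal (char const *s, uintmax_t *result)
{
  assert (s && result);
  uintmax_t value = 0;
  for (; *s; s++)
    {
      if (*s < '0' || '7' < *s)
        return false;
      /* Each macro returns true if the exact result did not fit.  */
      if (ckd_mul (&value, value, 8) || ckd_add (&value, value, *s - '0'))
        return false;
    }
  *result = value;
  return true;
}
```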
Paul Eggert
281e03ec6c Prefer < 0 to == -1 where either will do
Also, fix an unlikely read overflow in sys_exec_setmtime_script.
* src/buffer.c (open_compressed_archive):
* src/compare.c (verify_volume):
* src/exclist.c (info_attach_exclist):
* src/misc.c (xfork):
* src/sparse.c (sparse_scan_file_seek):
* src/system.c (sys_wait_for_child, sys_spawn_shell)
(wait_for_grandchild, sys_wait_command, sys_exec_info_script)
(sys_exec_checkpoint_script, sys_exec_setmtime_script):
* src/transform.c (_single_transform_name_to_obstack):
* src/xattrs.c (xattrs__acls_set, xattrs_acls_get)
(xattrs_xattrs_get, xattrs__fd_set, xattrs_selinux_get)
(xattrs_selinux_set):
* tests/checkseekhole.c (check_seek_hole, main):
Simplify failure tests by just looking at return value sign.
* src/system.c (sys_exec_setmtime_script):
Don’t assume ‘read’ result fits in int.
(sys_exec_setmtime_script): Don’t reject 1 second before Epoch.
2024-08-04 01:41:43 -07:00
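Two points from the commit above, testing only the sign of a return value and not storing the result of read in an int, can be sketched as follows. This is a minimal POSIX illustration; demo_read_all is a hypothetical helper, not a function in tar.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from FD into BUF until SIZE bytes have
   been read or end of file is reached.  Return the number of bytes
   read, or -1 on error.  SIZE should not exceed SSIZE_MAX.  */
static ptrdiff_t
demo_read_all (int fd, char *buf, ptrdiff_t size)
{
  assert (buf && 0 <= size);
  ptrdiff_t total = 0;
  while (total < size)
    {
      /* Keep the result in ssize_t; it might not fit in int.  */
      ssize_t n = read (fd, buf + total, size - total);
      if (n < 0)
        {
          /* Any negative value means failure; retry if interrupted.  */
          if (errno == EINTR)
            continue;
          return -1;
        }
      if (n == 0)
        break;
      total += n;
    }
  return total;
}
```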
Paul Eggert
9cb1293628 xsparse dry runs should not create output
* scripts/xsparse.c (expand_sparse, main): Check for syscall
failure.  Do not create output file if a dry run.
2024-08-04 01:41:43 -07:00
Paul Eggert
44196e198f Better xsparse outname guessing
* scripts/xsparse.c (guess_outname): Use simpler algorithm,
that doesn’t mishandle outnames like ‘/foo’.
2024-08-04 01:41:43 -07:00
Paul Eggert
ba332e36d0 Use xalignalloc
It ports around issues that our handwritten code does not.
* gnulib.modules: Add xalignalloc.
* src/misc.c (ptr_align, page_aligned_alloc): Remove.
All page_aligned_alloc callers changed to use xalignalloc.
2024-08-04 01:41:43 -07:00
Paul Eggert
61656ef35b Make stripped_prefix_len signed
This is part of the general guideline that signed integer types
are safer.
* src/names.c (stripped_prefix_len): Return ptrdiff_t,
not size_t.  All callers changed.
2024-08-04 01:41:43 -07:00
Paul Eggert
fbc60c2334 from_header minor width cleanup
* src/list.c (from_header): Use UINTMAX_WIDTH rather than
computing it by hand.
2024-08-04 01:41:43 -07:00
Paul Eggert
a78af4b95e Don’t assume mode_t fits in unsigned long
* src/system.c (oct_to_env): Don’t assume mode_t fits in unsigned
long.  Do not output excess leading 1 bits.  When the mode is
zero, generate "0" rather than "00".  Use sprintf instead of
snprintf, since the output won’t be truncated; in general we don’t
use snprintf unless we want output to be truncated and truncation
is typically not GNU style.
2024-08-04 01:41:43 -07:00
Paul Eggert
c26111742a Prefer C99 formats like %jd to doing it by hand
It’s now safe to assume support for C99 formats like %jd, so remove
some of the longwinded formatting code put in only to be portable to
pre-C99 platforms.
* gnulib.modules: Add intprops.
* src/buffer.c (format_total_stats, try_new_volume)
(write_volume_label):
* src/checkpoint.c (format_checkpoint_string):
* src/compare.c (verify_volume):
* src/create.c (to_chars_subst, dump_regular_file):
* src/incremen.c (read_num):
* src/list.c (read_and, from_header, simple_print_header)
(print_for_mkdir):
* src/sparse.c (sparse_dump_region):
* src/system.c (dec_to_env, sys_exec_info_script)
(sys_exec_checkpoint_script):
* src/xheader.c (out_of_range_header):
Prefer C99 formats like %jd and %ju to STRINGIFY_BIGINT.
* src/common.h: Sort includes.
Include intprops.h, verify.h.  All other includes of verify.h
removed.
(intmax, uintmax): New functions and macros.
(STRINGIFY_BIGINT): Remove; no longer used.
(TIMESPEC_STRSIZE_BOUND): Make it 1 byte bigger, for negatives.
* src/create.c (MAX_VAL_WITH_DIGITS, to_base256):
Use *_WIDTH macros rather than assuming no padding bits.
Prefer UINTMAX_MAX to (uintmax_t) -1.
* src/list.c (tartime): Use strftime result rather
than running strlen later.
* src/misc.c (timetostr): New function.  Prefer it when
printing time_t values.
2024-08-04 01:41:43 -07:00
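The C99 approach in the commit above amounts to converting a system integer type to intmax_t and printing it with %jd. The sketch below does this for off_t. It assumes a POSIX system; demo_format_offset and DEMO_INTMAX_BUFSIZE are hypothetical names, and the buffer bound is a deliberately loose estimate rather than gnulib's exact macro.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Loose upper bound on the bytes needed to print any intmax_t in
   decimal: every 3 bits contribute less than one digit, plus room
   for rounding, a sign, and the terminating null byte.  */
enum { DEMO_INTMAX_BUFSIZE = sizeof (intmax_t) * CHAR_BIT / 3 + 3 };

/* Hypothetical helper: format OFFSET in decimal into BUF, which must
   have room for DEMO_INTMAX_BUFSIZE bytes.  Return the number of
   characters written, not counting the null byte.  */
static int
demo_format_offset (char *buf, off_t offset)
{
  assert (buf);
  /* Converting to intmax_t needs no cast and works for any width.  */
  intmax_t n = offset;
  return sprintf (buf, "%jd", n);
}
```

Because the buffer is sized for the widest possible value, sprintf cannot overrun it, which matches the stated preference for sprintf when output will not be truncated.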
Paul Eggert
6c91bd82e1 Fix unlikely problems with time overflow
Also, fix some rounding errors while we’re in the neighborhood.
* src/buffer.c (duration_ns, compute_duration_ns): Rename from
‘duration’ and ‘compute_duration’, and count ns rather than s, to
lessen rounding error.  All uses changed.
(compute_duration_ns): Work even if the clock moves backward
and time_t is unsigned.
(print_stats): Don’t worry about null or empty TEXT, as that
cannot happen.  Compare double to UINTMAX_MAX + 1.0, not
to UINTMAX_MAX, so that the comparison is exact.
Handle the unlikely case that numbytes >= UINTMAX_MAX.
* src/tar.c (parse_opt): Treat -L hugenumber as effectively
infinity rather than erroring out.
Prefer ckd_add to checking overflow by hand.
2024-08-04 01:41:43 -07:00
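The print_stats note about comparing against UINTMAX_MAX + 1.0 can be illustrated with a short sketch. It assumes a 64-bit uintmax_t and IEEE double, where UINTMAX_MAX itself is not representable and rounds up to 2^64 when converted, so a test such as x <= UINTMAX_MAX would wrongly accept 2^64. The names demo_fits_uintmax and demo_to_uintmax are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if X is nonnegative and its integer part fits in
   uintmax_t.  UINTMAX_MAX + 1.0 is a power of two and therefore
   exactly representable, so the comparison is exact.  */
static bool
demo_fits_uintmax (double x)
{
  return 0 <= x && x < UINTMAX_MAX + 1.0;
}

/* Convert X to uintmax_t, truncating toward zero.  The caller must
   ensure that X is in range, since an out-of-range conversion has
   undefined behavior.  */
static uintmax_t
demo_to_uintmax (double x)
{
  assert (demo_fits_uintmax (x));
  return x;
}
```

A NaN argument fails both comparisons and is reported as not fitting.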
Paul Eggert
aae99e863d maint: omit space between "*" and "p"
2024-08-04 01:41:43 -07:00
Paul Eggert
39d315e8ea ptrdiff_t, not int
* src/delete.c (delete_archive_members): Use ptrdiff_t, not int,
to count memory blocks.
(write_recent_bytes): Simplify remainder calculation.
2024-08-04 01:41:43 -07:00
Paul Eggert
bf195d4ae4 ptrdiff_t, not ssize_t
* src/buffer.c (bufmap_reset, _flush_write):
Use ptrdiff_t, not ssize_t, to record pointer differences.
POSIX allows systems where size_t is 64 bits but ssize_t is only 32;
Ultrix used to do that, though no current systems do.
2024-08-04 01:41:43 -07:00