Commit Graph

3238 Commits

Author SHA1 Message Date
Paul Eggert
82ef07c9bd Fewer macros in transform.c
* src/transform.c (CASE_CTL_RESET): Remove, replacing with one
instance of code (with a goto, alas).  Still a bit clearer, I think.
2024-08-19 09:57:13 -07:00
Paul Eggert
350cc4077e Fewer macros in tar.c
* src/tar.c (FORMAT_MASK, TAR_SIZE_SUFFIXES, SUBCL_READ)
(SUBCL_WRITE, SUBCL_UPDATE, SUBCL_TEST, SUBCL_OCCUR)
(IS_SUBCOMMAND_CLASS, NS_PRECISION_FORMAT_MASK):
Now constants or (lowercased) functions, not macros.
(subcommand_class):
Replace hopeful comments with code implementing them.
2024-08-19 09:57:13 -07:00
Paul Eggert
7f557428a4 Fewer macros in tar.h
* src/tar.h (REGTYPE, AREGTYPE, SYMTYPE, BLKTYPE, FIFOTYPE)
(XHDTYPE, XGLTYPE, TSUID, TSGID, TSVTX, TUREAD, TUWRITE, TUEXEC)
(TGREAD, TGWRITE, TGEXEC, TOREAD, TOWRITE, TOEXEC)
(SPARSES_IN_EXTRA_HEADER, SPARSES_IN_OLDGNU_HEADER)
(SPARSES_IN_SPARSE_HEADER, GNUTYPE_DUMPDIR, GNUTYPE_LONGLINK)
(GNUTYPE_LONGNAME, GNUTYPE_MULTIVOL, GNUTYPE_SPARSE)
(GNUTYPE_VOLHDR, SOLARIS_XHDTYPE, SPARSES_IN_STAR_HEADER)
(SPARSES_IN_STAR_EXT_HEADER, BLOCKSIZE):
Now constants, not macros.
2024-08-19 09:57:13 -07:00
Paul Eggert
dd0f95965d Fewer macros in system.c
* src/system.c (PREAD, PWRITE): Now constants, not macros.
2024-08-19 09:57:13 -07:00
Paul Eggert
cdcd1580c8 Fewer macros in names.c
* src/names.c (EXCLUDE_OPTIONS, INCLUDE_OPTIONS):
Now (lowercased) functions, not macros.
(SUCCESSOR): Remove, replacing uses with definiens.
2024-08-19 09:57:13 -07:00
Paul Eggert
dfb1da7253 Fewer macros in incremen.c
* src/incremen.c (DIRF_INIT, DIRF_NFS, DIRF_FOUND, DIRF_NEW)
(DIRF_RENAMED, DIR_IS_INITED, DIR_IS_NFS, DIR_IS_FOUND)
(DIR_IS_RENAMED, DIR_SET_FLAG, DIR_CLEAR_FLAG, NFS_FILE_STAT)
(PD_FORCE_CHILDREN, PD_FORCE_INIT, PD_CHILDREN)
(TAR_INCREMENTAL_VERSION, TEMP_DIR_TEMPLATE):
Now constants or (lowercased) functions, not macros.
(ST_DEV_MSB) [!HAVE_ST_FSTYPE_STRING]: Remove.
Replace only use with something simpler.
2024-08-19 09:57:13 -07:00
Paul Eggert
79cb9aaab6 Fewer macros in extract.c
* src/extract.c (ALL_MODE_BITS, RECOVER_NO, RECOVER_OK)
(RECOVER_SKIP): Now constants or inline functions, not macros.
(maybe_recoverable): Return enum recover, not int.
2024-08-19 09:57:13 -07:00
Paul Eggert
da109fae7a Fewer macros in create.c
* src/create.c (CACHEDIR_SIGNATURE, CACHEDIR_SIGNATURE_SIZE)
(MAX_VAL_WITH_DIGITS, MAX_OCTAL_VAL): Now constants or
inline functions or removed, instead of macros.
(max_octal_val): Accept size rather than type.
2024-08-19 09:57:13 -07:00
Paul Eggert
cc1352699a Fewer macros in buffer.c
* src/buffer.c (READ_ERROR_MAX, NMAGIC, VOL_SUFFIX):
Now constants rather than macros.  Rename NMAGIC to n_zip_magic.
2024-08-19 09:57:13 -07:00
Paul Eggert
4323e98683 Fewer macros in common.h
In common.h, replace macros with constants or functions when that
is easy.  This makes code a bit more reliable (functions evaluate
their args exactly once) and easier to debug (many debugging
environments cannot access macros).
* src/common.h (CHKBLANKS): Remove.  All uses removed.
(NAME_FIELD_SIZE, PREFIX_FIELD_SIZE, UNAME_FIELD_SIZE)
(GNAME_FIELD_SIZE, TAREXIT_SUCCESS, TAREXIT_DIFFERS)
(TAREXIT_FAILURE, LG_8, LG_256, DEFAULT_CHECKPOINT)
(MAX_OLD_FILES, TF_READ, TF_WRITE, TF_DELETED, XFORM_REGFILE)
(XFORM_LINK, XFORM_SYMLINK, XFORM_ALL, WARN_ALONE_ZERO_BLOCK)
(WARN_BAD_DUMPDIR, WARN_CACHEDIR, WARN_CONTIGUOUS_CAST)
(WARN_FILE_CHANGED, WARN_FILE_IGNORED, WARN_FILE_REMOVED)
(WARN_FILE_SHRANK, WARN_FILE_UNCHANGED, WARN_FILENAME_WITH_NULS)
(WARN_IGNORE_ARCHIVE, WARN_IGNORE_NEWER, WARN_NEW_DIRECTORY)
(WARN_RENAME_DIRECTORY, WARN_SYMLINK_CAST, WARN_TIMESTAMP)
(WARN_UNKNOWN_CAST, WARN_UNKNOWN_KEYWORD, WARN_XDEV)
(WARN_DECOMPRESS_PROGRAM, WARN_EXISTING_FILE, WARN_XATTR_WRITE)
(WARN_RECORD_SIZE, WARN_FAILED_READ, WARN_MISSING_ZERO_BLOCKS)
(WARN_VERBOSE_WARNINGS, WARN_ALL, EXCL_DEFAULT, EXCL_RECURSIVE)
(EXCL_NON_RECURSIVE): Now enum constants rather than macros.
(time_option_initialized, isfound, wasfound, warning_enabled):
Now functions rather than macros TIME_OPTION_INITIALIZED, ISFOUND,
WASFOUND, WARNING_ENABLED.  All uses changed.
(OLDER_STAT_TIME, OLDER_TAR_STAT_TIME, EXTRACT_OVER_PIPE)
(TAR_ARGS_INITIALIZER): Remove.  All uses replaced with their
definiens or equivalent.
2024-08-19 09:57:13 -07:00
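The reliability point in this message (functions evaluate their arguments exactly once) can be illustrated with a small sketch. The names IS_SMALL_MACRO, is_small, next_value and the EXAMPLE_FLAG_* constants are hypothetical; none of them is taken from common.h.

```c
#include <stdbool.h>

/* Hypothetical macro in the old style: its argument appears twice,
   so an argument with side effects is evaluated twice.  */
#define IS_SMALL_MACRO(x) (0 <= (x) && (x) < 10)

/* Function replacement: the argument is evaluated exactly once,
   and a debugger can set a breakpoint on the function.  */
static inline bool
is_small (int x)
{
  return 0 <= x && x < 10;
}

/* Enum constants are visible to debuggers, unlike #define'd numbers.
   These names are illustrative only.  */
enum { EXAMPLE_FLAG_READ = 1 << 0, EXAMPLE_FLAG_WRITE = 1 << 1 };

static int calls;

/* Helper with a visible side effect: it counts how often it is called.  */
static int
next_value (void)
{
  return calls++;
}

/* Number of times next_value runs when passed through the macro.  */
int
calls_made_by_macro (void)
{
  calls = 0;
  bool ignored = IS_SMALL_MACRO (next_value ());
  (void) ignored;
  return calls;
}

/* Number of times next_value runs when passed through the function.  */
int
calls_made_by_function (void)
{
  calls = 0;
  bool ignored = is_small (next_value ());
  (void) ignored;
  return calls;
}
```

In this sketch the macro version calls next_value twice and the function version calls it once, which is the kind of discrepancy the conversion removes.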
Paul Eggert
005e345c04 Fix non-ASCII in sparse.c
2024-08-19 09:57:13 -07:00
Paul Eggert
95a5f043c5 Prefer function to COPY_BUF macro
* src/sparse.c (struct ok_n_block_ptr): New type.
(decode_num): Revamp API so that it does the work of both
the old decode_num and the old COPY_BUF.  Always read to the
next newline even if there is a lot of junk in between.
(pax_decode_header): Use the new API.
(COPY_BUF): Remove.
2024-08-19 09:57:13 -07:00
Paul Eggert
f25dd56e83 Prefer function to COPY_STRING macro
* src/sparse.c (struct block_ptr):
New type, to allow a functional style.
(dump_str_nl, floorlog10): New static functions.
(COPY_STRING): Remove.  All uses replaced by dump_str_nl.
(pax_dump_header_1): Use floorlog10 instead of creating a string.
Simplify size calculation.
2024-08-19 09:57:13 -07:00
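As a rough illustration of sizing a decimal number without first formatting it into a string, a helper in the spirit of floorlog10 might look like the following. This is a sketch under assumptions, not the code in src/sparse.c: the name floorlog10_sketch, the argument type, and the treatment of 0 are choices made here.

```c
#include <stdint.h>

/* Return floor (log10 (n)) for n > 0, and 0 for n == 0.
   The number of decimal digits needed to print n is this value plus 1.  */
int
floorlog10_sketch (uintmax_t n)
{
  int f = 0;
  for (; 10 <= n; n /= 10)
    f++;
  return f;
}
```

A caller that needs the printed width of a size or offset can add 1 to the result rather than formatting the number just to measure it.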
Paul Eggert
f1e4947992 Fix string size bound calculation
* src/common.h (UINTMAX_STRSIZE_BOUND):
Fix typo that luckily didn’t break anything.
2024-08-19 09:57:13 -07:00
Paul Eggert
0dfcfa4aa4 maint: switch from ERROR to paxerror etc
Prefer functions like ‘paxerror’ to macros like ‘ERROR’.
The functions have cleaner semantics, and calls are
easier to read.
2024-08-19 09:57:13 -07:00
Paul Eggert
e9c16628f0 build: update gnulib and paxutils submodules to latest
2024-08-19 09:57:13 -07:00
Paul Eggert
a0a1243c69 Adjust to verror change for program name
* configure.ac (ENABLE_ERROR_PRINT_PROGNAME):
Adjust to match new Gnulib behavior.
2024-08-15 10:27:47 -07:00
Paul Eggert
812a49419a build: update gnulib and paxutils submodules to latest
2024-08-14 23:25:46 -07:00
Paul Eggert
541f3bc374 Fix duplicate write_error_details decl
* src/common.h (write_error_details): Remove decl
that belongs to paxutils.
2024-08-14 23:25:46 -07:00
Paul Eggert
6bc4c4bf96 Fix minor diagnostic discrepancies in incrementals
* src/incremen.c (read_directory_file, get_gnu_dumpdir):
Be more consistent about fatal errors.
2024-08-14 23:25:46 -07:00
Paul Eggert
ab7a14bd92 Add verror module
* gnulib.modules: Add verror; paxlib will need this.
2024-08-14 23:25:46 -07:00
Paul Eggert
b596676c78 Use idx_t for write_fatal_details size
* src/buffer.c (write_fatal_details): Prefer idx_t to size_t
for object size.
2024-08-14 23:25:46 -07:00
Paul Eggert
15c6010c32 Use intmax_t for read_incr_db_01 line numbers
* src/incremen.c (read_incr_db_01): Don’t assume line numbers
are less than LONG_MAX.
2024-08-14 23:25:46 -07:00
Paul Eggert
43231ae554 Avoid need for base64_init and extra table
Simplify the code by assuming C99 initializers.
* src/list.c (base_64_digits): Remove.
(base64_map): Now a constant.  Each entry is now (old value + 1) % 65,
as that’s the only easy portable way to initialize it statically
(even on platforms where CHAR_BIT != 8); all uses changed.
(base64_init): Remove; only use removed.
(from_header): Adjust to new values in base64_map.

* src/list.c (base_64_digits): Remove; no longer needed.
(base64_map): Now const, initialized statically, and with
invalid entries being 0 not 64, and with valid entries
being 1 greater than before.
2024-08-14 23:25:46 -07:00
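The encoding described here (invalid entries 0, valid entries one greater than the digit value) can be sketched with C99 designated initializers. The names base64_map_sketch and base64_digit_sketch are hypothetical, and the table assumes the usual base-64 alphabet; the real table in src/list.c may be laid out differently.

```c
#include <limits.h>

/* Decoding table built with C99 designated initializers.
   Entries not listed are implicitly 0, meaning "not a base-64 digit";
   a valid digit with value V is stored as V + 1.  */
static char const base64_map_sketch[UCHAR_MAX + 1] = {
  ['A'] = 1,  ['B'] = 2,  ['C'] = 3,  ['D'] = 4,  ['E'] = 5,  ['F'] = 6,
  ['G'] = 7,  ['H'] = 8,  ['I'] = 9,  ['J'] = 10, ['K'] = 11, ['L'] = 12,
  ['M'] = 13, ['N'] = 14, ['O'] = 15, ['P'] = 16, ['Q'] = 17, ['R'] = 18,
  ['S'] = 19, ['T'] = 20, ['U'] = 21, ['V'] = 22, ['W'] = 23, ['X'] = 24,
  ['Y'] = 25, ['Z'] = 26,
  ['a'] = 27, ['b'] = 28, ['c'] = 29, ['d'] = 30, ['e'] = 31, ['f'] = 32,
  ['g'] = 33, ['h'] = 34, ['i'] = 35, ['j'] = 36, ['k'] = 37, ['l'] = 38,
  ['m'] = 39, ['n'] = 40, ['o'] = 41, ['p'] = 42, ['q'] = 43, ['r'] = 44,
  ['s'] = 45, ['t'] = 46, ['u'] = 47, ['v'] = 48, ['w'] = 49, ['x'] = 50,
  ['y'] = 51, ['z'] = 52,
  ['0'] = 53, ['1'] = 54, ['2'] = 55, ['3'] = 56, ['4'] = 57, ['5'] = 58,
  ['6'] = 59, ['7'] = 60, ['8'] = 61, ['9'] = 62, ['+'] = 63, ['/'] = 64
};

/* Return the value (0..63) of base-64 digit C, or -1 if C is not one.  */
int
base64_digit_sketch (unsigned char c)
{
  return base64_map_sketch[c] - 1;
}
```

Because unlisted entries default to 0, no run-time initialization function is needed, which matches the removal of base64_init described above.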
Paul Eggert
b201a37421 Remove cast from from_header
* src/list.c (from_header): Reword to avoid a cast
to unsigned char.
2024-08-14 23:25:46 -07:00
Paul Eggert
c9a3abcbe7 Prefer signed to unsigned when decoding options
* src/tar.c (assert_format, decode_options):
Prefer signed to unsigned integers.
(optloc_save): Prefer enum to unsigned integer.
Simplify allocation.
(decode_options): No need to call ngettext for a value known
to be plenty large.
2024-08-14 23:25:46 -07:00
Paul Eggert
18dadeffc0 Don’t assume pid fits in unsigned long
* src/system.c (sys_wait_command): Convert pid_t to intmax_t,
not to unsigned long.
2024-08-14 23:25:46 -07:00
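A minimal sketch of the idea, assuming a POSIX host: widen the pid_t to intmax_t and print it with %jd, so nothing depends on pid_t fitting in unsigned long. The function format_pid_sketch is hypothetical and is not the code in sys_wait_command.

```c
#include <inttypes.h>
#include <stdio.h>
#include <sys/types.h>

/* Format PID into BUF (of size BUFSIZE) as a decimal number.
   Return what snprintf returns: the length the full text needs.  */
int
format_pid_sketch (char *buf, size_t bufsize, pid_t pid)
{
  intmax_t p = pid;   /* widening conversion; no cast needed */
  return snprintf (buf, bufsize, "%jd", p);
}
```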
Paul Eggert
1521d3dae0 Avoid casts in tar_checksum
* src/list.c (tar_checksum, from_header):
Recode to avoid casts.
2024-08-14 23:25:46 -07:00
Paul Eggert
5ab90d6c96 Support >UINT_MAX lines in map files
* src/map.c (parse_id, map_read): Prefer intmax_t to unsigned
for line numbers.
2024-08-14 23:25:46 -07:00
Paul Eggert
e137c14285 Prefer signed integer in struct directory
* src/incremen.c (struct directory):
Prefer int to unsigned where either will do.
2024-08-14 23:25:46 -07:00
Paul Eggert
95ebde4303 Simplify make_directory via xizalloc
* src/incremen.c (make_directory): Simplify by using
xizalloc instead of explicit initialization.
2024-08-14 23:25:46 -07:00
Paul Eggert
ef290cb171 Use idx_t, not size_t, for xattr value lengths.
* src/tar.h (struct xattr_map):
* src/xattrs.c (xattr_map_add): Prefer idx_t to size_t.  All uses
changed.
2024-08-14 23:25:46 -07:00
Paul Eggert
09aec02e32 Use intmax_t, not size_t, for input line numbers
This works better on platforms where SIZE_MAX < OFF_MAX.
* src/common.h (struct common locus):
* src/names.c (struct name_elt):
Use intmax_t for line numbers.  All uses changed.
2024-08-14 23:25:46 -07:00
Paul Eggert
9b69d17e24 In short_read, use %td not %lu
* src/buffer.c (short_read): Don’t assume sizes fit
in unsigned long.
2024-08-14 23:25:46 -07:00
Paul Eggert
b3992e4ef8 Prefer signed types in blocking_read etc
* src/compare.c (process_noop, process_rawdata):
Return bool, not int.
* src/compare.c (process_noop, process_rawdata):
* src/create.c (dump_regular_file):
* src/extract.c (extract_file):
* src/misc.c (blocking_read, blocking_write):
* src/sparse.c (sparse_scan_file_raw, sparse_extract_region):
Prefer signed types like idx_t to unsigned ones like size_t.
(sparse_scan_file_raw): Diagnose read errors.
2024-08-14 23:25:46 -07:00
Paul Eggert
88c2aa1616 Fix minor integer overflow in xsparse.c
* scripts/xsparse.c (read_xheader):
Don’t assume size_t fits in unsigned.
Make the version numbers off_t, not just unsigned.
2024-08-14 23:25:46 -07:00
Paul Eggert
d1e72a536f Prefer stoint to strtoul and variants
When parsing numbers prefer using strtosysint (renamed stoint)
to using strtoul and its variants.
This is simpler and faster and likely more reliable than
relying on quirks of the system strtoul etc,
and it standardizes how tar deals with parsing integers.
Among other things, the C standard and POSIX don’t specify
what strtol does to errno when conversions cannot be performed,
and they require strtoul to accept "-" before unsigned numbers.
* gnulib.modules (strtoimax, strtol, strtoumax, xstrtoimax):
Remove.
* src/checkpoint.c (checkpoint_compile_action, getwidth)
(format_checkpoint_string):
* src/incremen.c (read_incr_db_01, read_num):
* src/map.c (parse_id):
* src/misc.c (decode_timespec):
* src/sparse.c (decode_num):
* src/tar.c (parse_owner_group, parse_opt):
* src/transform.c (parse_transform_expr):
* src/xheader.c (decode_record, decode_signed_num)
(sparse_map_decoder):
Prefer stoint to strtol etc.
Don’t rely on errno == EINVAL as the standards don’t guarantee it.
* src/checkpoint.c (getwidth, format_checkpoint_string):
Check for invalid string suffix.
* src/checkpoint.c (getwidth):
Return intmax_t, not long.  All callers changed.
* src/incremen.c (read_directory_file):
It’s just a one-digit number, so just subtract '0'.
* src/map.c (parse_id): Return bool not int.  All callers changed.
* src/misc.c (stoint): Rename from strtosysint, and add
a bool * argument for reporting overflow.  All callers changed.
(decode_timespec): Simplify by using ckd_sub rather than
checking for overflow by hand.
* src/tar.c (incremental_level): Now signed char to
emphasize that it can be only -1, 0, 1.  All uses changed.
* src/xheader.c (decode_record): Avoid giant diagnostics.
2024-08-14 23:25:46 -07:00
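The strtoul quirk mentioned in this message, and the idea of reporting overflow through a bool * argument, can be sketched as follows. The function strict_parse_sketch is hypothetical and far simpler than tar's stoint, whose actual signature and behavior are not shown in this log; lenient_parse only wraps the standard strtoul.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* strtoul accepts a leading "-" and negates in the unsigned type,
   so "-1" quietly becomes ULONG_MAX.  */
unsigned long
lenient_parse (char const *s)
{
  return strtoul (s, NULL, 10);
}

/* Hypothetical strict parser: accept decimal digits only.
   Return true if S is a nonempty digit string with no trailing junk.
   On overflow set *OVERFLOW and saturate *RESULT at INTMAX_MAX.
   Never consult errno.  */
bool
strict_parse_sketch (char const *s, intmax_t *result, bool *overflow)
{
  intmax_t n = 0;
  *overflow = false;
  if (! ('0' <= *s && *s <= '9'))
    return false;
  for (; '0' <= *s && *s <= '9'; s++)
    {
      int digit = *s - '0';
      if ((INTMAX_MAX - digit) / 10 < n)
        {
          /* 10 * n + digit would exceed INTMAX_MAX.  */
          *overflow = true;
          n = INTMAX_MAX;
        }
      else
        n = 10 * n + digit;
    }
  *result = n;
  return *s == '\0';
}
```

The strict version rejects "-1" and "12x" outright and flags an out-of-range value through its overflow argument rather than relying on how a particular C library sets errno.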
Paul Eggert
3ffe2eb073 Handle enormous record sizes better
Formerly the code could misbehave when the user specified a record
size greater than min (INT_MAX * 512 + 511, PTRDIFF_MAX, SSIZE_MAX).
* src/delete.c (new_blocks, delete_archive_members):
* src/system.c (sys_exec_info_script):
* src/tar.c (blocking_factor, record_size):
Don’t limit blocking factor to INT_MAX.
Prefer signed type for record_size.
Do not exceed IDX_MAX or SSIZE_MAX for record_size;
the SSIZE_MAX limit is needed so that ‘read’ and ‘write’
calls behave sensibly.
2024-08-14 23:25:46 -07:00
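The limit described above can be sketched as a checked multiplication. This is an illustration under assumptions, not the code in src/tar.c: record_size_sketch and BLOCKSIZE_SKETCH are hypothetical names, ptrdiff_t stands in for Gnulib's idx_t (so PTRDIFF_MAX plays the role of IDX_MAX), and the SSIZE_MAX fallback is only for hosts whose <limits.h> does not define it.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#ifndef SSIZE_MAX
# define SSIZE_MAX PTRDIFF_MAX   /* fallback assumption for non-POSIX hosts */
#endif

enum { BLOCKSIZE_SKETCH = 512 };

/* Compute the record size for BLOCKING_FACTOR blocks of 512 bytes.
   Return false, leaving *RECORD_SIZE alone, if the factor is not
   positive or the product would exceed min (PTRDIFF_MAX, SSIZE_MAX).  */
bool
record_size_sketch (intmax_t blocking_factor, ptrdiff_t *record_size)
{
  intmax_t limit = PTRDIFF_MAX < SSIZE_MAX ? PTRDIFF_MAX : SSIZE_MAX;
  if (blocking_factor <= 0 || limit / BLOCKSIZE_SKETCH < blocking_factor)
    return false;
  *record_size = blocking_factor * BLOCKSIZE_SKETCH;
  return true;
}
```

Dividing the limit by the block size before comparing avoids performing a multiplication that could itself overflow.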
Paul Eggert
eb9bb9bf80 Default to GNU/Linux dev_t etc
* configure.ac (dev_t, ino_t, major_t, minor_t):
Default to GNU/Linux types.  This shouldn’t affect behavior;
it’s just a cleanup.
2024-08-14 23:25:45 -07:00
Paul Eggert
4642cd04ed Avoid strtoul
This is part of the general trend to prefer signed integer types,
to allow better runtime checking with -fsanitize=undefined etc.
* gnulib.modules: Remove strtoul.  Add xstrtoimax.
* src/checkpoint.c (checkpoint, format_checkpoint_string):
* src/system.c (sys_exec_checkpoint_script):
* src/tar.c (checkpoint_option):
Use intmax_t, not unsigned, for checkpoint numbers.
All uses changed.
* src/checkpoint.c (checkpoint_compile_action): Don’t assume
time_t == unsigned long.  Treat overflows as TYPE_MAXIMUM (time_t),
essentially infinity.
* src/tar.c (tar_sparse_major, tar_sparse_minor):
* src/tar.h (struct tar_stat_info):
Use intmax_t, not unsigned, for sparse major and minor.
All uses changed.
* src/tar.c (parse_opt):
Don’t mishandle multiple specifications of sparse major and minor.
* src/transform.c (struct transform):
Use idx_t, not unsigned, for match_number.  All uses changed.
(parse_transform_expr): Don’t mishandle large match numbers
by wrapping them around.
2024-08-14 23:25:45 -07:00
Paul Eggert
a80f364662 Avoid snprintf
* gnulib.modules: Remove snprintf.
* lib/wordsplit.c (wordsplit_pathexpand):
Do not arbitrarily truncate diagnostic.
(wordsplit_c_quote_copy): Rewrite to avoid the need to
invoke snprintf on a temporary buffer.
2024-08-04 01:41:43 -07:00
Paul Eggert
5316938142 Avoid wordsplit quadratic behavior
* lib/wordsplit.c (wsplt_assign_var):
Avoid unlikely overflow when adding wsp->ws_envidx + n.
Avoid quadratic behavior when growing ws_envbuf.
2024-08-04 01:41:43 -07:00
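One common way to avoid quadratic cost when a buffer grows one element at a time is to enlarge it geometrically and to check the size arithmetic before allocating. The sketch below shows that general technique; struct ptr_buf_sketch and ptr_buf_append are hypothetical, and the log does not say which growth policy wsplt_assign_var actually uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Growable array of string pointers.  Start with all members zero.  */
struct ptr_buf_sketch
{
  char const **items;
  size_t used;
  size_t alloc;
};

/* Append ITEM to B, growing the array by about 50% when it is full.
   Return false, leaving B unchanged, on overflow or allocation failure.  */
bool
ptr_buf_append (struct ptr_buf_sketch *b, char const *item)
{
  if (b->used == b->alloc)
    {
      size_t max_items = SIZE_MAX / sizeof *b->items;
      size_t half = b->alloc / 2;
      /* Refuse if alloc + half + 1 would exceed max_items.  */
      if (max_items - b->alloc <= half)
        return false;
      size_t n = b->alloc + half + 1;
      char const **p = realloc (b->items, n * sizeof *p);
      if (!p)
        return false;
      b->items = p;
      b->alloc = n;
    }
  b->items[b->used++] = item;
  return true;
}
```

With geometric growth the number of reallocations is logarithmic in the final size, so the total copying work stays linear instead of quadratic.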
Paul Eggert
83926613a4 Prefer ialloc for wordsplit
* lib/wordsplit.c (alloc_space, wsplt_assign_var, expvar)
(wordsplit_tildexpand, wordsplit_pathexpand)
(wordsplit_get_words): Use ialloc API on idx_t args.
2024-08-04 01:41:43 -07:00
Paul Eggert
9a2344b183 Omit wordsplit API that tar doesn’t need
* lib/wordsplit.c: Include <attribute.h> here, not in wordsplit.h.
(WRDSO_ESC_SET, WRDSO_ESC_TEST): Move here from wordsplit.h.
(WORDSPLIT_EXTRAS_extern): New macro.  Used by functions
that tar doesn’t need to be exposed.
(wordsplit_append, wordsplit_c_quoted_length, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy, wordsplit_get_words, wordsplit_perror):
Omit unless _WORDSPLIT_EXTRAS.
(WORDSPLIT_ENV_INIT): Move here from wordsplit.h, and
make it a constant rather than a macro.
(wordsplit_strerror): Arg is now pointer to const.
* lib/wordsplit.h: Do not include attribute.h, so that library
users need not worry about attribute.h.
(wordsplit_t): Declare only if _WORDSPLIT_EXTRAS.  Similarly for
functions that are not exported to tar.
2024-08-04 01:41:43 -07:00
Paul Eggert
5182462cf1 wordsplit_get_words need not fail
* lib/wordsplit.c (wordsplit_get_words):
Do not fail merely because realloc fails.
Return void, since failure is no longer possible.
2024-08-04 01:41:43 -07:00
Paul Eggert
0ab451a420 More wordsplit int cleanup
* lib/wordsplit.c: Include limits.h.
(_wsplt_subsplit, wordsplit_add_segm, wsnode_quoteremoval)
(wsnode_coalesce, wsnode_tail_coalesce, find_closing_paren)
(expvar, begin_var_p, node_expand, begin_cmd_p, expcmd)
(scan_qstring, scan_word, wordsplit_c_quoted_length)
(wordsplit_string_unquote_copy, wordsplit_c_quote_copy)
(exptab_matches, wordsplit_process_list):
Prefer bool to int.
(wordsplit_init, alloc_space, coalesce_segment)
(wsnode_quoteremoval, wordsplit_finish, wordsplit_append):
Use WRDSE_OK instead of 0 when the context is that of WRDSE_*.
(wsnode_flagstr, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, node_split_prefix, wsplt_assign_var, expvar)
(expcmd, wordsplit_tildexpand, wordsplit_pathexpand)
(wsplt_unquote_char, wsplt_quote_char)
(wordsplit_string_unquote_copy):
Prefer '\0' to 0 when it is a char.
(wsnode_insert): Omit last arg, which was always 0.
All callers changed.
(wordsplit_add_segm, node_split_prefix):
Use unsigned, not int, for flag, for consistency.
(wordsplit_finish, begin_var_p, begin_cmd_p, skip_sed_expr)
(xtonum, wsplt_unquote_char, wsplt_quote_char)
(wordsplit_c_unquote_char, wordsplit_c_quote_char)
(wordsplit_c_quote_copy):
Prefer char to int for chars.
(xtonum): Don’t treat "\400" as if it were "\000".
2024-08-04 01:41:43 -07:00
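The last item concerns octal escapes whose value does not fit in a char: "\400" is 256, which truncates to 0 when stored in an 8-bit char. A minimal sketch of detecting that case follows. The function octal_escape_sketch is hypothetical, and the log does not show how the real xtonum reports the problem.

```c
#include <limits.h>

/* Parse one to three octal digits at S, as in an escape like "\101".
   Return the character code, or -1 if there are no octal digits or
   the value exceeds UCHAR_MAX (for example "400", which is 256).  */
int
octal_escape_sketch (char const *s)
{
  int value = 0;
  int i;
  for (i = 0; i < 3 && '0' <= s[i] && s[i] <= '7'; i++)
    value = 8 * value + (s[i] - '0');
  return i == 0 || UCHAR_MAX < value ? -1 : value;
}
```

Accumulating into an int and range-checking before narrowing is what keeps an out-of-range escape from silently becoming a NUL byte.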
Paul Eggert
dab2830e38 Diagnose argp overflow
* src/names.c (handle_option):
* src/tar.c (parse_default_options):
Report an error if wordsplitting yields more than INT_MAX words,
rather than misbehaving.  argp_parse can’t handle more than
INT_MAX, unfortunately.
2024-08-04 01:41:43 -07:00
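The check described here amounts to verifying that a wide word count fits in an int before handing it to an int-based API such as argp_parse. A minimal sketch, with the hypothetical name word_count_to_argc and with ptrdiff_t standing in for idx_t:

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Store NWORDS into *ARGC if it fits in a nonnegative int.
   Return false, leaving *ARGC alone, if it does not fit, so the
   caller can report an error instead of passing a wrapped count.  */
bool
word_count_to_argc (ptrdiff_t nwords, int *argc)
{
  if (nwords < 0 || INT_MAX < nwords)
    return false;
  *argc = nwords;
  return true;
}
```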
Paul Eggert
9cef4d5495 Fix unlikely buffer overrun when checkpointing
* src/checkpoint.c (format_checkpoint_string):
Don’t overrun buffer when word splitting.
2024-08-04 01:41:43 -07:00
Paul Eggert
7cda31b1e0 Prefer idx_t to size_t in wordsplit
* gnulib.modules: Add ialloc.
* lib/wordsplit.c: Include ialloc.h.
(PRINTMAX): New constant.
(printflen, printfdots): New functions.
(wordsplit_dump_nodes, expvar, wordsplit_process_list): Use them.
(_wsplt_subsplit, wordsplit_string_unquote_copy):
Don’t limit lengths to INT_MAX.
(wordsplit_run): Remove.  All callers changed to use wordsplit_len.
(wordsplit_perror): Don’t limit lengths to ULONG_MAX.
* lib/wordsplit.c (wordsplit_init, alloc_space, struct wordsplit_node)
(wsnode_len, wordsplit_add_segm, coalesce_segment, wsnode_quoteremoval)
(wordsplit_finish, wordsplit_append, node_split_prefix)
(find_closing_paren, wordsplit_find_env, wsplt_assign_var)
(expvar, node_expand, expcmd, wordsplit_trimws)
(wordsplit_tildexpand, isglob, wordsplit_pathexpand)
(skip_sed_expr, skip_delim_internal, skip_delim)
(skip_delim_real, scan_qstring, scan_word)
(wordsplit_c_quoted_length, wordsplit_process_list)
(wordsplit_len, wordsplit_free_words, wordsplit_get_words):
* lib/wordsplit.h (struct wordsplit):
Prefer idx_t to size_t for indexes.
* lib/wordsplit.h: Include idx.h.
2024-08-04 01:41:43 -07:00
Paul Eggert
cc691f8272 Support >INT_MAX -C dirs
* src/extract.c (struct delayed_set_stat, struct delayed_link):
* src/misc.c (normalize_filename, wd_count, chdir_count)
(chdir_arg, tar_getcdpath):
* src/names.c (name_gather, addname, add_hierarchy_to_namelist):
* src/unlink.c (struct deferred_unlink, flush_deferred_unlinks):
Use idx_t, not int, for directory indexes, so as to not
limit their number to INT_MAX; this is theoretically possible
if -T is used.
* src/names.c (name_next_elt, name_next):
Use bool for boolean.
2024-08-04 01:41:43 -07:00