The scaffolding required to have a working scylla tool app, is considerable, leading to a large amount of boilerplate code in each such app. This logic is also very similar across the two tool apps we have and would presumably be very similar in any future app. This PR extracts this logic into `tools/utils.hh` and introduces `tool_app_template`, which is similar to `seastar::app_template` in that it centralizes all the option handling and more in a single class, that each tool has to just instantiate and then call `run()` to run the app.
This cuts down on the repetition and boilerplate in our current tool apps and make prototyping new tool apps much easier.
Closes#14855
* github.com:scylladb/scylladb:
tools/utils.hh: remove unused headers
tools/utils: make get_selected_operation() and configure_tool_mode() private
tools/utils.hh: de-template get_selected_operation()
tools/scylla-types: migrate to tools_app_template
tools/scylla-types: prepare for migration to tool_app_template
tools/scylla-sstable.cc: fix indentation
tools/scylla-sstables: migrate to tool_app_template
tools/scylla-sstables: prepare for migration to tool_app_template
tools: extract tool app skeleton to utils.hh
This refreshes clang to 16.0.6 and libstdc++ to 13.1.1.
compiler-rt, libasan, and libubsan are added to install-dependencies.sh
since they are no longer pulled in as depdendencies.
Closes#13730
It now has a single user, so it doesn't have to be a template.
For now, make the method inline, so it can stay in the header. It will
be moved to utils.cc in the next patch.
The skeleton of the two existing scylla-native tools (scylla-types and
scylla-sstable) is very similar. By skeleton, I mean all the boilerplate
around creating and configuring a seastar::app_template, representing
operations/command and their options, and presenting and selecting
these.
To facilitate code-sharing and quick development of any new tools,
extract this skeleton from scylla-sstable.cc into tools/utils.hh,
in the form of a new tool_app_template, which wraps a
seastar::app_template and centralizes all the boilerplate logic in a
single place. The extracted code is not a simple copy-paste, although
many elements are simply copied. The original code is not removed yet.
Schema digest is calculated by querying for mutations of all schema
tables, then compacting them so that all tombstones in them are
dropped. However, even if the mutation becomes empty after compaction,
we still feed its partition key. If the same mutations were compacted
prior to the query, because the tombstones expire, we won't get any
mutation at all and won't feed the partition key. So schema digest
will change once an empty partition of some schema table is compacted
away.
Tombstones expire 7 days after schema change which introduces them. If
one of the nodes is restarted after that, it will compute a different
table schema digest on boot. This may cause performance problems. When
sending a request from coordinator to replica, the replica needs
schema_ptr of exact schema version request by the coordinator. If it
doesn't know that version, it will request it from the coordinator and
perform a full schema merge. This adds latency to every such request.
Schema versions which are not referenced are currently kept in cache
for only 1 second, so if request flow has low-enough rate, this
situation results in perpetual schema pulls.
After ae8d2a550d (5.2.0), it is more liekly to
run into this situation, because table creation generates tombstones
for all schema tables relevant to the table, even the ones which
will be otherwise empty for the new table (e.g. computed_columns).
This change inroduces a cluster feature which when enabled will change
digest calculation to be insensitive to expiry by ignoring empty
partitions in digest calculation. When the feature is enabled,
schema_ptrs are reloaded so that the window of discrepancy during
transition is short and no rolling restart is required.
A similar problem was fixed for per-node digest calculation in
c2ba94dc39e4add9db213751295fb17b95e6b962. Per-table digest calculation
was not fixed at that time because we didn't persist enabled features
and they were not enabled early-enough on boot for us to depend on
them in digest calculation. Now they are enabled before non-system
tables are loaded so digest calculation can rely on cluster features.
Fixes#4485.
Manually tested using ccm on cluster upgrade scenarios and node restarts.
Closes#14441
* github.com:scylladb/scylladb:
test: schema_change_test: Verify digests also with TABLE_DIGEST_INSENSITIVE_TO_EXPIRY enabled
schema_mutations, migration_manager: Ignore empty partitions in per-table digest
migration_manager, schema_tables: Implement migration_manager::reload_schema()
schema_tables: Avoid crashing when table selector has only one kind of tables
this series enables CMake to build submodules. it helps developers to build, for instance, the java tools on demand.
Closes#14751
* github.com:scylladb/scylladb:
build: cmake: build submodules
build: cmake: generate version files with add_custom_command()
Soon, we will want to convert mutation fragments into json inside the
scylla codebase, not just in tools. To avoid scylla-core code having to
include tools/ (and link against it), move the low-level json utilities
into mutation/.
The values of cells are potentially very large and thus, when presenting
row content as json in SELECT * FROM MUTATION_FRAGMENTS($table) queries,
we want to separate metadata and cell values into separate columns, so
users can opt out from the potentially big values being included too.
To support this use-case, write(row) and its downstream write methods
get a new `include_value` flag, which defaults to true. When set to
false, cell values will not be included in the json output. At the same
time, new methods are added to convert only cell values of a row to
json.
1) mutation_partition_json_writer - containing all the low level
utilities for converting sub-fragment level mutation components (such
as rows, tombstones, etc.) and their components into json;
2) mutation_fragment_stream_json_writer - containing all the high level
logic for converting mutation fragment streams to json;
The latter using the former behind the scenes. The goal is to enable
reuse of converting mutation-fragments into json, without being forced
to work around differences in how the mutation fragments are reprenented
in json, on the higher level.
this mirrors what we have in the `build.ninja` generated by
`configure.py`. with this change, we can build for instance,
`dist-tool-tar` from the `build.ninja` generated by CMake.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This is the first phase of providing strong exception safety guarantees by the generic `compaction_backlog_tracker::replace_sstables`.
Once all compaction strategies backlog trackers' replace_sstables provide strong exception safety guarantees (i.e. they may throw an exception but must revert on error any intermediate changes they made to restore the tracker to the pre-update state).
Once this series is merged and ICS replace_sstables is also made strongly exception safe (using infrastructure from size_tiered_backlog_tracker introduced here), `compaction_backlog_tracker::replace_sstables` may allow exceptions to propagate back to the caller rather than disabling the backlog tracker on errors.
Closes#14104
* github.com:scylladb/scylladb:
leveled_compaction_backlog_tracker: replace_sstables: provide strong exception safety guarantees
time_window_backlog_tracker: replace_sstables: provide strong exception safety guarantees
size_tiered_backlog_tracker: replace_sstables: provide strong exception safety guarantees
size_tiered_backlog_tracker: provide static calculate_sstables_backlog_contribution
size_tiered_backlog_tracker: make log4 helper static
size_tiered_backlog_tracker: define struct sstables_backlog_contribution
size_tiered_backlog_tracker: update_sstables: update total_bytes only if set changed
compaction_backlog_tracker: replace_sstables: pass old and new sstables vectors by ref
compaction_backlog_tracker: replace_sstables: add FIXME comments about strong exception safety
Schema digest is calculated by querying for mutations of all schema
tables, then compacting them so that all tombstones in them are
dropped. However, even if the mutation becomes empty after compaction,
we still feed its partition key. If the same mutations were compacted
prior to the query, because the tombstones expire, we won't get any
mutation at all and won't feed the partition key. So schema digest
will change once an empty partition of some schema table is compacted
away.
Tombstones expire 7 days after schema change which introduces them. If
one of the nodes is restarted after that, it will compute a different
table schema digest on boot. This may cause performance problems. When
sending a request from coordinator to replica, the replica needs
schema_ptr of exact schema version request by the coordinator. If it
doesn't know that version, it will request it from the coordinator and
perform a full schema merge. This adds latency to every such request.
Schema versions which are not referenced are currently kept in cache
for only 1 second, so if request flow has low-enough rate, this
situation results in perpetual schema pulls.
After ae8d2a550d, it is more liekly to
run into this situation, because table creation generates tombstones
for all schema tables relevant to the table, even the ones which
will be otherwise empty for the new table (e.g. computed_columns).
This change inroduces a cluster feature which when enabled will change
digest calculation to be insensitive to expiry by ignoring empty
partitions in digest calculation. When the feature is enabled,
schema_ptrs are reloaded so that the window of discrepancy during
transition is short and no rolling restart is required.
A similar problem was fixed for per-node digest calculation in
18f484cc753d17d1e3658bcb5c73ed8f319d32e8. Per-table digest calculation
was not fixed at that time because we didn't persist enabled features
and they were not enabled early-enough on boot for us to depend on
them in digest calculation. Now they are enabled before non-system
tables are loaded so digest calculation can rely on cluster features.
Fixes#4485.
db::config is pretty large (~32k) and there are four of them, blowing the stack. Fix by
allocating them on the heap.
It's not clear why this shows up on my system (clang 16) and not in the frozen toolchain.
Perhaps clang 16 is less able to reuse stack space.
Closes#14464
schema::get_sharder() does not use the correct sharder for
tablet-based tables. Code which is supposed to work with all kinds of
tables should obtain the sharder from erm::get_sharder().
Exposing scrub compaction to the command-line. Allows for offline scrub of sstables, in cases where online scrubbing (via scylla itself) is not possible or not desired. One such case recently was an sstable from a backup which turned out to be corrupt, `nodetool refresh --load-and-stream` refusing to load it.
Fixes: #14203Closes#14260
* github.com:scylladb/scylladb:
docs/operating-scylla/admin-tools: scylla-sstable: document scrub operation
test/cql-pytest: test_tools.py: add test for scylla sstable scrub
tools/scylla-sstable: add scrub operation
tools/scylla-sstable: write operation: add none to valid validation levels
tools/scylla-sstable: handle errors thrown by the operation
test/cql-pytest: add option to omit scylla's output from the test output
tools/scylla-sstable: s/option/operation_option/
tool/scylla-sstable: add missing comments
Exposing scrub compaction to the command-line. Scrubbed sstables are
written into a directory specified by the `--output-directory` command
line parameter. This directory is expected to be empty, to avoid
clashes with any pre-existing sstables. This can be overriden by the
user if they wish.
Instead of letting the runtime catch them. Also, make sure all exception
throw due to bad arguments are instances of `std::invalid_argument`,
these are now reported differently from other, runtime errors.
Remove the now extraneous `error:` prefix from all exception messages.
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).
So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command
The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields
Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)
Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile
The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#13963
* tools/cqlsh 8769c4c2...6e1000f1 (5):
> build: erase uid/gid information from tar archives
> Add github action to update the dockerhub description
> cqlsh: Add extension handler for "scylla_encryption_options"
> requirements.txt: update python-driver==3.26.0
> Add support for arm64 docker image
Closes#13878
CWG 2631 (https://cplusplus.github.io/CWG/issues/2631.html) reports
an issue on how the default argument is evaluated. this problem is
more obvious when it comes to how `std::source_location::current()`
is evaluated as a default argument. but not all compilers have the
same behavior, see https://godbolt.org/z/PK865KdG4.
notebaly, clang-15 evaluates the default argument at the callee
site. so we need to check the capability of compiler and fall back
to the one defined by util/source_location-compat.hh if the compiler
suffers from CWG 2631. and clang-16 implemented CWG2631 in
https://reviews.llvm.org/D136554. But unfortunately, this change
was not backported to clang-15.
before switching over to clang-16, for using std::source_location::current()
as the default parameter and expect the behavior defined by CWG2631,
we have to use the compatible layer provided by Seastar. otherwise
we always end up having the source_location at the callee side, which
is not interesting under most circumstances.
so in this change, all places using the idiom of passing
std::source_location::current() as the default parameter are changed
to use seastar::compat::source_location::current(). despite that
we have `#include "seastarx.h"` for opening the seastar namespace,
to disambiguate the "namespace compat" defined somewhere in scylladb,
the fully qualified name of
`seastar::compat::source_location::current()` is used.
see also 09a3c63345, where we used
std::source_location as an alias of std::experimental::source_location
if it was available. but this does not apply to the settings of our
current toolchain, where we have GCC-12 and Clang-15.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#14086
There are few places in sstables/ code that require caller to specify priority class to pass it along to file stream options. All these callers use default class, so it makes little sense to keep it. This change makes the sched classes unification mega patch a bit smaller.
ref: #13963Closes#13996
* github.com:scylladb/scylladb:
sstables: Remove default prio class from rewrite_statistics()
sstables: Remove prio class from validate_checksums subs
sstables: Remove always default io-prio from validate_checksums()
All calls to sstables::validate_checksums() happen with explicitly
default priority class. Just hard-code it as such in the method
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Most of creators of index_reader construct it with default prio class,
null trace pointer and use_caching::yes. Assigning implicit defaults to
constructor arguments keeps the code shorter and easier to read.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
sstables_manager::get_component_lister() is used by sstable_directory.
and almost all the "ingredients" used to create a component lister
are located in sstable_directory. among the other things, the two
implementations of `components_lister` are located right in
`sstable_directory`. there is no need to outsource this to
sstables_manager just for accessing the system_keyspace, which is
already exposed as a public function of `sstables_manager`. so let's
move this helper into sstable_directory as a member function.
with this change, we can even go further by moving the
`components_lister` implementations into the same .cc file.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#13853
Currently the validate command uses the logger to output the result of
validation. This is inconsistent with other commands which all write
their output to stdout and log any additional information/errors to
stderr. This patch updates the validate command to do the same. While at
it, remove the "Validating..." message, it is not useful.