Handle abort_requested_exception exactly like
sleep_aborted, as an expected error when startup
is aborted mid-way.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#15443
Some time ago populating of tables from sstables was reworked to use sstable states instead of full paths (#12707). Since then few places in the populator was left that still operate on the state-based subdirectory name. This PR collects most of those dangling ends
refs: #13020Closesscylladb/scylladb#15421
* github.com:scylladb/scylladb:
distributed_loader: Print sstable state explicitly
distributed_loader: Move check for the missing dir upper
distributed_loader: Use state as _sstable_directories key
when compiling the tests with -Wsign-compare, the compiler complains like:
```
/home/kefu/.local/bin/clang++ -DBOOST_ALL_DYN_LINK -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TESTING_MAIN -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/cmake/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -isystem /home/kefu/dev/scylladb/build/cmake/rust -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o -MF test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o.d -o test/boost/CMakeFiles/tablets_test.dir/tablets_test.cc.o -c /home/kefu/dev/scylladb/test/boost/tablets_test.cc
/home/kefu/dev/scylladb/test/boost/tablets_test.cc:1335:53: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
for (int log2_tablets = 0; log2_tablets < tablet_count_bits; ++log2_tablets) {
~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~
```
in this case, it should be safe to use an signed int as the loop
variable to be compared with `tablet_count_bits`, but let's just
appease the compiler so we can enable the warning option project-wide
to prevent any potential issues caused by signed-unsigned comparision.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15449
We add garbage collection for the `CDC_GENERATIONS_V3` table to prevent
it from endlessly growing. This mechanism is especially needed because
we send the entire contents of `CDC_GENERATIONS_V3` as a part of the
group 0 snapshot.
The solution is to keep a clean-up candidate, which is one of the
already published CDC generations. The CDC generation publisher
introduced in #15281 continually uses this candidate to remove all
generations with timestamps not exceeding the candidate's and sets a new
candidate when needed.
We also add `test_cdc_generation_clearing.py` that verifies this new
mechanism.
Fixes#15323Closesscylladb/scylladb#15413
* github.com:scylladb/scylladb:
test: add test_cdc_generation_clearing
raft topology: remove obsolete CDC generations
raft topology: set CDC generation clean-up candidate
topology_coordinator: refactor publish_oldest_cdc_generation
system_keyspace: introduce decode_cdc_generation_id
system_keyspace: add cleanup_candidate to CDC_GENERATIONS_V3
in other words, do not create bpo::value unless transfer it to an
option_description.
`boost::program_options::value()` create a new typed_value<T> object,
without holding it with a shared_ptr. boost::program_options expects
developer to construct a `bpo::option_description` right away from it.
and `boost::program_options::option_description` takes the ownership
of the `type_value<T>*` raw pointer, and manages its life cycle with
a shared_ptr. but before passing it to a `bpo::option_description`,
the pointer created by `boost::program_options::value()` is a still
a raw pointer.
before this change, we initialize `operations_with_func` as global
variables using `boost::program_options::value()`. but unfortunately,
we don't always initialize a `bpo::option_description` from it --
we only do this on demand when the corresponding subcommand is
called.
so, if the corresponding subcommand is not called, the created
`typed_value<T>` objects are leaked. hence LeakSanitizer warns us.
after this change, we create the option map as a static
local variable in a function so it is created on demand as well.
as an alternative, we could initialize the options map as local
variable where it used. but to be more consistent with how
`global_option` is specified. and to colocate them in a single
place, let's keep the existing code layout.
this change is quite similar to 374bed8c3d
Fixes https://github.com/scylladb/scylladb/issues/15429Closesscylladb/scylladb#15430
* github.com:scylladb/scylladb:
tools/scylla-nodetools: reindent
tools/scylla-nodetools: do not create unowned bpo::value
column_stats::update_local_deletion_time() is not used anywhere,
what is being used is
`column_stats::update_local_deletion_time_and_tombstone_histogram(time_point)`.
while `update_local_deletion_time_and_tombstone_histogram(int32_t)`
is only used internally by a single caller.
neither is `column_stats::update(const deletion_time&)` used.
so let's drop them. and merge
`update_local_deletion_time_and_tombstone_histogram(int32_t)`
into its caller.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15189
in this series, we do not assume the existence of "build" build directory. and prefer using the version files located under the directory specified with the `--build-dir` option.
Refs #15241Closesscylladb/scylladb#15402
* github.com:scylladb/scylladb:
create-relocatable-package.py: prefer $build_dir/SCYLLA-RELEASE-FILE
create-relocatable-package.py: create SCYLLA-RELOCATABLE-FILE with tempfile
in other words, do not create bpo::value unless transfer it to an
option_description.
`boost::program_options::value()` create a new typed_value<T> object,
without holding it with a shared_ptr. boost::program_options expects
developer to construct a `bpo::option_description` right away from it.
and `boost::program_options::option_description` takes the ownership
of the `type_value<T>*` raw pointer, and manages its life cycle with
a shared_ptr. but before passing it to a `bpo::option_description`,
the pointer created by `boost::program_options::value()` is a still
a raw pointer.
before this change, we initialize `operations_with_func` as global
variables using `boost::program_options::value()`. but unfortunately,
we don't always initialize a `bpo::option_description` from it --
we only do this on demand when the corresponding subcommand is
called.
so, if the corresponding subcommand is not called, the created
`typed_value<T>` objects are leaked. hence LeakSanitizer warns us.
after this change, we create the option map as a static
local variable in a function so it is created on demand as well.
as an alternative, we could initialize the options map as local
variable where it used. but to be more consistent with how
`global_option` is specified. and to colocate them in a single
place, let's keep the existing code layout.
this change is quite similar to 374bed8c3dFixes#15429
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
```console
$ journalctl --user start scylla-server -xe
Failed to add match 'start': Invalid argument
```
`journalctl` expects a match filter as its positional arguments.
but apparently, start is not a filter. we could use `--unit`
to specify a unit though, like:
```console
$ journalctl --user --unit scylla-server.service -xe
```
but it would flood the stdout with the logging messages printed
by scylla. this is not what a typical user expects. probably a better
use experience can be achieved using
```console
$ systemctl --user status scylla-server
```
which also print the current status reported by the service, and
the command line arguments. they would be more informative in typical
use cases.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15390
The docker/podman tooling is destructive: it will happily
overwrite images locally and on the server. If a maintainer
forgets to update tools/toolchain/image, this can result
in losing an older toolchain container image.
To prevent that, check that the image name is new.
Closesscylladb/scylladb#15397
update commented out experimental_features to reflect the latest
experimental features:
- in 4f23eec4, "raft" was renamed to "consistent-topology-changes".
- in 2dedb5ea, "alternator-ttl" was moved out of experimental features.
- in 5b1421cc, "broadcast-tables" was added to experimental features.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15407
pull_gitgub_pr.sh adds a "Closes #xyz" tag so github can close
the pull request after next promotion. Convert it to an absolute
refefence (scylladb/scylladb#xyz) so the commit can be cherry-picked
into another repository without the reference dangling.
Closes#15424
this change change `const` to `constexpr`. because the string literal
defined here is not only immutable, but also initialized at
compile-time, and can be used by constexpr expressions and functions.
this change is introduced to reduce the size of the change when moving
to compile-time format string in future. so far, seastar::format() does
not use the compile-time format string, but we have patches pending on
review implementing this. and the author of this change has local
branches implementing the changes on scylla side to support compile-time
format string, which practically replaces most of the `format()` calls
with `seastar::format()`.
without this change, if we use compile-time format check, compiler fails
like:
```
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:276:44: error: call to consteval function 'fmt::basic_format_string<char, const char *const &, seastar::basic_sstring<char, unsigned int, 15>>::basic_format_string<const char *, 0>' is not a constant expression
.description = seastar::format(description_template, app_name, boost::algorithm::join(operations | boost::adaptors::transformed([] (const auto& op) {
^
/usr/include/fmt/core.h:3148:67: note: read of non-constexpr variable 'description_template' is not allowed in a constant expression
FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) {
^
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:276:44: note: in call to 'basic_format_string(description_template)'
.description = seastar::format(description_template, app_name, boost::algorithm::join(operations | boost::adaptors::transformed([] (const auto& op) {
^
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:258:16: note: declared here
const auto description_template =
^
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15432
change the loop variable to `int` to silence warning like
```
/home/kefu/.local/bin/clang++ -DBOOST_NO_CXX98_FUNCTION_BASE -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_BROKEN_SOURCE_LOCATION -DSEASTAR_DEBUG -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/cmake/seastar/gen/include -I/home/kefu/dev/scylladb/build/cmake/gen -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-mismatched-tags -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -Wno-missing-field-initializers -Wno-deprecated-copy -Wno-ignored-qualifiers -march=westmere -Og -g -gz -std=gnu++20 -fvisibility=hidden -U_FORTIFY_SOURCE -DSEASTAR_SSTRING -Wno-error=unused-result "-Wno-error=#warnings" -fstack-clash-protection -fsanitize=address -fsanitize=undefined -fno-sanitize=vptr -MD -MT tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o -MF tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o.d -o tools/CMakeFiles/tools.dir/scylla-nodetool.cc.o -c /home/kefu/dev/scylladb/tools/scylla-nodetool.cc
/home/kefu/dev/scylladb/tools/scylla-nodetool.cc:215:28: error: comparison of integers of different signs: 'unsigned int' and 'int' [-Werror,-Wsign-compare]
for (unsigned i = 0; i < argc; ++i) {
~ ^ ~~~~
```
`i` is used as the index in a plain C-style array, it's perfectly fine
to use a signed integer as index in this case. as per C++ standard,
> The expression E1[E2] is identical (by definition) to *((E1)+(E2))
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15431
in hope to lower the bar to testing object store.
* add language specifier for better readability of the document.
to highlight the config with YAML syntax
* add more specific comment on the AWS related settings
* explain that endpoint should match in the CREATE KEYSPACE
statement and the one defined by the YAML configuration.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15433
Move node_ops related classes to node_ops/ so that they
are consistently grouped and could be access from
many modules.
Closes#15351
* github.com:scylladb/scylladb:
node_ops: extract classes related to node operations
node_ops: repair: move node_ops_id to node_ops directory
This commit adds the information that ScyllaDB Enterprise
supports FIPS-compliant systems in versions
2023.1.1 and later.
The information is excluded from OSS docs with
the "only" directive, because the support was not
added in OSS.
This commit must be backported to branch-5.2 so that
it appears on version 2023.1 in the Enterprise docs.
Closes#15415
We make the CDC generation publisher continually remove the
obsolete CDC generation data to prevent CDC_GENERATIONS_V3 from
endlessly growing. To achieve this, we use the clean-up candidate.
If it exists and can be safely removed, we remove it together with
all older CDC generations. We also mark the lack of a new
candidate. The next published CDC generation will become one.
Note this solution does not have any guarantee about "when"
it removes obsolete generations. Formally, it guarantees that
if there is a candidate that can be removed and the CDC generation
publisher attempts to remove it, all generations up to the
candidate are removed. In practice, when a new generation appears,
the publisher makes a new candidate or tries to remove an old
candidate, so obsolete generations can stay for a long time only
if no generation appears for a long time. But it is fine because
we only want to prevent CDC_GENERATIONS_V3 from growing too much.
Moreover, providing any guarantees would require a new wake-up
mechanism for the publisher, which would be hard to implement.
We want to use the clean-up candidates to remove the obsolete CDC
generation data, but first, we need to set suitable generations as
a candidate when there is no candidate. Since CDC generations must
be published before we remove them, a generation that is being
published is a good candidate.
In the following commits, we add a new task for the CDC generation
publisher -- clearing obsolete CDC generation data. This task
can be done together with the publishing under one group 0 guard.
We refactor publish_oldest_cdc_generation to make it possible.
Now, this function is more like a command builder. It takes guard
by const reference and updates the vector of mutations and the
reason string. The CDC generation publisher uses them directly to
update the topology at the end after finishing building the
command. This logic will be more visible after adding the clearing
task.
Currently, we only log anything about what was tried w.r.t. obtaining
the schema if it failed. Add a log message to the success path too, so
in case the wrong schema was successfully loaded, the user can find the
problem.
The log message is printed with debug-level, so it doesn't distrurb
output by default.
Fixes: #15384Closes#15417
in this series,
- the build of unstripped package is fixed, and
- the targets for building deb and rpm packages are added. these targets builds deb and rpm packages from the unstripped package.
Closes#15403
* github.com:scylladb/scylladb:
build: cmake: add targets for building deb and rpm packages
build: cmake: correct the paths used when building unstripped pkg
cassandra-stress connects to "localhost" by default. that's exactly the
use case when we install scylla using the unified installer. so do not
suggest "-node xxx" option. the "xxx" part is but confusing.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes#15411
This series introduces a scylla-native nodetool. It is invokable via the main scylla executable as the other native tools we have. It uses the seastar's new `http::client` to connect to the specified node and execute the desired commands.
For now a single command is implemented: `nodetool compact`, invokable as `scylla nodetool compact`. Once all the boilerplate is added to create a new tool, implementing a single command is not too bad, in terms of code-bloat. Certainly not as clean as a python implementation would be, but good enough. The advantages of a C++ implementation is that all of us in the core team know C++ and that it is shipped right as part of the scylla executable..
Closes#14841
* github.com:scylladb/scylladb:
test: add nodetool tests
test.py: add ToolTestSuite and ToolTest
tools/scylla-nodetool: implement compact operation
tools/scylla-nodetool: implement basic scylla_rest_api_client
tools: introduce scylla-nodetool
utils: export dns_connection_factory from s3/client.cc to http.hh
utils/s3/client: pass logger to dns_connection_factory in constructor
tools/utils: tool_app_template::run_async(): also detect --help* as --help
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.
Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.
If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be aborted. There is no infrastructure for
aborts yet, so this is not implemented.
Closes#15197
* github.com:scylladb/scylladb:
tablets, raft topology: Add support for decommission with tablets
tablet_allocator: Compute load sketch lazily
tablet_allocator: Set node id correctly
tablet_allocator: Make migration_plan a class
tablets: Implement cleanup step
storage_service, tablets: Prevent stale RPCs from running beyond their stage
locator: Introduce tablet_metadata_guard
locator, replica: Add a way to wait for table's effective_replication_map change
storage_service, tablets: Extract do_tablet_operation() from stream_tablet()
raft topology: Add break in the final case clause
raft topology: Fix SIGSEGV when trace-level logging is enabled
raft topology: Set node state in topology
raft topology: Always set host id in topology
When a column family's schema is changed new compaction
strategy type may be applied.
To make sure that it will behave as expected, compaction
strategy need to contain only the allowed options and values.
Methods throwing exception on invalid options are added.
Fixes: #2336.
Closes#13956
* github.com:scylladb/scylladb:
test: add test for compaction strategy validation
compaction: unify exception messages
compaction: cql3: validate options in check_restricted_table_properties
compaction: validate options used in different compaction strategies
compaction: validate common compaction strategy options
compaction: split compaction_strategy_impl constructor
compaction: validate size_tiered_compaction_strategy specific options
compaction: validate time_window_compaction_strategy specific options
compaction: add method to validate min and max threshold
compaction: split size_tiered_compaction_strategy_options constructor
compaction: make compaction strategy keys static constexpr
compaction: use helpers in validate_* functions
compaction: split time_window_compaction_strategy_options construtor
compaction: add validate method to compaction_strategy_options
time_window_compaction_strategy_options: make copy and move-able
size_tiered_compaction_strategy_options: make copy and move-able
When populating from a particular directory, populator code converts
state to subdir name, then prints the path. The conversion is pretty
much artificial, it's better to provide printer for state and print
state explicitly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The quarantine directory can be missing on the datadir and that's OK. In
order to check that and skip population the populator code uses two-step
logic -- first it checks if the directory exists and either puts or not
the sstable_directory object into the map. Later it checks the map and
decide whether to throw or not if the directory is missing.
Let's keep both check and throw in one place for brevity.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The populator maintains a map of path -> sstable_directory pairs one for
each subdirectory for every sstable state. The "path" is in fact not
used by the logic as it's just a subdirectory name for the state and the
rest of the core operates on state. So it's good to make the map of
directories also be indexed by the state.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently, exceptions thrown from `sst->load` are unhandled,
resulting in, e.g.:
```
ERROR 2023-09-12 08:02:58,124 [shard 0:main] seastar - Exiting on unhandled exception: std::runtime_error (SSTable /home/bhalevy/.dtest/dtest-dxg4xdxg/test/node1/data/ks/cf-a3009f20512911ee8000d81cd2da3fd7/me-3g9b_0e0x_39vtt1y2rcqrffz55j-big-Data.db uses org.apache.cassandra.dht.Murmur3Partitioner partitioner which is different than com.scylladb.dht.CDCPartitioner partitioner used by the database)
```
Log the errors and exit the tool with non-zero status
in this case.
Fixes#15359
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#15376
Load balancer will recognize decommissioning nodes and will
move tablet replicas away from such nodes with highest priority.
Topology changes have now an extra step called "tablet draining" which
calls the load balancer. The step will execute tablet migration track
as long as there are nodes which require draining. It will not do regular
load balancing.
If load balancer is unable to find new tablet replicas, because RF
cannot be met or availability is at risk due to insufficient node
distribution in racks, it will throw an exception. Currently, topology
change will retry in a loop. We should make this error cause topology
change to be paused so that admin becomes aware of the problem and
issues an abort on the topology change. There is no infrastructure for
aborts yet, so this is not implemented.
This change adds a stub for tablet cleanup on the replica side and wires
it into the tablet migration process.
The handling on replica side is incomplete because it doesn't remove
the actual data yet. It only flushes the memtables, so that all data
is in sstables and none requires a memtable flush.
This patch is necessary to make decommission work. Otherwise, a
memtable flush would happen when the decommissioned node is put in the
drained state (as in nodetool drain) and it would fail on missing host
id mapping (node is no longer in topology), which is examined by the
tablet sharder when producing sstable sharding metadata. Leading to
abort due to failed memtable flush.
Example scenario:
1. coordinator A sends RPC #1 to trigger streaming
2. coordinator fails over to B
3. coordinator B performs streaming successfully
4. RPC #1 arrives and starts streaming
5. coordinator B commits the transition to the post-streaming stage
6. coordinator B executes global token metadata barrier
We end up with streaming running despite the fact that the current
coordinator moved on. Currently, this won't happen, because streaming
holds on to erm. But we want to change that (see #14995), so that it
does not block barriers for migrations of other tablets. The same
problem applies to tablet cleanup.
The fix is to use tablet_metadata_guard around such long running
operations, which will keep hold to erm so that in the above scenario
coordinator B will wait for it in step 6. The guard ensures that erm
doesn't block other migrations because it switches to the latest erm
if it's compatible. If it's not, it signals abort_source for the guard
so that such stale operation aborts soon and the barrier in step 6
doesn't wait for long.
Will be used to synchronize long-running tablet operations with
topology coordinator.
It blocks barriers like erm_ptr, but refreshes if change is
irrelevant, so behaves as if the erm_ptr's scope was narrowed down to
a single tablet.
The decode_cdc_generations_ids function allows us to decode
a vector of CDC generation IDs. After adding cleanup_candidate
to CDC_GENERATIONS_V3, we need a similar function that decodes
a single ID.
In the following commits, we implement a garbage collection for
CDC_GENERATIONS_V3. The first step is introducing the clean-up
candidate. It will be continually updated by the CDC generation
publisher and used to remove obsolete data.