PR https://github.com/scylladb/scylladb/pull/18186 introduced a fiber that reloads reclaimed bloom filters when memory becomes available. Use maintenance scheduling group to run that fiber instead of running it in the main scheduling group.
Fixes#18675Closesscylladb/scylladb#18721
* github.com:scylladb/scylladb:
sstables_manager: use maintenance scheduling group to run components reload fiber
sstables_manager: add member to store maintenance scheduling group
This PR resolves issue with double count of the test result for topology tests. It will not appear in the consolidated report anymore.
Another fix is to provide a better view which test failed by modifying the test case name in the report enriching it with mode and run id, so making them unique across the run.
The scope of this change is:
1. Modify the test name to have run id in name
2. Add handlers to get logs of test.py and pytest in one file that are related to test, rather than to the full suite
3. Remove topology tests from aggregating them on a suite level in Junit results
4. Add a link to the logs related to the failed tests in Junit results, so it will be easier to navigate to all logs related to test
5. Gather logs related to the failed test to one directory for better logs investigation
Ref: scylladb/scylladb#17851Closesscylladb/scylladb#18277
in this change, we trade the `boost_test_print_type()` overloads
for the generic template of `boost_test_print_type()`, except for
those in the very small tests, which presumably want to keep
themselves relative self-contained.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18727
Store that maintenance scheduling group inside the sstables_manager. The
next patch will use this to run the components reloader fiber.
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
The DIGEST_FOR_NULL_VALUES feature was added in 21a77612b3 (2020; 4.4)
and can now be assumed to be always present. The hasher which it invoked
is removed.
The PER_TABLE_PARTITIONERS feature was added in 90df9a44ce (2020; 4.0)
and can now be assumed to be always present. We also remove the associated
schema_feature.
The first digest tested was generated without the PER_TABLE_PARTITIONERS
schema feature. We're about to make that feature mandatory, so we won't
be able (and won't need) to generate a digest without it.
Update the digest to include the feature. Note it wasn't untested before,
we have a test with schema_features::full().
The VIEW_VIRTUAL_COLUMNS feature was added in a108df09f9 (2019; 3.1)
and can now be assumed to be always present.
The corresponding schema_feature is removed. Note schema_features are not sent
over the wire. A digest calculation without VIEW_VIRTUAL_COLUMNS is no longer tested.
These tests were marked as xfail because they use to fail with tablets.
They don't anymore, so remove the xfail.
Fixes: #16486Closesscylladb/scylladb#18671
Even when configured to not do any validation at all, the validator still did some. This small series fixes this, and adds a test to check that validation levels in general are respected, and the validator doesn't validate more than it is asked to.
Fixes: #18662Closesscylladb/scylladb#18667
* github.com:scylladb/scylladb:
test/boost/mutation_fragment_test.cc: add test for validator validation levels
mutation: mutation_fragment_stream_validating_filter: fix validation_level::none
mutation: mutation_fragment_stream_validating_filter: add raises_error ctor parameter
This commit brings several new features in scylla_cluster.py to fix runaway asyncio task problems in topology tests
- Start-Stop Lock and Stop Event in ScyllaServer
- Tasks History, Wait for tasks from Tasks History and Manager broken state in ScyllaClusterManager
- make ManagerClient object function scope
- test_finished_event in ManagerClient
Fixes: scylladb/scylladb#16472Fixes: scylladb/scylladb#16651Closesscylladb/scylladb#18236
* github.com:scylladb/scylladb:
test/pylib: Introduce ManagerClient.test_finished_event
test/topology: make ManagerClient object function scope
test/pylib: Introduce Manager broken state:
test/pylib: Wait for tasks from Tasks History:
test/pylib: Introduce Tasks History:
test/pylib: Introduce Stop Event
test/pylib: Introduce Start-Stop Lock:
before this change, we use `update_item_suffix` as a format string
fed to `format(...)`, which is resolved to `seastar::format()`.
but with a patch which migrates the `seastar::format()` to the backend
with compile-time format check, the caller sites using `format()` would
fail to build, because `update_item_suffix` is not a `constexpr`:
```
/home/kefu/.local/bin/clang++ -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -isystem /home/kefu/dev/scylladb/build/rust -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o -MF test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o.d -o test/perf/CMakeFiles/test-perf.dir/RelWithDebInfo/perf_alternator.cc.o -c /home/kefu/dev/scylladb/test/perf/perf_alternator.cc
/home/kefu/dev/scylladb/test/perf/perf_alternator.cc:249:69: error: call to consteval function 'fmt::basic_format_string<char, const char (&)[1]>::basic_format_string<const char *, 0>' is not a constant expression
249 | return make_request(cli, "UpdateItem", prefix + seastar::format(update_item_suffix, ""));
| ^
/usr/include/fmt/core.h:2776:67: note: read of non-constexpr variable 'update_item_suffix' is not allowed in a constant expression
2776 | FMT_CONSTEVAL FMT_INLINE basic_format_string(const S& s) : str_(s) {
| ^
/home/kefu/dev/scylladb/test/perf/perf_alternator.cc:249:69: note: in call to 'basic_format_string<const char *, 0>(update_item_suffix)'
249 | return make_request(cli, "UpdateItem", prefix + seastar::format(update_item_suffix, ""));
| ^~~~~~~~~~~~~~~~~~
/home/kefu/dev/scylladb/test/perf/perf_alternator.cc:198:6: note: declared here
198 | auto update_item_suffix = R"(
| ^
```
so, to prepare the change switching to compile-time format checking,
let's mark this variable `static constexpr`. this is also more correct,
as this variable is
* a compile time constant, and
* is not shared across different compilation units.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18685
The function intersection(r1,r2) in statement_restrictions.cc is used
when several WHERE restrictions were applied to the same column.
For example, for "WHERE b<1 AND b<2" the intersection of the two ranges
is calculated to be b<1.
As noted in issue #18690, Scylla is inconsistent in where it allows or
doesn't allow these intersecting restrictions. But where they are
allowed they must be implemented correctly. And it turns out the
function intersection() had a bug that caused it to sometimes enter
an infinite loop - when the intent was only to call itself once with
swapped parameters.
This patch includes a test reproducing this bug, and a fix for the
bug. The test hangs before the fix, and passes after the fix.
While at it, I carefully reviewed the entire code used to implement
the intersection() function to try to make sure that the bug we found
was the only one. I also added a few more comments where I thought they
were needed to understand complicated logic of the code.
The bug, the fix and the test were originally discovered by
Michał Chojnowski.
Fixes#18688
Refs #18690
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#18694
This series is a reupload of #13792 with a few modifications, namely a test is added and the conflicts with recent tablet related changes are fixed.
See https://github.com/scylladb/scylladb/issues/12379 and https://github.com/scylladb/scylladb/pull/13583 for a detailed description of the problem and discussions.
This PR aims to extend the existing throttling mechanism to work with requests that internally generate a large amount of view updates, as suggested by @nyh.
The existing mechanism works in the following way:
* Client sends a request, we generate the view updates corresponding to the request and spawn background tasks which will send these updates to remote nodes
* Each background task consumes some units from the `view_update_concurrency_semaphore`, but doesn't wait for these units, it's just for tracking
* We keep track of the percent of consumed units on each node, this is called `view update backlog`.
* Before sending a response to the client we sleep for a short amount of time. The amount of time to sleep for is based on the fullness of this `view update backlog`. For a well behaved client with limited concurrency this will limit the amount of incoming requests to a manageable level.
This mechanism doesn't handle large DELETE queries. Deleting a partition is fast for the base table, but it requires us to generate a view update for every single deleted row. The number of deleted rows per single client request can be in the millions. Delaying response to the request doesn't help when a single request can generate millions of updates.
To deal with this we could treat the view update generator just like any other client and force it to wait a bit of time before sending the next batch of updates. The amount of time to wait for is calculated just like in the existing throttling code, it's based on the fullness of `view update backlogs`.
The new algorithm of view update generation looks something like this:
```c++
for(;;) {
auto updates = generate_updates_batch_with_max_100_rows();
co_await seastar::sleep(calculate_sleep_time_from_backlogs());
spawn_background_tasks_for_updates(updates);
}
```
Fixes: https://github.com/scylladb/scylladb/issues/12379Closesscylladb/scylladb#16819
* github.com:scylladb/scylladb:
test: add test for bad_allocs during large mv queries
mv: throttle view update generation for large queries
exceptions: add read_write_timeout_exception, a subclass of request_timeout_exception
db/view: extract view throttling delay calculation to a global function
view_update_generator: add get_storage_proxy()
storage_proxy: make view backlog getters public
In preparation for intra-node tablet migration, to avoid
using deprecated sharder APIs.
This function is used for generating sstable sharding metadata.
For tablets, it is not invoked, so we can safely work with the
static sharder. The call site already passes static_sharder only.
Before the patch, dht::sharder could be instantiated and it would
behave like a static sharder. This is not safe with regards to
extensions of the API because if a derived implementation forgets to
override some method, it would incorrectly default to the
implementation from static sharder. Better to fail the compilation in
this case, so extract static sharder logic to dht::static_sharder
class and make all methods in dht::sharder pure virtual.
This also allows us to have algorithms indicate that they only work
with static sharder by accepting the type, and have compile-time
safety for this requirement.
schema::get_sharder() is changed to return the static_sharder&.
Require users to specify whether we want shard for reads or for writes
by switching to appropriate non-deprecated variant.
For example, shard_of() can be replaced with shard_for_reads() or
shard_for_writes().
The next_shard/token_for_next_shard APIs have only for-reads variant,
and the act of switching will be a testimony to the fact that the code
is valid for intra-node migration.
Tablet sharder is adjusted to handle intra-migration where a tablet
can have two replicas on the same host. For reads, sharder uses the
read selector to resolve the conflict. For writes, the write selector
is used.
The old shard_of() API is kept to represent shard for reads, and new
method is introduced to query the shards for writing:
shard_for_writes(). All writers should be switched to that API, which
is not done in this patch yet.
The request handler on replica side acts as a second-level
coordinator, using sharder to determine routing to shards. A given
sharder has a scope of a single topology version, a single
effective_replication_map_ptr, which should be kept alive during
writes.
balance_tablets() is invoked in a loop, so only the first call will
see non-empty skiplist.
This bug starts to manifest after adding intra-node migration plan,
causing failures of the test_load_balancing_with_skiplist test
case. The reason is that rebalancing will now require multiple passes
before convergence is reached, due to intra-node migrations, and later
calls will not see the skiplist and try to balance skipped nodes,
vioating test's assertions.
Currently, invoking `nodetool ring` on a tablet keyspace fails with an error, because it doesn't pass the required table parameter to `/storage_service/ownership/{keyspace}`. Further to this, the command will currently always output the vnode ring, regardless of the keyspace and table parameter. This series fixes this, adding tablet support to `/storage_service/tokens_endpoint`, which will now return the tablet ring (tablet token -> tablet primary replica mapping) if the new keyspace and table parameters are provided.
`nodetool status` also gets a touch-up, to provide the tablet ring's token count (the tablet count) when invoked with a tablet keyspace and table.
Fixes: #17889Fixes: #18474
- [x] ** native-nodetool is new functionality, no backport is needed **
Closesscylladb/scylladb#18608
* github.com:scylladb/scylladb:
test/nodetool: make test pass with cassandra nodetool
tools/scylla-nodetool: status: fix token count for tablets
tools/scylla-nodetool: add tablet support to ring command
api/storage_service: add tablet support for /storage_service/tokens_endpoint
service/storage_service: introduce get_tablet_to_endpoint_map()
locator/tablets: introduce the primary replica concept
introduce ManagerClient.test_finished_event
to block access to REST client object from the test if
ManagerClient.after_test method was called
(test teardown started)
Recent commit 12f160045b (Get rid of fb_utilities) replaced the usage of global fb_utilities and made all services use topology::my_address() in order to get local node broadcast address. Some places resulted in long dependency chains dereferences. to get to topology This PR fixes some of them.
Closesscylladb/scylladb#18672
* github.com:scylladb/scylladb:
service_level_controller_test: Use topology::is_me() helper
service_level_controller: Add dependency on shared_token_metadata
tracing: Get my_address() via proxy
storage_proxy: Get token metadata via local member, not database
Currently all tables are printed in statements like `DESC TABLES`, `DESC KEYSPACE ks` or `DESC SCHEMA`.
But when we create a table with cdc enabled, additional table with `_scylla_cdc_log` suffix is created.
Those tables shouldn't be recreated manually but created automatically when the base table is created.
This patch hides tables with `_scylla_cdc_log` suffix in all describe statements.
To preserve properties values of those tables, `ALTER TABLE` statement with all properties and their current values for log cdc table is added to description of the base table.
Fixes#18459Closesscylladb/scylladb#18467
* github.com:scylladb/scylladb:
test/cql-pytest/test_describe: add test for hiding cdc tables
cql3/statements/describe_statement: hide cdc tables
schema: add a method to generate ALTER statement with all properties
schema: extract schema's properties generation
move ManagerClient object creation/clear
to functions scope instead of session scope
to prevent test cases affect each other
by stopping sharing connections to cluster between tests
Waiting for all tasks does not guarantee
that test will not spawn new tasks while we wait
Manager broken state prevents all future put requests in case of
1) fail during task waiting
2) Test continue to create tasks in test_after stage
To ensure the atomicity of tests and recycle clusters without any issues, it is crucial
that all active requests in ScyllaClusterManager are completed before proceeding further.
Topology tests might spawn asynchronous tasks in parallel in ScyllaClusterManager.
Tasks history is introduced to be able log and analyze all actions
against cluster in case of failures
The methods stop, stop_gracefully, and start in ScyllaServer
are not designed for parallel execution.
To circumvent issues arising from concurrent calls,
a start_stop_lock has been introduced.
This lock ensures that these methods are executed sequentially.
Currently, if the fill ctor throws an exception,
the destructor won't be called, as it object is not fully constructed yet.
Call the default ctor first (which doesn't throw)
to make sure the destructor will be called on exception.
Fixesscylladb/scylladb#18635
- [x] Although the fixes is for a rare bug, it has very low risk and so it's worth backporting to all live versions
Closesscylladb/scylladb#18636
* github.com:scylladb/scylladb:
chunked_vector_test: add more exception safety tests
chunked_vector_test: exception_safe_class: count also moved objects
utils: chunked_vector: fill ctor: make exception safe
This patch adds a test for reproducing issue #12379, which is
being fixed in #16819.
The test case works by creating a table with a materialized
view, and then performing a partition delete query on it.
At the same time, it uses injections to limit the memory
to a level lower than usual, in order to increase the
consistency of the test, and to limit its runtime.
Before #16819, the test would exceed the limit and fail,
and now the next allocation is throttled using a sleep.
We have to account for moved objects as well
as copied objects so they will be balanced with
the respective `del_live_object` calls called
by the destructor.
However, since chunked_vector requires the
value_type to be nothrow_move_constructible,
just count the additional live object, but
do not modify _countdown or, respectively, throw
an exception, as this should be considered only
for the default and copy constructors.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>