Finishing the deprecation of the skip_mode function in favor of
pytest.mark.skip_mode. This PR only cleans up and migrates leftover tests
that still use the old way of skip_mode.
Closes scylladb/scylladb#28299
Function skip_mode works only on functions and only in cluster tests. This is OK
when we need to skip one test, but it's not possible to use it with pytestmark
to automatically mark all tests in the file. The goal of this PR is to migrate
skip_mode to be dynamic pytest.mark that can be used as ordinary mark.
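A minimal sketch of the difference, assuming skip_mode is registered as a dynamic pytest mark (names and arguments are illustrative, not the exact project API):

```python
import pytest

# With skip_mode as an ordinary dynamic mark, it is plain data that pytest
# tooling can work with. Unlike the old function-based helper, it can be
# applied to every test in a file at once:
#
#   pytestmark = pytest.mark.skip_mode("release", "error injection is compiled out")
#
# A conftest hook can then inspect the mark and skip the test when the
# current build mode matches.
mark = pytest.mark.skip_mode("release", "error injection is compiled out")
assert mark.name == "skip_mode"
assert mark.args == ("release", "error injection is compiled out")
```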
Closes scylladb/scylladb#27853
[avi: apply to test/cluster/test_tablets.py::test_table_creation_wakes_up_balancer]
After 39cec4a, a node join may fail either with an "init - Startup failed"
notification or, occasionally, because the node was banned, depending on timing.
The change updates the test to handle both cases.
Fixes: scylladb/scylladb#27697
No backport: This failure is only present in master.
Closes scylladb/scylladb#27768
We've enabled the configuration option `rf_rack_valid_keyspaces`,
so we can finally re-enable the events creating and dropping secondary
indexes.
Fixes scylladb/scylladb#26422
We adjust the test to work with the configuration option
`rf_rack_valid_keyspaces` enabled. For that, we ensure that there is
always at least one node in each of the three racks. This way, all
keyspaces we create and manipulate will remain RF-rack-valid since they
all use RF=3.
------------------------------------------------------------------------
To achieve that, we only need to adjust the following events:
1. `init_tablet_transfer`
The event creates a new keyspace and table and manually migrates
a tablet belonging to it. As long as we make sure the migration occurs
within the same rack, there will be no problem.
Since RF == #racks, each rack will have exactly one tablet replica,
so we can migrate the tablet to an arbitrary node in the same rack.
Note that there must exist a node that's not a replica. If there were no
such node, the test wouldn't have worked even before this commit, because
it's not possible to migrate a tablet from one of its replica nodes to
another. In other words, we have a guarantee that there are at least 4 nodes
in the cluster when we try to migrate a tablet replica.
That said, we check it anyway. If there's no viable node to migrate the
tablet replica to, we log that information and do nothing. That should be
an acceptable solution.
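The intra-rack target selection described above can be sketched as follows (the helper name and input shape are hypothetical, not the actual test code):

```python
import logging

def pick_intra_rack_target(rack_nodes, replica_node):
    """Sketch: pick a migration target for a tablet replica living on
    `replica_node`, restricted to the same rack so the keyspace stays
    RF-rack-valid. A node cannot receive a replica of a tablet it already
    holds, so the replica itself is excluded."""
    candidates = [n for n in rack_nodes if n != replica_node]
    if not candidates:
        # No viable node: log and do nothing, as the commit describes.
        logging.info("no viable migration target in rack, skipping")
        return None
    return candidates[0]

# With RF == #racks, each rack holds exactly one replica, so any other
# node in the same rack is a valid target.
assert pick_intra_rack_target(["n1", "n2"], "n1") == "n2"
assert pick_intra_rack_target(["n1"], "n1") is None
```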
2. `add_new_node`
As long as we add a node to an existing rack, there's no way to
violate the invariant imposed by the configuration option, so we pick
a random rack out of the existing three and create a node in it.
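As a sketch (function name assumed), the rack choice for a new node is simply a random pick among the racks that already exist:

```python
import random

def pick_rack_for_new_node(racks):
    """Sketch: adding a node to an existing rack can never violate the
    rf_rack_valid_keyspaces invariant, so any current rack will do."""
    return random.choice(sorted(racks))

racks = {"rack1", "rack2", "rack3"}
assert pick_rack_for_new_node(racks) in racks
```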
3. `decommission_node`
We need to ensure that the node we'll be trying to decommission is
not the only one in its rack.
Following pretty much the same reasoning as in `init_tablet_transfer`,
we conclude there must be a rack with at least two nodes in it. Otherwise
we'd end up having to migrate a tablet from one replica node to another,
which is not possible.
What's more, decommissioning a node is not possible if any node in
the cluster is dead, so we can assume that `manager.running_servers`
returns the whole cluster.
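A sketch of that constraint, assuming each server is a (name, rack) pair (a hypothetical shape, not the real manager API); the same selection applies to `remove_node` below:

```python
from collections import Counter

def pick_decommission_candidate(servers):
    """Sketch: a node may be decommissioned only if it is not the last
    one in its rack; otherwise its tablets would have nowhere to go
    without breaking RF-rack-validity."""
    per_rack = Counter(rack for _, rack in servers)
    for name, rack in servers:
        if per_rack[rack] >= 2:
            return name
    return None

servers = [("n1", "r1"), ("n2", "r2"), ("n3", "r3"), ("n4", "r1")]
# r1 has two nodes, so its first node is a safe candidate.
assert pick_decommission_candidate(servers) == "n1"
```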
4. `remove_node`
The same as `decommission_node`. Note that although the node we choose to
remove must first be stopped, no other node can be dead, so the whole
cluster must be returned by `manager.running_servers`.
------------------------------------------------------------------------
There's one more important thing to note. The test may sometimes trigger
a sequence of events where a new node is started, but, due to an error
injection, its initialization is not completed. Among other things, the
node may NOT have a host ID recognized by the rest of the nodes in the
cluster, and operations like tablet migration will fail if they target
it.
Thankfully, there seems to be a way to avoid problems stemming from
that. When a new node is added to the cluster, it should appear at the
end of the list returned by `manager.running_servers`. This most likely
stems from how dictionaries work in Python:
"Keys and values are iterated over in insertion order."
-- https://docs.python.org/3/library/stdtypes.html#dict-views
and the fact that we keep track of running servers using a dictionary.
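The ordering guarantee this relies on can be seen directly (server names are illustrative):

```python
# Python dicts preserve insertion order (guaranteed since Python 3.7),
# so if the harness tracks servers in a dict, the newest node is last.
servers = {}
servers["host-a"] = "up"
servers["host-b"] = "up"
servers["host-c"] = "joining"   # added last, possibly not fully initialized

running = list(servers)
assert running[-1] == "host-c"
```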
Furthermore, we rely on the assumption that the test currently works
correctly.
Assume, for the sake of contradiction, that among the nodes taking part in
the operations listed above, there is at most one node per rack whose host ID
is recognized by the rest of the cluster. Note that only those nodes can store
any tablets. Let's refer to the set of those nodes as X.
Now consider tablet migration, decommissioning, or removing a node. Since
those operations involve tablet migration, at least one tablet will need to be
migrated from the node in question to another node in X. However, since X
consists of at most three nodes, and one of them is losing its tablet, there
is no viable target for the tablet. The operation would therefore fail,
contradicting the assumption that the test currently works.
Based on those observations, an auxiliary function, `select_viable_rack`,
was designed to carefully choose a correct rack, from which we then pick nodes
for the topology operations. It's simple: we just find the first rack in the
list that has at least two nodes in it. That should ensure that we perform
an operation that doesn't lead to any unforeseen disaster.
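The rule can be sketched like this (the input shape is assumed; the real helper lives in the test code):

```python
def select_viable_rack(servers_by_rack):
    """Sketch of the select_viable_rack idea: return the first rack with
    at least two nodes, so a topology operation on that rack always has
    a viable migration target for displaced tablet replicas."""
    for rack, nodes in servers_by_rack.items():
        if len(nodes) >= 2:
            return rack
    return None

racks = {"r1": ["n1"], "r2": ["n2", "n4"], "r3": ["n3"]}
assert select_viable_rack(racks) == "r2"
```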
------------------------------------------------------------------------
Since the test effectively becomes more complex due to the extra care needed
to keep the cluster topology valid, we extend the log messages to make them
more helpful when debugging a failure.
We extend the requirements for being able to create materialized views
and secondary indexes in tablet-based keyspaces. It's now necessary to
enable the configuration option `rf_rack_valid_keyspaces`. This is
a stepping stone towards bringing materialized views and secondary
indexes with tablets out of the experimental phase.
We add a validation test to verify the changes.
Refs scylladb/scylladb#23030
Materialized views are going to require the configuration option
`rf_rack_valid_keyspaces` when being created in tablet-based keyspaces.
Since random-failure tests still haven't been adjusted to work with it,
and because it's not trivial, we skip the cases when we end up creating
or dropping an index.
Initial implementation of support for CDC in tablets-enabled keyspaces.
The design is described in https://docs.google.com/document/d/1qO5f2q5QoN5z1-rYOQFu6tqVLD3Ha6pphXKEqbtSNiU/edit?usp=sharing
It is followed closely for the most part, except for "Deciding when to change streams": instead, streams are changed synchronously with tablet split / merge.
Instead of the stream-switching algorithm with double writes, we use a scheme similar to the previous method for vnodes: we add the new streams with a timestamp that is sufficiently far in the future.
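A simplified sketch of that scheme (data structures and values are illustrative; the real metadata lives in group0-based system tables): each stream set carries an activation timestamp, and a write uses the newest set whose timestamp is not after the write's own timestamp.

```python
import bisect

class StreamSets:
    """Sketch: stream sets ordered by activation timestamp."""

    def __init__(self):
        self._timestamps = []   # sorted activation timestamps
        self._sets = []         # stream set per timestamp

    def add(self, activation_ts, streams):
        i = bisect.bisect(self._timestamps, activation_ts)
        self._timestamps.insert(i, activation_ts)
        self._sets.insert(i, streams)

    def streams_for_write(self, write_ts):
        # Newest set whose activation timestamp <= write timestamp.
        i = bisect.bisect_right(self._timestamps, write_ts) - 1
        return self._sets[i] if i >= 0 else None

sets = StreamSets()
sets.add(100, ["s1", "s2"])
# On tablet split, new streams are added with a timestamp far in the
# future, so in-flight writes keep using the old streams:
sets.add(5000, ["s1a", "s1b", "s2a", "s2b"])
assert sets.streams_for_write(200) == ["s1", "s2"]
assert sets.streams_for_write(6000) == ["s1a", "s1b", "s2a", "s2b"]
```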
In this PR we:
* add new group0-based internal system tables for tablet stream metadata, and load it into the in-memory CDC metadata
* add virtual tables for CDC consumers
* make the write coordinator choose a stream by looking up the appropriate stream in the CDC metadata
* enable creating tables with CDC enabled in tablets-enabled keyspaces; tablets are allocated for the CDC table, and a stream is created per tablet
* on tablet resize (split / merge), have the topology coordinator create a new stream set with a new stream for each new tablet
* co-locate the CDC tablets with the base tablets
Fixes https://github.com/scylladb/scylladb/issues/22576
backport not needed - new feature
update dtests: https://github.com/scylladb/scylla-dtest/pull/5897
update java cdc library: https://github.com/scylladb/scylla-cdc-java/pull/102
update rust cdc library: https://github.com/scylladb/scylla-cdc-rust/pull/136
Closes scylladb/scylladb#23795
* github.com:scylladb/scylladb:
docs/dev: update CDC dev docs for tablets
doc: update CDC docs for tablets
test: cluster_events: enable add_cdc and drop_cdc
test/cql: enable cql cdc tests to run with tablets
test: test_cdc_with_alter: adjust for cdc with tablets
test/cqlpy: adjust cdc tests for tablets
test/cluster/test_cdc_with_tablets: introduce cdc with tablets tests
cdc: enable cdc with tablets
topology coordinator: change streams on tablet split/merge
cdc: virtual tables for cdc with tablets
cdc: generate_stream_diff helper function
cdc: choose stream in tablets enabled keyspaces
cdc: rename get_stream to get_vnode_stream
cdc: load tablet streams metadata from tables
cdc: helper functions for reading metadata from tables
cdc: colocate cdc table with base
cdc: remove streams when dropping CDC table
cdc: create streams when allocating tablets
migration_listener: add on_before_allocate_tablet_map notification
cdc: notify when creating or dropping cdc table
cdc: move cdc table creation to pre_create
cdc: add internal tables for cdc with tablets
cdc: add cdc_with_tablets feature flag
cdc: add is_log_schema helper
The Raft topology workflow was changed by the limited voters feature:
nodes no longer request votership themselves. As a result, the
"stop_before_becoming_raft_voter" error injection is now obsolete and
has been removed.
Fixes: scylladb/scylladb#23418
The order of entries in the ERROR_INJECTIONS list determines test
repeatability for a given random seed.
To allow removing error injections without affecting the order of the
remaining ones, removed injections are now renamed with a "REMOVED_"
prefix instead of being deleted.
This ensures they are ignored by the tests, while the sequence of active
injections—and thus test reproducibility—remains unchanged.
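A sketch of the idea (the list entries come from the injections named in these commits; the helper is illustrative): removed entries keep their slot, so the indices a seeded random picker sees do not shift.

```python
ERROR_INJECTIONS = [
    "stop_after_sending_join_node_request",
    # Renamed rather than deleted, so later entries keep their indices:
    "REMOVED_stop_before_becoming_raft_voter",
    "stop_after_bootstrapping_initial_raft_configuration",
]

def active_injections(injections):
    """Sketch: skip removed entries while keeping positions stable, so a
    given random seed keeps selecting the same injections."""
    return [(i, name) for i, name in enumerate(injections)
            if not name.startswith("REMOVED_")]

assert active_injections(ERROR_INJECTIONS) == [
    (0, "stop_after_sending_join_node_request"),
    (2, "stop_after_bootstrapping_initial_raft_configuration"),
]
```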
Rework the `ScyllaLogFile.wait_for()` method to make it easier
to add the required methods to the ScyllaNode class of the ccm-like shim.
Also, add a `ScyllaLogFile.grep_for_errors()` method and
rework `ScyllaLogFile.grep()`.
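A minimal sketch of such grep-style helpers (the method names come from the commit; the signatures and in-memory line storage are assumptions for illustration):

```python
import re

class ScyllaLogFile:
    """Sketch: scan log lines for a regex, or for ERROR-level lines."""

    def __init__(self, lines):
        self.lines = lines

    def grep(self, expr):
        pattern = re.compile(expr)
        return [line for line in self.lines if pattern.search(line)]

    def grep_for_errors(self):
        return self.grep(r"\bERROR\b")

log = ScyllaLogFile([
    "INFO  2025-01-01 node started",
    "ERROR 2025-01-01 init - Startup failed",
])
assert log.grep_for_errors() == ["ERROR 2025-01-01 init - Startup failed"]
```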
Some of the tests in the test suite have proven to be more problematic
in adjusting to RF-rack-validity. Since we'd like to run as many tests
as possible with the `rf_rack_valid_keyspaces` configuration option
enabled, let's disable it in those. In the following commit, we'll enable
it by default.
The workflow of becoming a voter changes with the "limited voters"
feature: the node no longer becomes a voter on its own; instead,
votership is managed by the topology coordinator. This therefore
breaks the `stop_before_becoming_raft_voter` test, as that injection
relies on the old behavior.
We will disable the test for this particular case for now and address
either fixing or completely removing the test in a follow-up task.
Refs: scylladb/scylladb#23418
After recent changes, #18640 and #19151 started to reproduce for the
stop_after_sending_join_node_request and
stop_after_bootstrapping_initial_raft_configuration error injections too.
The solution is the same: deselect the tests.
Fixes #23302
Closes scylladb/scylladb#23405