Consider the following scenario:
1. A table has RF=3 and writes use CL=QUORUM
2. One node is down
3. There is a pending tablet migration from the unavailable node
that is reverted
During the revert, there can be a time window where the pending replica
being cleaned up still accepts writes. This leads to write failures,
as only two nodes (out of four) are able to acknowledge writes.
This patch fixes the issue by adding a barrier to the cleanup_target
tablet transition state, ensuring that the coordinator switches back to
the previous replica set before cleanup is triggered.
Fixes https://github.com/scylladb/scylladb/issues/26512
The patch prepares the test for additional write workload to be
executed in parallel with node failures. With the original RF=2,
QUORUM is also 2, which causes writes to fail during node outage.
To address it, the third rack with a single node is added and the
replication factor is increased to 3.
Choose old_replica and new_replica so that they're both in rack r1.
After later changes (rack list auto expansion), it's no longer
guaranteed that the first replica will be on r1.
The test started hitting #21779 recently. We deflake it in this commit
by disabling the tablet load balancing before dropping the keyspace at
the end of the test.
We still have to understand why the test started hitting #21779, so we
keep #25938 open.
Refs #25938
This commits introduces an config option 'tablet_load_stats_refresh_interval_in_seconds'
that allows overriding the default value without using error injection.
Fixesscylladb/scylladb#24641Closesscylladb/scylladb#24746
When a tablet transitions to a post-cleanup stage on the leaving replica
we deallocate its storage group. Before the storage can be deallocated
and destroyed, we must make sure it's cleaned up and stopped properly.
Normally this happens during the tablet cleanup stage, when
table::cleanup_table is called, so by the time we transition to the next
stage the storage group is already stopped.
However, it's possible that tablet cleanup did not run in some scenario:
1. The topology coordinator runs tablet cleanup on the leaving replica.
2. The leaving replica is restarted.
3. When the leaving replica starts, still in `cleanup` stage, it
allocates a storage group for the tablet.
4. The topology coordinator moves to the next stage.
5. The leaving replica deallocates the storage group, but it was not
stopped.
To address this scenario, we always stop the storage group when
deallocating it. Usually it will be already stopped and complete
immediately, and otherwise it will be stopped in the background.
Fixesscylladb/scylladb#24857Fixesscylladb/scylladb#24828Closesscylladb/scylladb#24896
Add a test that reproduces issue scylladb/scylladb#23481.
The test migrates a tablet from one node to another, and while the
tablet is in some stage of cleanup - either before or right after,
depending on the parameter - the leaving replica, on which the tablet is
cleaned, is restarted.
This is interesting because when the leaving replica starts and loads
its state, the tablet could be in different stages of cleanup - the
SSTables may still exist or they may have been cleaned up already, and
we want to make sure the state is loaded correctly.
We adjust all of the simple cases of cluster tests so they work
with `rf_rack_valid_keyspaces: true`. It boils down to assigning
nodes to multiple racks. For most of the changes, we do that by:
* Using `pytest.mark.prepare_3_racks_cluster` instead of
`pytest.mark.prepare_3_nodes_cluster`.
* Using an additional argument -- `auto_rack_dc` -- when calling
`ManagerClient::servers_add()`.
In some cases, we need to assign the racks manually, which may be
less obvious, but in every such situation, the tests didn't rely
on that assignment, so that doesn't affect them or what they verify.
The new option deprecates the existing `enable_tablets` option.
It will be extended in the next patch with a 3rd value: "enforced"
while will enable tablets by default for new keyspace but
without the posibility to opt out using the `tablets = {'enabled':
false}` keyspace schema option.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that we support suite subfolders, there is no
need to create an own suite for object_store and auth_cluster, topology, topology_custom.
this PR merge all these folders into one: 'cluster"
this pr also introduce and apply 'prepare_3_nodes_cluster' fixture that allow preparing non-dirty 3 nodes cluster
that can be reused between tests(for tests that was in topology folder)
number of tests in master
release -3461
dev -3472
debug -3446
number of tests in this PR
release -3460
dev -3471
debug -3445
There is a minus one test in each mode because It was 2 test_topology_failure_recovery files(topology and topology_custom) with the same utility functions but different test cases. This PR merged them into one
Closesscylladb/scylladb#22917
* github.com:scylladb/scylladb:
test.py: merge object_store into cluster folder
test.py: merge auth_cluster into cluster folter
test.py: rename topology_custom folder to cluster
test.py: merge topology test suite into topology_custom
test.py delete conftest in topology_custom
test.py apply prepare_3_nodes_cluster in topology
test.py: introduce prepare_3_nodes_cluster marker