Commit Graph

6 Commits

Author SHA1 Message Date
Petr Gusev
bc98774f83 test_replace_reuse_ip: add data plane load
In this commit we enhance test_replace_reuse_ip
to reproduce #17421. We create a test table and run
insert queries on it while the first node is
being replaced. In this form the test fails
without the fix from the previous commit. Some
insert requests fail with [Unavailable exception]
"Cannot achieve consistency level for cl QUORUM...".
2024-04-24 16:59:24 +03:00
Petr Gusev
aeed5c5fe3 test_replace_reuse_ip: check other servers see the IP
The replaced node transitions to LEFT state, and
we used to remove the IPs of such nodes from gossiper.
If we replace with same IP, this caused the IP of the
new node to be removed from gossiper.

This problem was fixed by #16820, this commit
adds a regression test for it.

closes #15967
2024-02-06 13:28:04 +04:00
Petr Gusev
1e00889842 test_replace_different_ip: check old IP is removed from gossiper
In this commit we modify the existing
test_replace_different_ip. We add the check that the old
IP is not contained in alive or down lists, which
means it's completely wiped from gossiper. This test is failing
without the force_remove_endpoint fix from
a previous commit. We also check that the state of
local system.peers table is correct.
2024-01-19 20:36:52 +04:00
Patryk Jędrzejczak
a8513bd41b test: replace multiple server_add calls with servers_add
ManagerClient.servers_add can be used in every test that uses
consistent topology changes. We replace all multiple server_add
calls in such tests with a single servers_add call to make these
tests faster and simplify their code. Additionally, these
servers_add calls will test concurrent bootstraps for free.
2024-01-02 12:19:33 +01:00
Patryk Jędrzejczak
206a446a02 test: decrease failure_detector_timeout_in_ms in replace tests
In one of the previous commits, we have made
ManagerClient.server_add wait until all running nodes see the node
being replaced as dead. Unfortunately, the waiting time can be
around 20 s if we stop the node being replaced ungracefully. 20 s
is the default value of the failure detector timeout.

We don't want to slow down the replace operations this much for no
good reason. We could use server_stop_gracefully instead of
server_stop everywhere, but we should have at least a few replace
tests with server_stop. For now, test_replace and
test_raft_ignore_nodes will be these tests. To keep them reasonably
fast, we decrease the failure_detector_timeout_in_ms value on all
initial servers.

We also skip test_replace in debug mode to avoid flakiness due to
low failure_detector_timeout_in_ms (test_raft_ignore_nodes is
already skipped).
2023-11-21 12:39:16 +01:00
Patryk Jędrzejczak
7062ff145e test: move test_replace to topology_custom
In the following commit, we make all servers in test_replace use
failure-detector-timeout-in-ms = 2000. Therefore, we need
test_replace to be in a suite with initial_size equal to 0.
2023-11-21 12:39:16 +01:00