test: fix topology_custom/test_raft_recovery_stuck flakiness

The test performs consecutive schema changes in RECOVERY mode, and the
second change relies on the first. However, the driver might route the
two changes to different servers, and we don't have group 0 to
guarantee linearizability. We must rely on the coordinator of the first
change to push the schema mutations to the other servers before
returning, but it only does that for servers it sees as alive at the
time of the schema change. The test did not guarantee this. Fix it by
waiting, after the rolling restart, until srv1 sees the other servers
as alive.
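
As a rough illustration of the kind of wait the fix relies on, here is a
minimal, self-contained sketch of polling until a server reports enough
live peers. FakeServer, alive_peers, wait_until and sees_others are
hypothetical names made up for this sketch; it is not the implementation
of manager.server_sees_others.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def wait_until(pred: Callable[[], Awaitable[Optional[T]]],
                     deadline: float, period: float = 0.01) -> T:
    """Poll `pred` until it returns a non-None value or `deadline` passes."""
    while True:
        res = await pred()
        if res is not None:
            return res
        if time.time() >= deadline:
            raise TimeoutError("condition not met before deadline")
        await asyncio.sleep(period)


class FakeServer:
    """Hypothetical stand-in for a node: it starts reporting two live
    peers only from the `polls_until_alive`-th query onwards."""

    def __init__(self, polls_until_alive: int) -> None:
        self.polls = 0
        self.polls_until_alive = polls_until_alive

    async def alive_peers(self) -> int:
        self.polls += 1
        return 2 if self.polls >= self.polls_until_alive else 0


async def sees_others(server: FakeServer, count: int,
                      timeout: float = 5.0) -> None:
    """Return once `server` reports at least `count` live peers."""
    async def check() -> Optional[bool]:
        return True if await server.alive_peers() >= count else None

    await wait_until(check, time.time() + timeout)


if __name__ == "__main__":
    srv = FakeServer(polls_until_alive=3)
    asyncio.run(sees_others(srv, 2))
    print(f"peers seen as alive after {srv.polls} polls")
```

Only after such a wait returns is it safe to assume that a schema change
coordinated by that server will be pushed to the peers it now sees as
alive.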

Fixes scylladb/scylladb#20791

Should be backported to all branches containing this test to reduce
flakiness.

Closes scylladb/scylladb#20792
Kamil Braun
2024-09-24 13:13:33 +02:00
committed by Botond Dénes
parent d16ea0afd6
commit 69b4769418

@@ -79,6 +79,10 @@ async def test_recover_stuck_raft_recovery(request, manager: ManagerClient):
     logging.info(f"Restarting {others}")
     await manager.rolling_restart(others)
+    # Prevent scylladb/scylladb#20791
+    logging.info(f"Wait until {srv1} sees {others} as alive")
+    await manager.server_sees_others(srv1.server_id, len(others))
     logging.info(f"{others} restarted, waiting until driver reconnects to them")
     hosts = await wait_for_cql_and_get_hosts(cql, others, time.time() + 60)