storage_service: topology coordinator: do not retry the metadata barrier forever in write_both_read_new state

Handle the barrier failure by sleeping for a "ring delay" and
continuing. The purpose of the barrier is to wait for all reads from
the old replica set to complete and to fence the remaining requests. If
the barrier fails, we give the fence some time to propagate and continue
with the topology change. If the fence did not propagate we may have
stale reads, but this is no worse than what we have with gossiper.
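
The control flow described above can be sketched outside of Seastar as follows. This is only an illustration under stated assumptions: `run_barrier_with_fallback`, its `barrier` callback and the `grace_period` parameter are hypothetical names standing in for the coordinator step, `global_token_metadata_barrier` and the ring delay, and a blocking `std::this_thread::sleep_for` replaces the abortable coroutine sleep. The failure is recorded in a flag and handled after the `try` block, mirroring the patch; in the real coroutine this shape is also needed because C++ does not allow `co_await` inside a `catch` handler.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <thread>

// Hypothetical stand-in for the coordinator step: run the barrier once and,
// if it throws, wait for a grace period instead of retrying forever.
// Returns true if the barrier succeeded, false if the fallback wait was taken.
bool run_barrier_with_fallback(const std::function<void()>& barrier,
                               std::chrono::milliseconds grace_period) {
    bool barrier_failed = false;
    try {
        barrier();
    } catch (...) {
        // Only record the failure here; the wait happens after the handler.
        std::cerr << "barrier failed, falling back to grace period\n";
        barrier_failed = true;
    }
    if (barrier_failed) {
        // Give the fence time to propagate (or old reads time to drain).
        std::this_thread::sleep_for(grace_period);
    }
    // Either way the caller continues with the topology change.
    return !barrier_failed;
}
```

In both outcomes the caller proceeds to the next step; the return value only tells it whether the stronger guarantee of a completed barrier holds.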
Gleb Natapov
2023-11-06 13:47:23 +02:00
parent 7ea8fa459c
commit 7267376eac


@@ -2063,7 +2063,7 @@ class topology_coordinator {
break;
case topology::transition_state::write_both_read_new: {
auto node = get_node_to_work_on(std::move(guard));
bool barrier_failed = false;
// In this state writes goes to old and new replicas but reads start to be done from new replicas
// Before we stop writing to old replicas we need to wait for all previous reads to complete
try {
@@ -2076,7 +2076,15 @@ class topology_coordinator {
     slogger.error("raft topology: transition_state::write_both_read_new, "
                   "global_token_metadata_barrier failed, error {}",
                   std::current_exception());
-    break;
+    barrier_failed = true;
 }
+if (barrier_failed) {
+    // If barrier above failed it means there may be unfenced reads from old replicas.
+    // Lets wait for the ring delay for those writes to complete or fence to propagate
+    // before continuing.
+    // FIXME: nodes that cannot be reached need to be isolated either automatically or
+    // by an administrator
+    co_await sleep_abortable(_ring_delay, _as);
+}
 switch(node.rs->state) {
 case node_state::bootstrapping: {
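
The wait in the patch is abortable, so a shutdown does not have to sit through the whole ring delay. The general idea can be sketched with standard C++ primitives. The class below is a hypothetical, simplified stand-in and does not model Seastar's `abort_source` or `sleep_abortable`; in particular, it reports an abort through a boolean return value, which is a simplification chosen for this sketch.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical, simplified abort flag with an interruptible sleep.
class abort_flag {
    std::mutex _m;
    std::condition_variable _cv;
    bool _aborted = false;
public:
    // Marks the flag as aborted and wakes every sleeper.
    void request_abort() {
        {
            std::lock_guard<std::mutex> lk(_m);
            _aborted = true;
        }
        _cv.notify_all();
    }

    // Sleeps for up to `d`. Returns true if the full duration elapsed,
    // false if the sleep was cut short (or skipped) because of an abort.
    bool sleep_for(std::chrono::milliseconds d) {
        std::unique_lock<std::mutex> lk(_m);
        // wait_for returns the predicate's value: true means aborted.
        return !_cv.wait_for(lk, d, [this] { return _aborted; });
    }
};
```

A caller would sleep on the flag for the grace period and stop the state machine if the call returns false, instead of continuing the topology change.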