raft topology: join_node_request_handler: wait until first node becomes normal

We need to wait until the first node becomes normal in
`join_node_request_handler` to ensure that joining nodes are not
handled as the first node in the cluster.

If we placed a join request before the first node becomes normal,
the topology coordinator would incorrectly skip the join node
handshake in `handle_node_transition` (`case node_state::none`).
It would happen because the topology coordinator decides whether
a node is the first in the cluster by checking if there are no
normal nodes. Therefore, we must ensure that there is at least one
normal node when the topology coordinator handles a join request
for a non-first node.

We change the previous check, `!topology::is_empty()`, because it can
return true even if there are no normal nodes: `topology::is_empty`
returns false while the first node is still new or in transition.
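
The difference between the two wait conditions can be illustrated with
a minimal sketch. It is not the ScyllaDB code: `topology_sketch`,
`old_wait_done`, `new_wait_done`, `treated_as_first_node` and
`demo_wait_conditions` are hypothetical names, and the body of
`is_empty` is an assumption based on the description above (false as
soon as a node is new, in transition, or normal).

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for the topology state.
// Assumption: three disjoint sets of node ids, as described in the message.
struct topology_sketch {
    std::set<std::string> normal_nodes;
    std::set<std::string> new_nodes;
    std::set<std::string> transition_nodes;

    // Assumed semantics: empty only if no node is known in any state.
    bool is_empty() const {
        return normal_nodes.empty() && new_nodes.empty() && transition_nodes.empty();
    }
};

// Previous wait condition: done as soon as any node entry exists.
inline bool old_wait_done(const topology_sketch& t) {
    return !t.is_empty();
}

// New wait condition: done only once at least one node is normal.
inline bool new_wait_done(const topology_sketch& t) {
    return !t.normal_nodes.empty();
}

// Coordinator-side decision as described in the message:
// a joining node is treated as the first one if there are no normal nodes.
inline bool treated_as_first_node(const topology_sketch& t) {
    return t.normal_nodes.empty();
}

// Walks the first node through new -> transition -> normal.
inline void demo_wait_conditions() {
    topology_sketch t;
    assert(!old_wait_done(t) && !new_wait_done(t));

    t.new_nodes.insert("first");
    // The old condition already lets a join request through, yet the
    // coordinator would still treat the joining node as the first one.
    assert(old_wait_done(t) && !new_wait_done(t) && treated_as_first_node(t));

    t.new_nodes.erase("first");
    t.transition_nodes.insert("first");
    assert(old_wait_done(t) && !new_wait_done(t) && treated_as_first_node(t));

    t.transition_nodes.erase("first");
    t.normal_nodes.insert("first");
    // Only now do both conditions agree and the handshake is not skipped.
    assert(old_wait_done(t) && new_wait_done(t) && !treated_as_first_node(t));
}
```

In this sketch the two conditions disagree exactly while the first
node is new or in transition, which is the window the fix closes.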

Additionally, calling `join_node_request_handler` before the first
node sets itself as normal is frequent during concurrent bootstrap,
so we remove "unlikely" from the comment.

Fixes: scylladb/scylladb#15807

Closes scylladb/scylladb#15775
This commit is contained in:
Patryk Jędrzejczak
2023-10-19 16:12:30 +02:00
committed by Avi Kivity
parent 227136ddf5
commit a6236072ee

@@ -6345,11 +6345,13 @@ future<join_node_request_result> storage_service::join_node_request_handler(join
     co_await _topology_state_machine.event.when([this] {
         // The first node defines the cluster and inserts its entry to the
-        // `system.topology` without checking anything. It is unlikely but
-        // possible that the `join_node_request_handler` fires before the first
-        // node inserts its entry, therefore we might need to wait
-        // until that happens, here.
-        return !_topology_state_machine._topology.is_empty();
+        // `system.topology` without checking anything. It is possible that the
+        // `join_node_request_handler` fires before the first node sets itself
+        // as a normal node, therefore we might need to wait until that happens,
+        // here. If we didn't do it, the topology coordinator could handle the
+        // joining node as the first one and skip the necessary join node
+        // handshake.
+        return !_topology_state_machine._topology.normal_nodes.empty();
     });
     auto& g0_server = _group0->group0_server();