From a6236072eefe67e1f2810b5ec3d68fc09b393992 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Patryk=20J=C4=99drzejczak?= Date: Thu, 19 Oct 2023 16:12:30 +0200 Subject: [PATCH] raft topology: join_node_request_handler: wait until first node becomes normal We need to wait until the first node becomes normal in `join_node_request_handler` to ensure that joining nodes are not handled as the first node in the cluster. If we placed a join request before the first node becomes normal, the topology coordinator would incorrectly skip the join node handshake in `handle_node_transition` (`case node_state::none`). It would happen because the topology coordinator decides whether a node is the first in the cluster by checking if there are no normal nodes. Therefore, we must ensure at least one normal node when the topology coordinator handles a join request for a non-first node. We change the previous check because it can return true if there are no normal nodes. `topology::is_empty` would also return false if the first node was still new or in transition. Additionally, calling `join_node_request_handler` before the first node sets itself as normal is frequent during concurrent bootstrap, so we remove "unlikely" from the comment. Fixes: scylladb/scylladb#15807 Closes scylladb/scylladb#15775 --- service/storage_service.cc | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/service/storage_service.cc b/service/storage_service.cc index 293b34d0ad..c6d7d4a523 100644 --- a/service/storage_service.cc +++ b/service/storage_service.cc @@ -6345,11 +6345,13 @@ future storage_service::join_node_request_handler(join co_await _topology_state_machine.event.when([this] { // The first node defines the cluster and inserts its entry to the - // `system.topology` without checking anything. It is unlikely but - // possible that the `join_node_request_handler` fires before the first - // node inserts its entry, therefore we might need to wait - // until that happens, here. - return !_topology_state_machine._topology.is_empty(); + // `system.topology` without checking anything. It is possible that the + // `join_node_request_handler` fires before the first node sets itself + // as a normal node, therefore we might need to wait until that happens, + // here. If we didn't do it, the topology coordinator could handle the + // joining node as the first one and skip the necessary join node + // handshake. + return !_topology_state_machine._topology.normal_nodes.empty(); }); auto& g0_server = _group0->group0_server();