scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-09 16:33:35 +00:00

Kamil Braun 060f2de14e Merge 'Cluster features on raft: new procedure for joining group 0' from Piotr Dulikowski

This PR implements a new procedure for joining nodes to group 0, based on the description in the "Cluster features on Raft (v2)" document. This is a continuation of the previous PRs related to cluster features on raft (https://github.com/scylladb/scylladb/pull/14722, https://github.com/scylladb/scylladb/pull/14232), and the last piece necessary to replace cluster feature checks in gossip.

The current implementation relies on the gossip shadow round to fetch the set of enabled features, determine whether the node supports all of them, and join only if it is safe. As we are moving management of cluster features to group 0, we encounter a problem: the contents of group 0 itself may depend on features, so it is not safe to join it without first performing the feature check, yet that check depends on information stored in group 0. This is a dependency cycle.

In order to solve this problem, the algorithm for joining group 0 is modified, and verification of features and other parameters is offloaded to an existing node in group 0. Instead of directly asking the discovery leader to unconditionally add the node to the configuration with `GROUP0_MODIFY_CONFIG`, two different RPCs are added: `JOIN_NODE_REQUEST` and `JOIN_NODE_RESPONSE`. The main idea is as follows:

- The new node sends `JOIN_NODE_REQUEST` to the discovery leader. The request carries information describing the node, including its supported cluster features. The discovery leader verifies some of the parameters and adds the node in the `none` state to `system.topology`.
- The topology coordinator picks up the request for the node to be joined (i.e. the node in `none` state), verifies its properties - including cluster features - and then:
	- If the node is accepted, the coordinator transitions it to the `bootstrap`/`replace` state and transitions the topology to the `join_group0` state. The node is added to group 0 and then `JOIN_NODE_RESPONSE` is sent to it with information that the node was accepted.
	- Otherwise, the node is moved to the `left` state, the coordinator tells it via `JOIN_NODE_RESPONSE` that it was rejected, and the node shuts down.
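
The flow above can be sketched as a small in-memory model. This is an illustration only, written in Python rather than the C++ of the actual code. The `Topology` class, its fields and method names, and the response dictionary are hypothetical; only the state names (`none`, `bootstrap`, `replace`, `left`, `join_group0`) and the RPC names come from the description above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topology:
    """Toy stand-in for system.topology plus the coordinator's decision."""
    enabled_features: set
    node_state: dict = field(default_factory=dict)     # host_id -> node state
    node_features: dict = field(default_factory=dict)  # host_id -> supported features
    group0_members: set = field(default_factory=set)
    transition_state: Optional[str] = None

    def join_node_request(self, host_id, supported_features):
        """Discovery leader side: record the joining node in the `none` state."""
        self.node_state[host_id] = "none"
        self.node_features[host_id] = frozenset(supported_features)

    def coordinator_step(self, host_id, replacing=False):
        """Coordinator side: verify the node and build the JOIN_NODE_RESPONSE."""
        if self.node_state.get(host_id) != "none":
            raise ValueError(f"{host_id} has no pending join request")
        # The node must support every feature that is already enabled.
        missing = self.enabled_features - self.node_features[host_id]
        if missing:
            self.node_state[host_id] = "left"
            return {"accepted": False, "missing_features": sorted(missing)}
        self.node_state[host_id] = "replace" if replacing else "bootstrap"
        self.transition_state = "join_group0"
        self.group0_members.add(host_id)
        return {"accepted": True}
```

In this model, the joining node never reads group 0 state before it is accepted: the feature comparison runs entirely on a node that is already a member, which is how the dependency cycle is avoided.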

The procedure is not retryable: if a node crashes or otherwise fails partway through, it will not be allowed to retry with the same host_id; `JOIN_NODE_REQUEST` will fail. The data directory must be cleared before attempting to add the node again (so that a new host_id is generated).
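
The retry restriction can be illustrated with a minimal sketch. It assumes, hypothetically, that the receiving node refuses any `JOIN_NODE_REQUEST` whose host_id already has a topology entry; the exact check and error used by the real code are not described here, and every name below is illustrative.

```python
import uuid


class JoinRejected(Exception):
    """Hypothetical error for a refused JOIN_NODE_REQUEST."""


def handle_join_node_request(known_hosts: dict, host_id: str) -> None:
    # Assumption: a host_id already recorded, in any state, may not join again.
    if host_id in known_hosts:
        raise JoinRejected(f"host_id {host_id} has already attempted to join")
    known_hosts[host_id] = "none"


def new_host_id() -> str:
    # Models clearing the data directory: a fresh host_id is generated.
    return str(uuid.uuid4())
```

A second request with the same host_id raises `JoinRejected`, while a request made after `new_host_id()` goes through as a brand-new node.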

More details about the procedure and the RPC are described in `topology-over-raft.md`.

Fixes: #15152

Closes scylladb/scylladb#15196

* github.com:scylladb/scylladb:
  tests: mark test_blocked_bootstrap as skipped
  storage_service: do not check features in shadow round
  storage_service: remove raft_{boostrap,replace}
  topology_coordinator: relax the check in enable_features
  raft_group0: insert replaced node info before server setup
  storage_service: use join node rpc to join the cluster
  topology_coordinator: handle joining nodes
  topology_state_machine: add join_group0 state
  storage_service: add join node RPC handlers
  raft: expose current_leader in raft::server
  storage_service: extract wait_for_live_nodes_timeout constant
  raft_group0: abstract out node joining handshake
  storage_service: pass raft_topology_change_enabled on rpc init
  rpc: add new join handshake verbs
  docs: document the new join procedure
  topology_state_machine: add supported_features to replica_state
  storage_service: check destination host ID in raft verbs
  group_state_machine: take reference to raft address map
  raft_group0: expose joined_group0

2023-09-28 11:45:09 +02:00