scylladb/service
Kamil Braun 5acfcd8ef5 Merge 'raft: send group0 RPCs only if the destination group0 server is seen as alive' from Piotr Dulikowski
In topology on raft mode, the events "new node starts its group0 server"
and "new node is added to group0 configuration" are not synchronized
with each other. Therefore it might happen that the cluster starts
sending commands to the new node before the node starts its server. This
might lead to harmless but noisy log messages such as:

    INFO  2023-09-27 15:42:42,611 [shard 0:stat] rpc - client
    127.0.0.1:56352 msg_id 2:  exception "Raft group
    b8542540-5d3b-11ee-99b8-1052801f2975 not found" in no_wait handler
    ignored

In order to solve this, the failure detector verb is extended to report
information about whether group0 is alive. The raft rpc layer will drop
messages to nodes whose group0 is not seen as alive.
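
The approach described above can be sketched roughly as follows. This is a
minimal, self-contained illustration and not the actual ScyllaDB code: the
types `liveness_view` and `rpc_sketch`, the `server_id` alias and all method
names are hypothetical stand-ins. Only the name `destination_not_alive_error`
is taken from the commit list below, and its use for two-way calls (as
opposed to silently dropping one-way messages) is an assumption made for the
sake of the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a raft server identifier.
using server_id = std::uint64_t;

// Hypothetical liveness view fed by failure detector pings. Each ping reply
// is assumed to carry a flag saying whether the remote group0 server runs.
class liveness_view {
    std::unordered_map<server_id, bool> _group0_alive;
public:
    // The node answered the ping and reported its group0 state.
    void on_ping_reply(server_id id, bool group0_alive) { _group0_alive[id] = group0_alive; }
    // The ping failed, so the node is treated as not alive.
    void on_ping_failure(server_id id) { _group0_alive[id] = false; }
    // Nodes that were never pinged successfully are treated as not alive.
    bool is_alive(server_id id) const {
        auto it = _group0_alive.find(id);
        return it != _group0_alive.end() && it->second;
    }
};

// Error reported when a call is refused because the destination is not alive.
struct destination_not_alive_error : std::runtime_error {
    explicit destination_not_alive_error(server_id id)
        : std::runtime_error("destination " + std::to_string(id) + " is not seen as alive") {}
};

// Hypothetical RPC layer that consults the liveness view before sending.
class rpc_sketch {
    const liveness_view& _fd;
    std::vector<server_id> _sent; // records destinations instead of doing network I/O
public:
    explicit rpc_sketch(const liveness_view& fd) : _fd(fd) {}

    // One-way message: dropped if the destination's group0 is not alive.
    // Returns true if the message was "sent".
    bool send_one_way(server_id dst) {
        if (!_fd.is_alive(dst)) {
            return false;
        }
        _sent.push_back(dst);
        return true;
    }

    // Two-way call (assumption): fails with an explicit error instead.
    void send_two_way(server_id dst) {
        if (!_fd.is_alive(dst)) {
            throw destination_not_alive_error(dst);
        }
        _sent.push_back(dst);
    }

    std::size_t sent_count() const { return _sent.size(); }
};
```

In this sketch a joining node that answers pings but has not yet started its
group0 server would be recorded with `on_ping_reply(id, false)`, so nothing
is sent to it until a later ping reports `true`. That is the window in which
the "Raft group ... not found" message would otherwise be logged.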

Tested by adding a delay before group0 is started on the joining node,
running all topology tests and grepping for the aforementioned log
messages.

Fixes: scylladb/scylladb#15853
Fixes: scylladb/scylladb#15167

Closes scylladb/scylladb#16071

* github.com:scylladb/scylladb:
  raft: rpc: introduce destination_not_alive_error
  raft: rpc: drop RPCs if the destination is not alive
  raft: pass raft::failure_detector to raft_rpc
  raft: transfer information about group0 liveness in direct_fd_ping
  raft: add server::is_alive
2023-11-24 10:34:05 +01:00