scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 04:26:48 +00:00

Files

Kamil Braun bbcf8305bb storage_service: don't calculate ignore_nodes before waiting for normal handlers

Before this commit the `wait_for_normal_state_handled_on_boot` would
wait for a static set of nodes (`sync_nodes`), calculated using the
`get_nodes_to_sync_with` function and `parse_node_list`; the latter was
used to obtain a list of "nodes to ignore" (for replace operation) and
translate them, using `token_metadata`, from IP addresses to Host IDs
and vice versa. `sync_nodes` was also used in `_gossiper.wait_alive` call
which we do after `wait_for_normal_state_handled_on_boot`.

Recently we started doing these calculations and this wait very early in
the boot procedure - immediately after we start gossiping
(50e8ec77c6).

Unfortunately, as always with gossiper, there are complications.
In #14468 and #14487 two problems were detected:
- Gossiper may contain obsolete entries for nodes which were recently
  replaced or changed their IPs. These entries are still using status
  `NORMAL` or `shutdown` (which is treated like `NORMAL`, e.g.
  `handle_state_normal` is also called for it). The
  `_gossiper.wait_alive` call would wait for those entries too and
  eventually time out.
- Furthermore, by the time we call `parse_node_list`, `token_metadata`
  may not be populated yet, which is required to do the IP<->Host ID
  translations -- and populating `token_metadata` happens inside
  `handle_state_normal`, so we have a chicken-and-egg problem here.

The `parse_node_list` problem is solved in this commit. It turns out
that we don't need to calculate `sync_nodes` (and hence `ignore_nodes`)
in order to wait for NORMAL state handlers. We can wait for handlers to
finish for *any* `NORMAL`/`shutdown` entries appearing in gossiper, even
those that correspond to dead/ignored nodes and obsolete IPs.
`handle_state_normal` is called, and eventually finishes, for all of
them. `wait_for_normal_state_handled_on_boot` no longer receives a set
of nodes as parameter and is modified appropriately, it's now
calculating the necessary set of nodes on each retry (the set may shrink
while we're waiting, e.g. because an entry corresponding to a node that
was replaced is garbage-collected from gossiper state).

Thanks to this, we can now put the `sync_nodes` calculation (which is
still necessary for `_gossiper.wait_alive`), and hence the
`parse_node_list` call, *after* we wait for NORMAL state handlers,
solving the chickend-and-egg problem.

This addresses the immediate failure described in #14487, but the test
will still fail. That's because `_gossiper.wait_alive` may still receive
a too large set of nodes -- we may still include obsolete IPs or entries
corresponding to replaced nodes in the `sync_nodes` set. We fix this
in the following commit which will solve both issues.

2023-07-06 10:24:44 +02:00

broadcast_tables/experimental

treewide: do not define/capture unused variables

2023-02-15 22:57:18 +02:00

pager

Merge 'Optimize topology::compare_endpoints' from Benny Halevy

2023-03-07 15:17:19 +02:00

paxos

lwt: do not destroy capture in upgrade_if_needed lambda since the lambda is used more then once

2023-02-26 16:13:16 +02:00

qos

code: Switch to seastar API level 7