db/hints: Initialize endpoint managers only for valid hint directories

Before these changes, it could happen that Scylla initialized
endpoint managers for hint directories representing

* host IDs before migrating hinted handoff to using host IDs,
* IP addresses after the migration.

One scenario looked like this:

1. Start Scylla and upgrade the cluster to using host IDs.
2. Create, by hand, a hint directory representing an IP address.
3. Trigger changing the host filter in hinted handoff; it could
   be achieved by, for example, restricting the set of data
   centers Scylla is allowed to save hints for.

When changing the host filter, we browse the hint directories
and create endpoint managers if we can send hints towards
the node corresponding to a given hint directory. We only
accepted hint directories representing IP addresses
and host IDs. However, we didn't check whether the local node
has already been upgraded to host-ID-based hinted handoff
or not. As a result, endpoint managers were created for
both IP addresses and host IDs, no matter whether we were
before or after the migration.

These changes make sure that any time we browse the hint
directories, we take that into account.

Fixes scylladb/scylladb#19172

Closes scylladb/scylladb#19173
This commit is contained in:
Dawid Medrek
2024-06-07 17:13:27 +02:00
committed by Piotr Dulikowski
parent 3cfb0503a9
commit 2446cce272
3 changed files with 31 additions and 6 deletions

View File

@@ -547,10 +547,16 @@ future<> manager::change_host_filter(host_filter filter) {
const auto maybe_host_id_and_ip = std::invoke([&] () -> std::optional<pair_type> {
try {
locator::host_id_or_endpoint hid_or_ep{de.name};
if (hid_or_ep.has_host_id()) {
// If hinted handoff is host-ID-based, hint directories representing IP addresses must've
// been created by mistake and they're invalid. The same for pre-host-ID hinted handoff
// -- hint directories representing host IDs are NOT valid.
if (hid_or_ep.has_host_id() && _uses_host_id) {
return std::make_optional(pair_type{hid_or_ep.id(), hid_or_ep.resolve_endpoint(*tmptr)});
} else {
} else if (hid_or_ep.has_endpoint() && !_uses_host_id) {
return std::make_optional(pair_type{hid_or_ep.resolve_id(*tmptr), hid_or_ep.endpoint()});
} else {
return std::nullopt;
}
} catch (...) {
return std::nullopt;
@@ -712,8 +718,6 @@ future<> manager::with_file_update_mutex_for(const std::variant<locator::host_id
return _ep_managers.at(host_id).with_file_update_mutex(std::move(func));
}
// The function assumes that if `_uses_host_id == true`, then there are no directories that represent IP addresses,
// i.e. every directory is either valid and represents a host ID, or is invalid (so it should be ignored anyway).
future<> manager::initialize_endpoint_managers() {
auto maybe_create_ep_mgr = [this] (const locator::host_id& host_id, const gms::inet_address& ip) -> future<> {
if (!check_dc_for(host_id)) {
@@ -747,12 +751,23 @@ future<> manager::initialize_endpoint_managers() {
}
if (_uses_host_id) {
// If hinted handoff is host-ID-based but the directory doesn't represent a host ID,
// it's invalid. Ignore it.
if (!maybe_host_id_or_ep->has_host_id()) {
co_return;
}
// If hinted handoff is host-ID-based, `get_ep_manager` will NOT use the passed IP address,
// so we simply pass the default value there.
co_return co_await maybe_create_ep_mgr(maybe_host_id_or_ep->id(), gms::inet_address{});
}
// If we have got to this line, hinted handoff is still IP-based and we need to map the IP.
if (!maybe_host_id_or_ep->has_endpoint()) {
// If the directory name doesn't represent an IP, it's invalid. We ignore it.
co_return;
}
const auto maybe_host_id = std::invoke([&] () -> std::optional<locator::host_id> {
try {
return maybe_host_id_or_ep->resolve_id(*tmptr);

View File

@@ -323,6 +323,10 @@ public:
void update_backlog(size_t backlog, size_t max_backlog);
bool uses_host_id() const noexcept {
return _uses_host_id;
}
private:
bool stopping() const noexcept {
return _state.contains(state::stopping);

View File

@@ -148,10 +148,16 @@ void space_watchdog::on_timer() {
auto maybe_variant = std::invoke([&] () -> std::optional<std::variant<locator::host_id, gms::inet_address>> {
try {
const auto hid_or_ep = locator::host_id_or_endpoint{de.name};
if (hid_or_ep.has_host_id()) {
// If hinted handoff is host-ID-based, hint directories representing IP addresses must've
// been created by mistake and they're invalid. The same for pre-host-ID hinted handoff
// -- hint directories representing host IDs are NOT valid.
if (hid_or_ep.has_host_id() && shard_manager.uses_host_id()) {
return std::variant<locator::host_id, gms::inet_address>(hid_or_ep.id());
} else {
} else if (hid_or_ep.has_endpoint() && !shard_manager.uses_host_id()) {
return std::variant<locator::host_id, gms::inet_address>(hid_or_ep.endpoint());
} else {
return std::nullopt;
}
} catch (...) {
return std::nullopt;