231 Commits

Author SHA1 Message Date
Dawid Medrek
d065d6f05d db/hints: Log when ignoring invalid hint directories
In 58784cd, aa4b06a and other commits migrating
hinted handoff from IPs to host IDs (scylladb/scylladb#15567),
we started ignoring hint directories of invalid names,
i.e. those that represent neither an IP address, nor a host ID.
They remain on disk and are taken into account while computing
e.g. the total size of hints, but they're not used in any way.

These changes add logs informing the user when Scylla
encounters such a directory.

Closes scylladb/scylladb#17566

(cherry picked from commit a5528a2093)

Closes scylladb/scylladb#19892
2024-08-07 10:55:06 +02:00
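The classification described in d065d6f05d (IP address, host ID, or neither) can be sketched roughly as follows. This is a Python illustration and not Scylla's C++ code: the function names, the logger name, and the log text are all assumptions, and a host ID is assumed to be a UUID written in its canonical hyphenated form.

```python
import ipaddress
import logging
import uuid

logger = logging.getLogger("hints_sketch")  # hypothetical logger name


def is_ip_address(name: str) -> bool:
    """True if the directory name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def is_host_id(name: str) -> bool:
    """True if the directory name is a UUID in canonical hyphenated form."""
    try:
        return str(uuid.UUID(name)) == name.lower()
    except ValueError:
        return False


def classify_hint_directory(name: str) -> str:
    """Return "ip", "host_id", or "invalid"; log when the name is invalid."""
    if is_ip_address(name):
        return "ip"
    if is_host_id(name):
        return "host_id"
    # The directory stays on disk; it is only reported and then skipped.
    logger.warning("Ignoring hint directory with an invalid name: %s", name)
    return "invalid"
```

In this sketch an invalid directory is never deleted, which matches the commit message: such directories remain on disk and are merely not used.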
Michael Litvak
df0503afd6 db/hints: migrate sync point to host ID
Change the format of sync points to use host ID instead of IPs, to be
consistent with the use of host IDs in hinted handoff module.
Introduce sync point v3 format which is the same as v2 except it stores
host IDs instead of IPs.
The encoding of sync points now always uses the new v3 format with host
IDs.
The decoding supports both formats with host IDs and IPs, so a sync point
now contains a variant of either type, and in the case of the new
format the translation from IP to host ID is avoided.
2024-07-31 18:00:28 +02:00
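The idea in df0503afd6 of always encoding in v3 while decoding both v2 and v3 might look roughly like the Python sketch below. The dictionary layout used here is purely hypothetical and stands in for the real serialized format, which this log does not describe; only the version numbers and the IP versus host ID distinction come from the commit message.

```python
import ipaddress
import uuid
from typing import Dict, Mapping, Union

Address = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]
NodeKey = Union[Address, uuid.UUID]  # the "variant" of either key type


def encode_sync_point(positions: Mapping[uuid.UUID, int]) -> dict:
    # Encoding always produces the newest (v3) layout, keyed by host ID.
    return {"version": 3, "positions": {str(h): p for h, p in positions.items()}}


def decode_sync_point(raw: Mapping) -> Dict[NodeKey, int]:
    # v2 entries are keyed by IP, v3 entries by host ID.
    version = raw["version"]
    if version == 2:
        parse = ipaddress.ip_address
    elif version == 3:
        parse = uuid.UUID
    else:
        raise ValueError(f"unsupported sync point version: {version}")
    return {parse(key): pos for key, pos in raw["positions"].items()}


def to_host_ids(decoded: Mapping[NodeKey, int],
                ip_to_host_id: Mapping[Address, uuid.UUID]) -> Dict[uuid.UUID, int]:
    # Only IP keys need a lookup; host ID keys pass through unchanged.
    return {key if isinstance(key, uuid.UUID) else ip_to_host_id[key]: pos
            for key, pos in decoded.items()}
```

With a v3 sync point, `to_host_ids` never consults the IP mapping, which is the translation the commit message says is avoided.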
Dawid Medrek
7201efc2f2 db/hints: Initialize endpoint managers only for valid hint directories
Before these changes, it could happen that Scylla initialized
endpoint managers for hint directories representing

* host IDs before migrating hinted handoff to using host IDs,
* IP addresses after the migration.

One scenario looked like this:

1. Start Scylla and upgrade the cluster to using host IDs.
2. Create, by hand, a hint directory representing an IP address.
3. Trigger changing the host filter in hinted handoff; it could
   be achieved by, for example, restricting the set of data
   centers Scylla is allowed to save hints for.

When changing the host filter, we browse the hint directories
and create endpoint managers if we can send hints towards
the node corresponding to a given hint directory. We only
accepted hint directories representing IP addresses
and host IDs. However, we didn't check whether the local node
has already been upgraded to host-ID-based hinted handoff
or not. As a result, endpoint managers were created for
both IP addresses and host IDs, no matter whether we were
before or after the migration.

These changes make sure that any time we browse the hint
directories, we take that into account.

Fixes scylladb/scylladb#19172

(cherry picked from commit c9bb0a4da6)

Closes scylladb/scylladb#19426
2024-06-23 19:32:57 +03:00
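The check added in 7201efc2f2 can be pictured as a filter that depends on the migration state. The Python sketch below is an interpretation of the commit message, not the actual implementation: `directories_to_manage` and the `uses_host_ids` flag are hypothetical names, and host IDs are assumed to be canonical UUIDs.

```python
import ipaddress
import uuid
from typing import Iterable, List


def _kind(name: str) -> str:
    """Classify a hint directory name as "ip", "host_id", or "invalid"."""
    try:
        ipaddress.ip_address(name)
        return "ip"
    except ValueError:
        pass
    try:
        if str(uuid.UUID(name)) == name.lower():
            return "host_id"
    except ValueError:
        pass
    return "invalid"


def directories_to_manage(names: Iterable[str], uses_host_ids: bool) -> List[str]:
    """Keep only the directories that match the node's current mode."""
    wanted = "host_id" if uses_host_ids else "ip"
    return [name for name in names if _kind(name) == wanted]
```

In the scenario from the commit message, a hand-made IP directory on an already migrated node would be filtered out instead of getting an endpoint manager.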
Dawid Medrek
fc3d2d8fde db/hints: Introduce an error injection to test draining
We want to verify that a hint directory is drained
when any of the nodes corresponding to it leaves
the cluster. The test scenario should happen before
the whole cluster has been migrated to
the host-ID-based hinted handoff, so when we still
rely on the mappings between hint endpoint managers
and the hint directories managed by them.

To make such a test possible, in these changes we
introduce an error injection rejecting incoming
hints. We want to test a scenario when:

1. hints are saved towards a given node -- node N1,
2. N1 changes its IP to a different one,
3. some other node -- node N2 -- changes its IP
   to the original IP of N1,
4. hints are saved towards N2 and they are stored
   in the same directory as the hints saved towards
   N1 before,
5. we start draining N2.

Because at some point N2 needs to be stopped,
it may happen that some mutations towards
a distributed system table generate a hint
to N2 BEFORE it has finished changing its IP,
effectively creating another hint directory
where ALL of the hints towards the node
will be stored from there on. That would disturb
the test scenario. Hence, this error injection is
necessary to ensure that all of the steps in the
test proceed as expected.

(cherry picked from commit e855794327)
2024-06-04 14:42:09 +00:00
Dawid Medrek
82d635b6a7 db/hints: Ensure that draining happens
Before hinted handoff is migrated to using host IDs
to identify nodes in the cluster, we keep track
of mappings between hint endpoint managers
identified by host IDs and the hint directories
managed by them and represented by IP addresses.
As a consequence, it may happen that one hint
directory corresponds to multiple nodes
-- it's intended. See 64ba620 for more details.

Before these changes, we only started the draining
process of a hint directory if the node leaving
the cluster corresponded to that hint directory
AND was identified by the same host ID as
the hint endpoint manager managing that directory.
As a result, the draining did not always happen
when it was supposed to.

Draining should start no matter which of the nodes
corresponding to a hint directory is leaving
the cluster. This commit ensures that it happens.

(cherry picked from commit 745a9c6ab8)
2024-06-04 14:42:08 +00:00
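The rule from 82d635b6a7, that a directory is drained when any node mapped to it leaves, can be sketched as below. `HintDirectoryMap` and its methods are hypothetical names in a Python illustration; the real bookkeeping lives in Scylla's C++ code and may be organized differently.

```python
from collections import defaultdict
from typing import Dict, List, Set


class HintDirectoryMap:
    """Hypothetical map from an IP-named hint directory to the host IDs using it."""

    def __init__(self) -> None:
        self._host_ids_by_dir: Dict[str, Set[str]] = defaultdict(set)

    def add(self, ip_dir: str, host_id: str) -> None:
        # Several nodes may end up sharing one directory after IP changes.
        self._host_ids_by_dir[ip_dir].add(host_id)

    def directories_to_drain(self, leaving_host_id: str) -> List[str]:
        # Drain every directory the leaving node corresponds to, regardless
        # of which host ID identifies the manager that owns the directory.
        return sorted(ip_dir for ip_dir, host_ids in self._host_ids_by_dir.items()
                      if leaving_host_id in host_ids)
```

The earlier behaviour corresponds to checking only the single host ID that identifies the managing endpoint manager, which misses the other nodes that share the directory.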
Pavel Emelyanov
b24fb8dc87 inet_address: Remove to_sstring() in favor of fmt::to_string
The existing inet_address::to_string() calls fmt::format("{}", *this)
anyway. However, the to_string() method is declared in a .cc file, while
the formatter is in the header and is equipped with constexprs so
that converting an address to a string is done at compile time
as much as possible.

Also, though minor, fmt::to_string(foo) is believed to be even faster
than fmt::format("{}", foo).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#18712
2024-05-21 09:43:08 +03:00
Dawid Medrek
c9bbb92b1a db/hints: Remove migrating flag before initializing endpoint managers
Before these changes, if initializing endpoint
managers threw an exception after the migration
of hinted handoff to host IDs was done, we
didn't remove the flag indicating the migration
is still in progress. However, the migration
has, in practice, finished -- all of the
hint directories have been mapped to host IDs
and all of the nodes in the cluster are
host-ID-based. Because of that, it makes sense
to remove the flag early on.
2024-05-13 16:40:47 +02:00
Dawid Medrek
bdcde0c210 db/hints: Prevent segmentation fault when initializing endpoint managers
If hinted handoff is still IP-based and there is
a hint directory representing an IP without
a corresponding mapping to a host ID in
`locator::token_metadata`, an attempt to initialize
its endpoint manager will result in a segmentation
fault. This commit prevents that.
2024-05-13 16:40:47 +02:00
Kefu Chai
2a9a874e19 db,service: fix typos in comments
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18567
2024-05-09 08:26:44 +03:00
Dawid Medrek
46ab22f805 db/hints: Add endpoint_downtime_not_bigger_than()
We add an auxiliary function checking if a node
hasn't been down for too long. Although
`gms::gossiper` already exposes a function
responsible for that, it requires that its
argument be an IP address. That's the reason
we add a new function.
2024-04-28 01:22:59 +02:00
Dawid Medrek
0ef8d67d32 db/hints: Migrate hinted handoff when cluster feature is enabled
These changes migrate hinted handoff to using
host ID as soon as the corresponding cluster
feature is enabled.

When a node starts, it defaults to creating
directories naming them after IP addresses.
When the whole cluster has upgraded
to a version of Scylla that can handle
directories representing host IDs,
we perform a migration of the IP folders,
i.e. we try to rename them to host IDs.
Invalid directories, i.e. those that
represent neither an IP address, nor a host
ID, are removed.

During the migration, hinted handoff is
disabled. It is necessary because we have
to modify the disk's contents, so new hints
cannot be saved until the migration finishes.
2024-04-28 01:22:57 +02:00
Dawid Medrek
58784cd8db db/hints: Handle arbitrary directories in resource manager
Before these changes, resource manager only handled
the case when directories it browsed represented
valid host IDs. However, since before migrating
hinted handoff to using host IDs we still name
directories after IP addresses, that would lead
to exceptions that shouldn't happen.

We make resource manager handle directories
of arbitrary names correctly.
2024-04-27 22:31:07 +02:00
Dawid Medrek
ee84e810ca db/hints: Start using hint_directory_manager
We start keeping track of mappings IP - host ID.
The mappings are between endpoint managers
(identified by host IDs) and the hint directories
managed by them (represented by IP addresses).

This is a prelude to handling IP directories
by the hint shard manager.

The structure should only be used by the hint
manager before it's migrated to using host IDs.
The reason for that is that we rely on the
information obtained from the structure, but
it might not make sense later on.

When we start creating directories named after
host IDs and there are no longer directories
representing IP addresses, there is no relation
between host IDs and IPs -- just because
the structure is supposed to keep track of mappings between
endpoint managers and hint directories that
represent IP addresses. If they represent
host IDs, the connection between the two
is lost.

Still using the data structure could lead
to bugs, e.g. if we tried to associate
a given endpoint manager's host ID with its
corresponding IP address from
locator::token_metadata, it could happen that
two different host IDs would be bound to
the same IP address by the data structure:
node A has IP I1, node A changes its IP to I2,
node B changes its IP to I1. Though nodes
A and B have different host IDs (because they
are unique), the code would try to save hints
towards node B in node A's hint directory,
which should NOT happen.

Relying on the data structure is thus only
safe before migrating hinted handoff to using
host IDs. It may happen that we save a hint
in the hint directory of the wrong node indeed,
but since migration to using host IDs is
a process that only happens once, it's a price
we are ready to pay. It's only imperative to
prevent it from happening in normal
circumstances.
2024-04-27 22:31:07 +02:00
Dawid Medrek
aa4b06a895 db/hints: Enforce providing IP in get_ep_manager()
We drop the default argument in the function's signature.
Also, we adjust the code of change_host_filter() to
be able to perform calls to get_ep_manager().
2024-04-27 22:31:07 +02:00
Dawid Medrek
934e4bb45e db/hints: Add function for migrating hint directories to host ID
We add a function that will be used while
migrating hinted handoff to using host IDs.
It iterates over existing hint directories
and tries to rename them to the corresponding
host IDs. If renaming a directory fails, we remove
that directory so that at the end of its execution
the only remaining directories are those
that represent host IDs.
2024-04-27 22:31:04 +02:00
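The migration step described in 934e4bb45e (rename each directory to its host ID, remove it if that fails) might be sketched like this. It is a Python illustration under several assumptions: `migrate_hint_directories` and `ip_to_host_id` are hypothetical names, the mapping is a plain dictionary rather than a lookup in cluster metadata, and a directory with no known host ID is treated as a failed rename.

```python
import ipaddress
import shutil
import uuid
from pathlib import Path
from typing import Mapping


def _is_ip(name: str) -> bool:
    try:
        ipaddress.ip_address(name)
        return True
    except ValueError:
        return False


def _is_host_id(name: str) -> bool:
    try:
        return str(uuid.UUID(name)) == name.lower()
    except ValueError:
        return False


def migrate_hint_directories(root: Path, ip_to_host_id: Mapping[str, str]) -> None:
    """Rename IP-named directories to host IDs; remove whatever cannot be renamed."""
    for entry in list(root.iterdir()):
        if not entry.is_dir() or _is_host_id(entry.name):
            continue  # already migrated, or not a hint directory at all
        host_id = ip_to_host_id.get(entry.name) if _is_ip(entry.name) else None
        renamed = False
        if host_id is not None:
            try:
                entry.rename(root / host_id)
                renamed = True
            except OSError:
                pass  # e.g. the target already exists and is not empty
        if not renamed:
            shutil.rmtree(entry, ignore_errors=True)
```

After the call, the only directories left under `root` are named after host IDs, which is the postcondition the commit message states.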
Dawid Medrek
e36f853f9b db/hints: Take both IP and host ID when storing hints
The store_hint() method starts taking both an IP
and a host ID as its arguments. The rationale
for the change is that, depending on the stage of
the cluster (before an upgrade to the
host-ID-based hinted handoff or after it),
we might need to create a directory representing
either an IP address, or a host ID.

Because locator::topology can change between
the moment we obtain the host ID we pass
and the moment the function is executed,
we need to pass both parameters explicitly
to ensure the consistency between them.
2024-04-27 20:35:58 +02:00
Dawid Medrek
063d4d5e91 db/hints: Prepare initializing endpoint managers for migrating from IP to host ID
We extract the initialization of endpoint managers
from the start method of the hint manager
to a separate function and make it handle directories
that represent either IP addresses, or host IDs;
other directories are ignored.

It's necessary because before Scylla is upgraded
to a version that uses host-ID-based hinted handoff,
we need to continue only managing IP directories.
When Scylla has been upgraded, we will need to handle
host ID directories.

It may also happen that after an upgrade (but not
before it), Scylla fails while renaming
the directories, so we end up with some of them
representing IP addresses, and some representing
host IDs. After these changes, the code handles
that scenario as well.
2024-04-27 20:35:53 +02:00
Dawid Medrek
cfd03fe273 db/hints: Migrate to locator::host_id
We change the type of node identifiers
used within the module and fix compilation.
Directories storing hints to specific nodes
are now represented by host IDs instead of
IPs.
2024-04-26 22:44:04 +02:00
Dawid Medrek
c585444c60 db/hints: Fix indentation 2024-04-26 22:44:03 +02:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map,
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Dawid Medrek
b36becc1f3 db/hints: Fix too_many_in_flight_hints_for
The semantics of the function was accidentally
modified in 6e79d64. The consequence of the change
was that we didn't limit memory consumption:
the function always returned false for any node
different from the local node. The returned value
is used by storage_proxy to decide whether it
is able to store a hint or not.

This commit fixes the problem by taking other
nodes into consideration again.

Fixes #17636

Closes scylladb/scylladb#17639
2024-03-06 09:48:30 +02:00
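The effect of the fix in b36becc1f3 can be illustrated with a small predicate. This Python sketch only models what the commit message states (the check must apply to nodes other than the local one); the parameter names and the value of the cap are assumptions, and the real function may consider further conditions.

```python
MAX_SIZE_OF_HINTS_IN_PROGRESS = 10 * 1024 * 1024  # hypothetical cap, in bytes


def too_many_in_flight_hints_for(endpoint: str, local_node: str,
                                 size_of_hints_in_progress: int) -> bool:
    """Tell the caller whether storing another hint for `endpoint` should be refused."""
    over_budget = size_of_hints_in_progress > MAX_SIZE_OF_HINTS_IN_PROGRESS
    # Hints are written for remote nodes, so the cap has to apply to them.
    # The regression made the function return False for every endpoint other
    # than the local node, so memory use was effectively unbounded.
    return over_budget and endpoint != local_node
```

A caller in the role of storage_proxy would skip storing a hint whenever this returns True.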
Pavel Emelyanov
7c5c89ba8d Revert "Merge 'Use utils::directories instead of db::config to get dirs' from Patryk Wróbel"
This reverts commit 370fbd346c, reversing
changes made to 0912d2a2c6.

This makes scylla-manager mis-interpret the data_file_directories
somehow, issue #17078
2024-01-31 15:08:14 +03:00
Patryk Wrobel
781a6a5071 utils/directories: make utils::directories::set an internal type
Previously, utils::directories::set could have been used by
clients of utils::directories class to provide dirs for creation.
Due to moving the responsibility for providing paths of dirs from
db::config to utils::directories, such usage is no longer the case.

This change:
 - defines utils::directories::set in utils/directories.cc to disallow
   its usage by the clients of utils::directories
 - makes utils::directories::create_and_verify() member function
   private; now it is used only by the internals of the class
 - introduces a new member function to utils::directories called
   create_and_verify_sharded_directory() to limit the functionality
   provided to clients

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
2024-01-29 13:20:41 +01:00
Kefu Chai
be364d30fd db: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16664
2024-01-09 11:44:19 +02:00
Benny Halevy
6e79d647e6 db/hints/manager: use locator::topology rather than fb_utilities
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-05 08:42:49 +02:00
Dawid Medrek
ddc385bce0 db/hints: Remove an unused namespace 2023-10-06 13:25:30 +02:00
Dawid Medrek
76d414012b db/hints: Coroutinize change_host_filter() 2023-10-06 13:25:30 +02:00
Dawid Medrek
09eb30e6f1 db/hints: Coroutinize drain_for()
This commit turns the function into a coroutine
and makes the code less compact and more readable.
2023-10-06 13:25:30 +02:00
Dawid Medrek
907a572e24 db/hints: Clean up can_hint_for()
This commit gets rid of unnecessary additional calls to functions
and makes all lines abide by the limit of 120 characters.
2023-10-06 13:25:30 +02:00
Dawid Medrek
596e1f9859 db/hints: Clean up store_hint()
This commit makes the function abide by the limit
of 120 characters per line.
2023-10-06 13:25:30 +02:00
Dawid Medrek
8a43f94ca6 db/hints: Clean up too_many_in_flight_hints_for()
This commit makes the return statement more readable.
It also makes the comment abide by the limit of 120 characters per line.
2023-10-06 13:25:30 +02:00
Dawid Medrek
96a5906621 db/hints: Refactor get_ep_manager() 2023-10-06 13:25:30 +02:00
Dawid Medrek
8b591be3c3 db/hints: Coroutinize wait_for_sync_point()
This commit coroutinizes the function and adds
a comment explaining a non-trivial case.
2023-10-06 13:25:27 +02:00
Dawid Medrek
fee3aafd80 db/hints: Use std::span in calculate_current_sync_point
std::span is a lot more flexible than std::vector as it allows
for arbitrary contiguous ranges.
2023-10-06 12:36:05 +02:00
Dawid Medrek
64fd4d6323 db/hints: Clean up manager::forbid_hints_for_eps_with_pending_hints() 2023-10-06 12:26:55 +02:00
Dawid Medrek
58cd5c4167 db/hints: Clean up manager::forbid_hints() 2023-10-06 12:26:55 +02:00
Dawid Medrek
f8ed93f5bc db/hints: Clean up manager::allow_hints() 2023-10-06 12:26:52 +02:00
Dawid Medrek
bfe32bcf89 db/hints: Coroutinize compute_hints_dir_device_id() 2023-10-06 12:18:30 +02:00
Dawid Medrek
8f28eb6522 db/hints: Clean up manager::stop()
This commit gets rid of boilerplate in the function,
leverages a range pipe and explicit types to make
the code more readable, and changes the logs to
make it clearer what happens.
2023-10-06 12:18:30 +02:00
Dawid Medrek
a384caece0 db/hints: Clean up manager::start()
This commit coroutinizes the function and makes it less compact.
2023-10-06 12:18:30 +02:00
Dawid Medrek
2db97aaf81 db/hints/manager: Clean up the constructor
fmt::to_string should be preferred to seastar::format.
It's clearer and simpler. Besides that, this commit makes
the code abide by the limit of 120 characters per line.
2023-10-06 12:18:30 +02:00
Dawid Medrek
6c10a86791 db/hints: Remove boilerplate drain_lock() 2023-10-06 12:18:30 +02:00
Dawid Medrek
f1f35ba819 db/hints: Let drain_for() return a future
Currently, the function doesn't return anything.
However, if the future doesn't need to be awaited,
the caller can decide that. There is no reason
to make that decision in the function itself.
2023-10-06 12:18:25 +02:00
Dawid Medrek
79e1412f14 db/hints: Remove ep_managers_end
The methods are redundant and are effectively
code boilerplate.
2023-10-06 12:15:04 +02:00
Dawid Medrek
cfbacb29bb db/hints: Remove find_ep_manager
The methods are redundant and are effectively
code boilerplate.
2023-10-06 12:15:04 +02:00
Dawid Medrek
1c70a18fc7 db/hints: Use manager as API for hint_endpoint_manager
This commit makes with_file_update_mutex() a method of hint_endpoint_manager
and introduces db::hints::manager::with_file_update_mutex_for() for accessing
it from the outside. This way, hint_endpoint_manager is hidden and no one
needs to know about its existence.
2023-10-06 12:15:01 +02:00
Dawid Medrek
d068143b83 db/hints: Don't mark have_ep_manager()'s definition as inline
Doing that doesn't allow for external linkage, so
it's not accessible from other files.
2023-10-06 11:54:15 +02:00
Dawid Medrek
4663f72990 db/hints: Move ~manager() and mark it as noexcept
The destructor is trivial and there is no reason
to keep it in the source file. We mark it as noexcept too.
2023-10-06 11:54:15 +02:00
Dawid Medrek
18a2831186 db/hints: Use reference for storage proxy
This commit makes db::hints::manager store service::storage_proxy
as a reference instead of a seastar::shared_ptr. The manager is
owned by storage proxy, so it only lives as long as storage proxy
does. Hence, it makes little sense to store the latter as a shared
pointer; in fact, it's very confusing and may be error-prone.
The field never changes, so it's safe to keep it as a reference
(especially because copy and move constructors of db::hints::manager
are both deleted). What's more, we ensure that the hint manager
has access to storage proxy as soon as it's created.

The same changes were applied to db::hints::resource_manager.
The rationale is the same.
2023-10-06 11:54:15 +02:00
Dawid Medrek
ee5a5c1661 db/hints: Capitalize constants
This is a common convention. Follow it for readability.
2023-10-06 11:54:15 +02:00