Commit Graph

48 Commits

Author SHA1 Message Date
Piotr Dulikowski
61ac0a336d hints: send hints with CL=ALL if target is leaving
Currently, when attempting to send a hint, we might choose its
recipients in one of two ways:

- If the original destination is a natural endpoint of the hint, we only
  send the hint to that node and none other,
- Otherwise, we send the hint to all current replicas of the mutation.

There is a problem when we decommission a node: while data is streamed
away from that node, it is still considered to be a natural endpoint of
the data that it used to own. Because of that, it might happen that a
hint is sent directly to it but streaming will miss it, effectively
resulting in the hint being discarded.

As sending the hint _only_ to the leaving replica is a rather bad idea,
send the hint to all replicas also in the case when the original
destination of the hint is leaving.
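
The recipient selection described above can be sketched as follows. This is a simplified illustration, not the code from the commit: `node_id`, `replica_view` and `choose_hint_targets` are hypothetical names, and nodes are plain strings.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a node identifier; not a ScyllaDB type.
using node_id = std::string;

// Hypothetical snapshot of the replica set for one mutation.
struct replica_view {
    std::vector<node_id> natural_endpoints; // current replicas of the mutation
    std::vector<node_id> leaving;           // nodes that are being decommissioned
};

static bool contains(const std::vector<node_id>& nodes, const node_id& n) {
    return std::find(nodes.begin(), nodes.end(), n) != nodes.end();
}

// Returns the nodes a hint should be sent to.
std::vector<node_id> choose_hint_targets(const replica_view& rv,
                                         const node_id& original_destination) {
    const bool is_natural = contains(rv.natural_endpoints, original_destination);
    const bool is_leaving = contains(rv.leaving, original_destination);
    if (is_natural && !is_leaving) {
        // Stable natural endpoint: send the hint to that node and none other.
        return {original_destination};
    }
    // Not a natural endpoint, or a leaving one: send to all current replicas.
    return rv.natural_endpoints;
}
```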

Note that this is a conservative fix written only with the decommission
+ vnode-based keyspaces combo in mind. In general, such "data loss" can
occur in other situations where the replica set is changing and we go
through a streaming phase, i.e. other topology operations in case of
vnodes and tablet load balancing. However, the consistency guarantees of
hinted handoff in the face of topology changes are not defined and it is
not clear what they should be, if there should be any at all. The
picture is further complicated by the fact that hints are used by
materialized views, and sending view updates to more replicas than
necessary can introduce inconsistencies in the form of "ghost rows".
This fix was developed in response to a failing test which checked the
hint replay + decommission scenario, and it makes it work again.

Fixes scylladb/scylla-dtest#4582
Refs scylladb/scylladb#19835
2024-09-08 10:50:59 +02:00
Piotr Dulikowski
8abb06ab82 hints: inline do_send_one_mutation
It's a small method and it is only used once in send_one_mutation.
Inlining it lets us get rid of its declaration in the header - now, if
one needs to change the variables passed from one function to another,
it is no longer necessary to change the header.
2024-09-08 07:19:35 +02:00
Dawid Medrek
d459cf91eb db/hints: Fix indentation in do_store_hint() 2024-08-29 14:47:08 +02:00
Dawid Medrek
75ce6943d0 db/hints: Move code for writing hints to separate function
In scylladb/scylladb@7301a96, in the function `hint_endpoint_manager::store_hint()`,
we transformed the lambda passed to `seastar::with_gate()` to a coroutine lambda
to improve the readability. However, there was a subtle problem related to
lifetimes of the captures that needed to be addressed:

* Since we started `co_await`ing in the lambda, the captures were at risk of
  being destructed too soon. The usual solution is to wrap a coroutine lambda
  within a `seastar::coroutine::lambda` object and rely on the extended lifetime
  enforced by the semantics of the language.
  See `docs/dev/lambda-coroutine-fiasco.md` for more context.

* However, since we don't immediately `co_await` the future returned by
  `with_gate()`, we cannot rely on the extended lifetime provided by the wrapper.
  The document linked in the previous bullet point suggests keeping the passed
  coroutine lambda as a variable and passing it by reference to `with_gate()`.
  However, that's not feasible either because we discard the returned future and
  the function returns almost instantly -- destructing every local object, which
  would encompass the lambda too.

The solution used in the commit was to move captures of the lambda into
the lambda's body. That helped because Seastar's backend is responsible for
keeping all of the local variables alive until the lambda finishes its execution.
However, we didn't move all of the captures into the lambda -- the missing one
was the `this` pointer that was implicitly used in the lambda.

Address sanitiser hasn't reported any bugs related to the pointer yet, but
the bug is most likely there.

In this commit, we transform the lambda's body into a new member function
and only call it from the lambda. This way, we don't need to care about
the lifetimes of the captures because Seastar ensures that the function's
arguments stay alive until the coroutine finishes.

Choosing this solution instead of assigning `this` to a pointer variable
inside the lambda's body and using it to refer to the object's members
has actual benefit: it's not possible to accidentally forget to refer
to a member of the object via the pointer; it also makes the code less
awkward.
2024-08-29 14:47:02 +02:00
Dawid Medrek
e5d01d4000 db/hints: Make commitlog use commitlog IO scheduling group
Before these changes, we didn't specify which I/O scheduling
group commitlog instances in hinted handoff should use.
In this commit, we set it explicitly to the commitlog
scheduling group. The rationale for this choice is that
we don't want to cause a bottleneck on the write path
-- if hints are written too slowly, new incoming mutations
(NOT hints) might be rejected due to a too high number
of hints currently being written to disk; see
`storage_proxy::create_write_response_handler_helper()`
for more context.

Fixes scylladb/scylladb#18654

Closes scylladb/scylladb#19170
2024-08-08 16:14:07 +02:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.
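
A minimal sketch of an assertion macro that does not depend on NDEBUG is shown below. `ALWAYS_ASSERT` and `checked_div` are hypothetical names used for illustration; the actual `SCYLLA_ASSERT()` in utils/assert.hh may be defined differently.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical macro for illustration only. Unlike assert(), it never checks
// NDEBUG, so the condition is evaluated in debug and release builds alike.
#define ALWAYS_ASSERT(cond)                                           \
    do {                                                              \
        if (!(cond)) {                                                \
            std::fprintf(stderr, "%s:%d: assertion failed: %s\n",     \
                         __FILE__, __LINE__, #cond);                  \
            std::abort();                                             \
        }                                                             \
    } while (0)

// Example caller: the check stays active even when NDEBUG is defined.
int checked_div(int a, int b) {
    ALWAYS_ASSERT(b != 0);
    return a / b;
}
```

Because the macro is independent of NDEBUG, defining NDEBUG later only affects the remaining plain `assert()` calls, which is what allows the vetting to happen incrementally.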

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Pavel Emelyanov
dd7c7c301d hints: Const-ify gossiper references and anchor pointers
There are two places in hints code that need gossiper: hint_sender
calling gossiper::is_alive() and endpoint_downtime_not_bigger_than()
helper in manager. Both can live with const gossiper, so the dependency
references and anchor pointers can be restricted to const too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-07-26 16:28:54 +03:00
Dawid Medrek
8b6e887e02 db/hints: Verify that Scylla limits the concurrency of written hints
In 6e79d64, the behavior of `manager::too_many_in_flight_hints_for()`
was accidentally modified. The change went unnoticed for some time
and was then fixed. In this commit, we add a test verifying that
the concurrency of hints being written to disk is indeed limited
and the limitations are imposed properly.
2024-07-18 13:49:29 +02:00
Dawid Medrek
7301a96ff4 db/hints: Coroutinize hint_endpoint_manager::store_hint() 2024-07-15 04:15:25 +02:00
Dawid Medrek
3e02e66ca8 db/hints: Move a constant value to the TU it's used in
Until now, the constant `HINT_FILE_WRITE_TIMEOUT` was
declared as a static member of `db::hints::manager`.
However, the constant is only ever used in one
translation unit, so it makes more sense to move it
there and not include boilerplate in a header.
2024-07-12 13:08:33 +02:00
Dawid Medrek
dc41086c57 db/hints: Add a metric for the size of sent hints
In this commit, we add a new metric `sent_total_size`
keeping track of how many bytes of hints a node
has sent. The metric is supposed to complement its
counterpart in storage proxy that counts how many
bytes of hints a node has received. That information
should prove useful in analyzing statistics of
a cluster -- load on given nodes and where it comes
from.

We also change the name of the metric `sent`
to `sent_total` to avoid the conflict of prefixes
between the two metrics.
2024-06-12 18:20:08 +02:00
Piotr Dulikowski
64ba620dc2 Merge 'hinted handoff: Use host IDs instead of IPs in the module' from Dawid Mędrek
This pull request introduces host ID in the Hinted Handoff module. Nodes are now identified by their host IDs instead of their IPs. The conversion occurs on the boundary between the module and `storage_proxy.hh`, but aside from that, IPs have been erased.

The changes take into consideration that old hints, still identified by IPs, might remain on disk – at start-up, we map them to host IDs where possible so that they're not lost.

Refs scylladb/scylladb#6403
Fixes scylladb/scylladb#12278

Closes scylladb/scylladb#15567

* github.com:scylladb/scylladb:
  docs: Update Hinted Handoff documentation
  db/hints: Add endpoint_downtime_not_bigger_than()
  db/hints: Migrate hinted handoff when cluster feature is enabled
  db/hints: Handle arbitrary directories in resource manager
  db/hints: Start using hint_directory_manager
  db/hints: Enforce providing IP in get_ep_manager()
  db/hints: Introduce hint_directory_manager
  db/hints/resource_manager: Update function description
  db/hints: Coroutinize space_watchdog::scan_one_ep_dir()
  db/hints: Expose update lock of space watchdog
  db/hints: Add function for migrating hint directories to host ID
  db/hints: Take both IP and host ID when storing hints
  db/hints: Prepare initializing endpoint managers for migrating from IP to host ID
  db/hints: Migrate to locator::host_id
  db/hints: Remove noexcept in do_send_one_mutation()
  service: Add locator::host_id to on_leave_cluster
  service: Fix indentation
  db/hints: Fix indentation
2024-05-06 09:58:18 +02:00
Benny Halevy
ebff5f5d70 everywhere: include seastar headers using angle brackets
seastar is an external library therefore it should
use the system-include syntax.

Closes scylladb/scylladb#18513
2024-05-06 10:00:31 +03:00
Dawid Medrek
d0f58736c8 db/hints: Introduce hint_directory_manager
This commit introduces a new class responsible
for keeping track of IP-to-host-ID mappings.
Before hinted handoff is migrated to using
host IDs, hint directories still have to
represent IP addresses. However, since
we identify endpoint managers by host IDs
already, we need to be able to associate
them with the directories they manage.
This class serves this purpose.
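
The idea of a two-way IP/host ID mapping can be sketched as below. This is only an illustration under stated assumptions: `directory_mapping` and its methods are hypothetical and do not reflect the real interface of `hint_directory_manager`, and both identifiers are modelled as plain strings.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical sketch; names and types are not taken from ScyllaDB.
class directory_mapping {
    std::unordered_map<std::string, std::string> _ip_to_host;
    std::unordered_map<std::string, std::string> _host_to_ip;
public:
    // Registers a pair; returns false if either side is already mapped.
    bool insert(const std::string& ip, const std::string& host_id) {
        if (_ip_to_host.count(ip) || _host_to_ip.count(host_id)) {
            return false;
        }
        _ip_to_host.emplace(ip, host_id);
        _host_to_ip.emplace(host_id, ip);
        return true;
    }

    // Looks up the host ID whose hints live in the directory named after `ip`.
    std::optional<std::string> host_id_of(const std::string& ip) const {
        auto it = _ip_to_host.find(ip);
        if (it == _ip_to_host.end()) {
            return std::nullopt;
        }
        return it->second;
    }

    // Looks up the IP-named directory that belongs to `host_id`.
    std::optional<std::string> ip_of(const std::string& host_id) const {
        auto it = _host_to_ip.find(host_id);
        if (it == _host_to_ip.end()) {
            return std::nullopt;
        }
        return it->second;
    }
};
```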
2024-04-27 22:31:07 +02:00
Dawid Medrek
063d4d5e91 db/hints: Prepare initializing endpoint managers for migrating from IP to host ID
We extract the initialization of endpoint managers
from the start method of the hint manager
to a separate function and make it handle directories
that represent either IP addresses, or host IDs;
other directories are ignored.

It's necessary because before Scylla is upgraded
to a version that uses host-ID-based hinted handoff,
we need to continue only managing IP directories.
When Scylla has been upgraded, we will need to handle
host ID directories.

It may also happen that after an upgrade (but not
before it), Scylla fails while renaming
the directories, so we end up with some of them
representing IP addresses and some representing
host IDs. After these changes, the code handles
that scenario as well.
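
A rough sketch of telling the three kinds of directory names apart is given below. It assumes that host IDs are written as canonical 8-4-4-4-12 UUID strings and only recognises dotted-quad IPv4 addresses; `dir_kind` and `classify_hint_directory` are hypothetical names, not functions from the commit.

```cpp
#include <cstddef>
#include <regex>
#include <string>

enum class dir_kind { ip_address, host_id, other };

// Simplified classifier: IPv4 only, and host IDs assumed to be canonical UUIDs.
dir_kind classify_hint_directory(const std::string& name) {
    static const std::regex ipv4(R"((\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3}))");
    static const std::regex uuid(
        R"([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})");

    std::smatch m;
    if (std::regex_match(name, m, ipv4)) {
        // Every octet must fit in one byte.
        for (std::size_t i = 1; i < m.size(); ++i) {
            if (std::stoi(m[i].str()) > 255) {
                return dir_kind::other;
            }
        }
        return dir_kind::ip_address;
    }
    if (std::regex_match(name, uuid)) {
        return dir_kind::host_id;
    }
    // Anything else is ignored by the initialization code.
    return dir_kind::other;
}
```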
2024-04-27 20:35:53 +02:00
Dawid Medrek
cfd03fe273 db/hints: Migrate to locator::host_id
We change the type of node identifiers
used within the module and fix compilation.
Directories storing hints to specific nodes
are now represented by host IDs instead of
IPs.
2024-04-26 22:44:04 +02:00
Dawid Medrek
1af7fa74e8 db/hints: Remove noexcept in do_send_one_mutation()
While the function is marked as noexcept, the returned
future can in fact store an exception. We remove the
specifier to reflect the actual behavior of the
function.
2024-04-26 22:44:04 +02:00
Dawid Medrek
c585444c60 db/hints: Fix indentation 2024-04-26 22:44:03 +02:00
Kefu Chai
a439ebcfce treewide: include fmt/ranges.h and/or fmt/std.h
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we include `fmt/ranges.h` and/or `fmt/std.h`
for formatting the container types, like vector, map
optional and variant using {fmt} instead of the homebrew
formatter based on operator<<.
with this change, the changes adding fmt::formatter and
the changes using ostream formatter explicitly, we are
allowed to drop `FMT_DEPRECATED_OSTREAM` macro.

Refs scylladb#13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-04-19 22:56:16 +08:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back to the days when Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Petr Gusev
e50dbef3e2 database: get_token_metadata -> new token_metadata
database::get_token_metadata() is switched to token_metadata2.

get_all_ips method is added to the host_id-based token_metadata, since
it's convenient and will be used in several places. It returns all current
nodes converted to inet_address by means of the topology
contained within token_metadata.

hint_sender::can_send: if the node has already left the
cluster we may not find its host_id. This case is handled
in the same way as if it's not a normal token owner - we
simply send a hint to all replicas.
2023-12-12 23:19:53 +04:00
Yaniv Kaul
ae2ab6000a Typos: fix typos in code
Fixes some more typos as found by codespell run on the code.
In this commit, there are more user-visible errors.

Refs: https://github.com/scylladb/scylladb/issues/16255
2023-12-05 15:18:11 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Dawid Medrek
1c70a18fc7 db/hints: Use manager as API for hint_endpoint_manager
This commit makes with_file_update_mutex() a method of hint_endpoint_manager
and introduces db::hints::manager::with_file_update_mutex_for() for accessing
it from the outside. This way, hint_endpoint_manager is hidden and no one
needs to know about its existence.
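
The shape of this API can be sketched as below. It is a synchronous simplification that uses `std::mutex` instead of Seastar primitives; the class and method bodies are assumptions made for illustration, and creating the per-endpoint object on first use is a choice of this sketch only.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>

namespace internal {
// Per-endpoint state; callers outside the module never name this type.
class endpoint_manager {
    std::mutex _file_update_mutex;
public:
    // Runs func while holding this endpoint's file update mutex.
    void with_file_update_mutex(const std::function<void()>& func) {
        std::lock_guard<std::mutex> guard(_file_update_mutex);
        func();
    }
};
} // namespace internal

class manager {
    std::map<std::string, internal::endpoint_manager> _ep_managers;
public:
    // The only entry point exposed to the outside: it forwards to the
    // endpoint's own manager (created on first use in this sketch).
    void with_file_update_mutex_for(const std::string& endpoint,
                                    const std::function<void()>& func) {
        _ep_managers[endpoint].with_file_update_mutex(func);
    }
};
```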
2023-10-06 12:15:01 +02:00
Dawid Medrek
ee5a5c1661 db/hints: Capitalize constants
This is a common convention. Follow it for readability.
2023-10-06 11:54:15 +02:00
Dawid Medrek
a870eeb2ab db/hints: Alias segment list in hint_storage.cc
Naming the type should improve readability.
2023-09-27 18:49:08 +02:00
Dawid Medrek
aba85c9c98 db/hints: Rename rebalance to rebalance_hints
The new name conveys the idea clearly.
2023-09-27 18:49:08 +02:00
Dawid Medrek
64f4b825d3 db/hints: Clean up rebalance() in hint_storage.cc
This commit fixes indentation and formatting after
recent changes in the file.
2023-09-27 18:49:04 +02:00
Dawid Medrek
b662756256 db/hints: Coroutinize hint_storage.cc 2023-09-27 18:47:38 +02:00
Dawid Medrek
17e763a83a db/hints: Clean up remove_irrelevant_shards_directories() in hint_storage.cc
This commit makes the function abide by the limit of 120 characters
per line and stops unnecessarily calling c_str() on seastar::sstring.
2023-09-27 18:45:01 +02:00
Dawid Medrek
73d02cfcef db/hints: Clean up rebalance_segments() in hint_storage.cc
This commit makes the function less compact and turns overly
long lines into shorter ones to improve the readability of
the code.
2023-09-27 18:45:01 +02:00
Dawid Medrek
479f4d1ad3 db/hints: Clean up rebalance_segments_for() in hint_storage.cc
This commit makes the function less compact and abides by the limit
of 120 characters per line; that makes the code more readable.
We start using fmt::to_string instead of seastar::format("{:d}")
to convert integers to strings -- the new way is the preferred one.
The changes also name variables in a more descriptive way.
2023-09-27 18:45:01 +02:00
Dawid Medrek
a1df8dbf1c db/hints: Clean up get_current_hints_segments() in hint_storage.cc
This commit makes the function less compact and abides by the limit
of 120 characters per line. That makes the code more readable.
It also doesn't unnecessarily call c_str() on seastar::sstring.
2023-09-27 18:45:01 +02:00
Dawid Medrek
1fccd34dba db/hints: Rename scan_for_hints_dirs to scan_shard_hint_directories
The new name better conveys which directories the function should scan.
2023-09-27 18:45:01 +02:00
Dawid Medrek
8e94074b85 db/hints: Clean up scan_for_hints_dirs() in hint_storage.cc
There is no need to call c_str() on the name of the directory entry.
In fact, the overload of std::stoi() used here takes an std::string as its
argument. Providing seastar::sstring instead of const char* is more
efficient because we can allocate just the right amount of memory
and std::memcpy it, i.e. call std::string(const char*, std::size_t).
Using the overload std::string(const char*) would need to first
traverse the string to find the null byte.
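
The difference between the two `std::string` constructors can be sketched as below. `sized_name` is a hypothetical stand-in for a string type that stores its length (as seastar::sstring does), and `shard_id_from` is an illustrative name.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a string type that knows its own length.
struct sized_name {
    const char* data;
    std::size_t size;
};

// Length known: std::string(const char*, std::size_t) can allocate exactly
// `size` bytes and copy them, without scanning for the terminating '\0'.
int shard_id_from(const sized_name& name) {
    return std::stoi(std::string(name.data, name.size));
}

// Length unknown: std::string(const char*) first has to walk the bytes to
// find the terminating '\0' before it can allocate and copy.
int shard_id_from(const char* name) {
    return std::stoi(std::string(name));
}
```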

This is a small change, all the more because paths don't tend to
be long, but it's some gain nonetheless.

The commit also inserts a few empty lines to make the code less
compact and improve readability as a result.
2023-09-27 18:45:01 +02:00
Dawid Medrek
7c68882578 db/hints: Wrap hint_storage.cc in an anonymous namespace
An anonymous namespace is a safer mechanism than the static
keyword. When adding a new piece of code, it's easy to
forget about adding the static. In that case, that code
might end up with external linkage. However, when code is put
in an anonymous namespace (when it should not be), the linker
will immediately detect it (in most cases), and
the programmer will be able to spot and fix their mistake
right away.
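
A small sketch of the pattern follows. The function names are hypothetical and the ".log" suffix is just an example; the point is that everything inside the unnamed namespace gets internal linkage without a per-declaration `static`.

```cpp
#include <string>
#include <vector>

namespace {

// Internal linkage: invisible to other translation units, no `static` needed.
bool has_log_suffix(const std::string& name) {
    return name.size() >= 4 && name.compare(name.size() - 4, 4, ".log") == 0;
}

} // anonymous namespace

// External linkage: this is the only symbol other translation units can use.
int count_log_files(const std::vector<std::string>& names) {
    int count = 0;
    for (const auto& name : names) {
        if (has_log_suffix(name)) {
            ++count;
        }
    }
    return count;
}
```

If `count_log_files` were moved into the unnamed namespace by mistake, any other translation unit calling it would fail to link, which is the early detection the message refers to.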
2023-09-27 18:41:41 +02:00
Dawid Medrek
d46437a87b db/hints: Rename end_point_hints_manager
This commit renames `end_point_hints_manager` to `hint_endpoint_manager`
to be consistent with other names used in the module (they all start
with `hint_`).
2023-09-15 03:46:15 +02:00
Dawid Medrek
6d1eee448b db/hints: Rename sender to hint_sender
We rename the structure to highlight what exactly its purpose is.
2023-09-15 03:46:15 +02:00
Dawid Medrek
4ad0f8907c db/hints: Move the rebalancing logic to hint_storage
This commit continues modularizing manager.hh.
2023-09-15 03:46:15 +02:00
Dawid Medrek
999484466d db/hints: Move the implementation of sender
This commit continues modularizing manager.hh.
After moving the declaration of sender to a dedicated
header file, these changes move its implementation to
a separate source file.
2023-09-15 03:46:15 +02:00
Dawid Medrek
17aabf6b9a db/hints: Move the declaration of sender to hint_sender.hh
This commit is yet another step in modularizing manager.hh.
We move the declaration of sender to a dedicated file.
Its implementation will follow in a future commit.
2023-09-15 03:46:15 +02:00
Dawid Medrek
1a7262ed6e db/hints: Move sender::replay_allowed() to the source file
The premise of these changes is the fact that we cannot have
a cycle of #includes.

Because the declaration of `sender` is going to be moved to
a separate header file in a future commit, and because that
header file is going to be included in the file where
`end_point_hints_manager` is declared, we will need to rely
on `end_point_hints_manager` being an incomplete type there.

A consequence of that is that we cannot access any of
`end_point_hints_manager`'s methods.

This commit prepares the ground for it by moving
the definition of the function to the source file where
`end_point_hints_manager` will be a complete type.
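
The constraint can be sketched in a single file as below, with comments marking where each part would live. The class names are simplified, hypothetical versions of `sender` and `end_point_hints_manager`, and the members are invented for illustration.

```cpp
// --- would be the sender's header ---
class endpoint_manager;            // forward declaration: incomplete type here

class sender {
    endpoint_manager& _ep_manager; // a reference to an incomplete type is fine
public:
    explicit sender(endpoint_manager& m) : _ep_manager(m) {}
    // Declared only: the body calls a method of endpoint_manager, which is
    // not possible while that type is still incomplete.
    bool replay_allowed() const;
};

// --- would be the endpoint manager's header, which includes the one above ---
class endpoint_manager {
    bool _replay_allowed = false;
    sender _sender;
public:
    endpoint_manager() : _sender(*this) {}
    void allow_replay() { _replay_allowed = true; }
    bool replay_allowed() const { return _replay_allowed; }
    const sender& get_sender() const { return _sender; }
};

// --- would be the source file, where both types are complete ---
bool sender::replay_allowed() const {
    return _ep_manager.replay_allowed();
}
```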
2023-09-15 03:46:15 +02:00
Dawid Medrek
ad2a36bd45 db/hints: Put end_point_hints_manager in internal namespace 2023-09-15 03:46:15 +02:00
Dawid Medrek
507054012d db/hints: Move the implementation of end_point_hints_manager
This commit continues moving end_point_hints_manager to its
dedicated files. After moving the declaration of the class,
these changes move the implementation.
2023-09-15 03:46:15 +02:00
Dawid Medrek
f72c423984 db/hints: Move the declaration of end_point_hints_manager
This commit is yet another step in modularizing manager.hh.
We move the declaration of the class to a dedicated header file.
The implementation will follow in a future commit.
2023-09-15 03:46:15 +02:00
Dawid Medrek
db08a85f5d db/hints: Introduce hint_storage.hh
This commit moves types used by shard hint manager
and related to storing hints on disk to another file.
It is yet another step in modularizing manager.hh.
2023-09-15 02:28:10 +02:00
Dawid Medrek
4814b3b19a db/hints: Extract the logger from manager.cc
This commit extracts the logger used in manager.cc
to prepare the ground for modularization of manager.hh
into separate smaller files. We want to preserve
the logging behavior (at least for the time being),
which means new files should use the same logger.
These changes serve that purpose.
2023-09-15 02:24:20 +02:00
Dawid Medrek
efd6d1f57a db/hints: Extract common types from manager.hh
Currently, data structures used in manager.hh
use their own aliases for gms::inet_address.
It is clear they all should use the same type
and having different names for it only reduces
readability of the code. This commit introduces
a common alias -- endpoint_id -- and gets rid
of the other ones.

This commit is also the first step in modularizing
manager.hh by extracting common types to another
file.
2023-09-15 02:23:30 +02:00