scylladb

Author	SHA1	Message	Date
Piotr Dulikowski	61ac0a336d	hints: send hints with CL=ALL if target is leaving Currently, when attempting to send a hint, we might choose its recipients in one of two ways: - If the original destination is a natural endpoint of the hint, we only send the hint to that node and none other, - Otherwise, we send the hint to all current replicas of the mutation. There is a problem when we decommission a node: while data is streamed away from that node, it is still considered to be a natural endpoint of the data that it used to own. Because of that, it might happen that a hint is sent directly to it but streaming will miss it, effectively resulting in the hint being discarded. As sending the hint _only_ to the leaving replica is a rather bad idea, send the hint to all replicas also in the case when the original destination of the hint is leaving. Note that this is a conservative fix written only with the decommission + vnode-based keyspaces combo in mind. In general, such "data loss" can occur in other situations where the replica set is changing and we go through a streaming phase, i.e. other topology operations in case of vnodes and tablet load balancing. However, the consistency guarantees of hinted handoff in the face of topology changes are not defined and it is not clear what they should be, if there should be any at all. The picture is further complicated by the fact that hints are used by materialized views, and sending view updates to more replicas than necessary can introduce inconsistencies in the form of "ghost rows". This fix was developed in response to a failing test which checked the hint replay + decommission scenario, and it makes it work again. Fixes scylladb/scylla-dtest#4582 Refs scylladb/scylladb#19835	2024-09-08 10:50:59 +02:00
Piotr Dulikowski	8abb06ab82	hints: inline do_send_one_mutation It's a small method and it is only used once in send_one_mutation. Inlining it lets us get rid of its declaration in the header - now, if one needs to change the variables passed from one function to another, it is no longer necessary to change the header.	2024-09-08 07:19:35 +02:00
Dawid Medrek	d459cf91eb	db/hints: Fix indentation in `do_store_hint()`	2024-08-29 14:47:08 +02:00
Dawid Medrek	75ce6943d0	db/hints: Move code for writing hints to separate function In scylladb/scylladb@7301a96, in the function `hint_endpoint_manager::store_hint()`, we transformed the lambda passed to `seastar::with_gate()` to a coroutine lambda to improve the readability. However, there was a subtle problem related to lifetimes of the captures that needed to be addressed: * Since we started `co_await`ing in the lambda, the captures were at risk of being destructed too soon. The usual solution is to wrap a coroutine lambda within a `seastar::coroutine::lambda` object and rely on the extended lifetime enforced by the semantics of the language. See `docs/dev/lambda-coroutine-fiasco.md` for more context. * However, since we don't immediately `co_await` the future returned by `with_gate()`, we cannot rely on the extended lifetime provided by the wrapper. The document linked in the previous bullet point suggests keeping the passed coroutine lambda as a variable and pass it as a reference to `with_gate()`. However, that's not feasible either because we discard the returned future and the function returns almost instantly -- destructing every local object, which would encompass the lambda too. The solution used in the commit was to move captures of the lambda into the lambda's body. That helped because Seastar's backend is responsible for keeping all of the local variables alive until the lambda finishes its execution. However, we didn't move all of the captures into the lambda -- the missing one was the `this` pointer that was implicitly used in the lambda. Address sanitiser hasn't reported any bugs related to the pointer yet, but the bug is most likely there. In this commit, we transform the lambda's body into a new member function and only call it from the lambda. This way, we don't need to care about the lifetimes of the captures because Seastar ensures that the function's arguments stay alive until the coroutine finishes. 
Choosing this solution instead of assigning `this` to a pointer variable inside the lambda's body and using it to refer to the object's members has actual benefit: it's not possible to accidentally forget to refer to a member of the object via the pointer; it also makes the code less awkward.	2024-08-29 14:47:02 +02:00
Dawid Medrek	e5d01d4000	db/hints: Make commitlog use commitlog IO scheduling group Before these changes, we didn't specify which I/O scheduling group commitlog instances in hinted handoff should use. In this commit, we set it explicitly to the commitlog scheduling group. The rationale for this choice is the fact we don't want to cause a bottleneck on the write path -- if hints are written too slowly, new incoming mutations (NOT hints) might be rejected due to a too high number of hints currently being written to disk; see `storage_proxy::create_write_response_handler_helper()` for more context. Fixes scylladb/scylladb#18654 Closes scylladb/scylladb#19170	2024-08-08 16:14:07 +02:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Pavel Emelyanov	dd7c7c301d	hints: Const-ify gossiper references and anchor pointers There are two places in hints code that need gossiper: hint_sender calling gossiper::is_alive() and endpoint_downtime_not_bigger_than() helper in manager. Both can live with const gossiper, so the dependency references and anchor pointers can be restricted to const too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-07-26 16:28:54 +03:00
Dawid Medrek	8b6e887e02	db/hints: Verify that Scylla limits the concurrency of written hints In `6e79d64`, the behavior of `manager::too_many_in_flight_hints_for()` was accidentally modified. It remained unnoticed for some time and was then fixed. In this commit, we add a test verifying that the concurrency of hints being written to disk is indeed limited and the limitations are imposed properly.	2024-07-18 13:49:29 +02:00
Dawid Medrek	7301a96ff4	db/hints: Coroutinize `hint_endpoint_manager::store_hint()`	2024-07-15 04:15:25 +02:00
Dawid Medrek	3e02e66ca8	db/hints: Move a constant value to the TU it's used in Until now, the constant `HINT_FILE_WRITE_TIMEOUT` was declared as a static member of `db::hints::manager`. However, the constant is only ever used in one translation unit, so it makes more sense to move it there and not include boilerplate in a header.	2024-07-12 13:08:33 +02:00
Dawid Medrek	dc41086c57	db/hints: Add a metric for the size of sent hints In this commit, we add a new metric `sent_total_size` keeping track of how many bytes of hints a node has sent. The metric is supposed to complement its counterpart in storage proxy that counts how many bytes of hints a node has received. That information should prove useful in analyzing statistics of a cluster -- load on given nodes and where it comes from. We also change the name of the metric `sent` to `sent_total` to avoid the conflict of prefixes between the two metrics.	2024-06-12 18:20:08 +02:00
Piotr Dulikowski	64ba620dc2	Merge 'hinted handoff: Use host IDs instead of IPs in the module' from Dawid Mędrek This pull request introduces host ID in the Hinted Handoff module. Nodes are now identified by their host IDs instead of their IPs. The conversion occurs on the boundary between the module and `storage_proxy.hh`, but aside from that, IPs have been erased. The changes take into consideration that there might still be old hints, still identified by IPs, on disk – at start-up, we map them to host IDs if it's possible so that they're not lost. Refs scylladb/scylladb#6403 Fixes scylladb/scylladb#12278 Closes scylladb/scylladb#15567 * github.com:scylladb/scylladb: docs: Update Hinted Handoff documentation db/hints: Add endpoint_downtime_not_bigger_than() db/hints: Migrate hinted handoff when cluster feature is enabled db/hints: Handle arbitrary directories in resource manager db/hints: Start using hint_directory_manager db/hints: Enforce providing IP in get_ep_manager() db/hints: Introduce hint_directory_manager db/hints/resource_manager: Update function description db/hints: Coroutinize space_watchdog::scan_one_ep_dir() db/hints: Expose update lock of space watchdog db/hints: Add function for migrating hint directories to host ID db/hints: Take both IP and host ID when storing hints db/hints: Prepare initializing endpoint managers for migrating from IP to host ID db/hints: Migrate to locator::host_id db/hints: Remove noexcept in do_send_one_mutation() service: Add locator::host_id to on_leave_cluster service: Fix indentation db/hints: Fix indentation	2024-05-06 09:58:18 +02:00
Benny Halevy	ebff5f5d70	everywhere: include seastar headers using angle brackets seastar is an external library therefore it should use the system-include syntax. Closes scylladb/scylladb#18513	2024-05-06 10:00:31 +03:00
Dawid Medrek	d0f58736c8	db/hints: Introduce hint_directory_manager This commit introduces a new class responsible for keeping track of mappings IP-host ID. Before hinted handoff is migrated to using host IDs, hint directories still have to represent IP addresses. However, since we identify endpoint managers by host IDs already, we need to be able to associate them with the directories they manage. This class serves this purpose.	2024-04-27 22:31:07 +02:00
Dawid Medrek	063d4d5e91	db/hints: Prepare initializing endpoint managers for migrating from IP to host ID We extract the initialization of endpoint managers from the start method of the hint manager to a separate function and make it handle directories that represent either IP addresses or host IDs; other directories are ignored. It's necessary because before Scylla is upgraded to a version that uses host-ID-based hinted handoff, we need to continue only managing IP directories. When Scylla has been upgraded, we will need to handle host ID directories. It may also happen that after an upgrade (but not before it), Scylla fails while renaming the directories, so we end up with some of them representing IP addresses, and some representing host IDs. After these changes, the code handles that scenario as well.	2024-04-27 20:35:53 +02:00
Dawid Medrek	cfd03fe273	db/hints: Migrate to locator::host_id We change the type of node identifiers used within the module and fix compilation. Directories storing hints to specific nodes are now represented by host IDs instead of IPs.	2024-04-26 22:44:04 +02:00
Dawid Medrek	1af7fa74e8	db/hints: Remove noexcept in do_send_one_mutation() While the function is marked as noexcept, the returned future can in fact store an exception. We remove the specifier to reflect the actual behavior of the function.	2024-04-26 22:44:04 +02:00
Dawid Medrek	c585444c60	db/hints: Fix indentation	2024-04-26 22:44:03 +02:00
Kefu Chai	a439ebcfce	treewide: include fmt/ranges.h and/or fmt/std.h before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we include `fmt/ranges.h` and/or `fmt/std.h` for formatting the container types, like vector, map optional and variant using {fmt} instead of the homebrew formatter based on operator<<. with this change, the changes adding fmt::formatter and the changes using ostream formatter explicitly, we are allowed to drop `FMT_DEPRECATED_OSTREAM` macro. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-04-19 22:56:16 +08:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Petr Gusev	e50dbef3e2	database: get_token_metadata -> new token_metadata database::get_token_metadata() is switched to token_metadata2. get_all_ips method is added to the host_id-based token_metadata, since it's convenient and will be used in several places. It returns all current nodes converted to inet_address by means of the topology contained within token_metadata. hint_sender::can_send: if the node has already left the cluster we may not find its host_id. This case is handled in the same way as if it's not a normal token owner - we simply send a hint to all replicas.	2023-12-12 23:19:53 +04:00
Yaniv Kaul	ae2ab6000a	Typos: fix typos in code Fixes some more typos as found by codespell run on the code. In this commit, there are more user-visible errors. Refs: https://github.com/scylladb/scylladb/issues/16255	2023-12-05 15:18:11 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Dawid Medrek	1c70a18fc7	db/hints: Use manager as API for hint_endpoint_manager This commit makes with_file_update_mutex() a method of hint_endpoint_manager and introduces db::hints::manager::with_file_update_mutex_for() for accessing it from the outside. This way, hint_endpoint_manager is hidden and no one needs to know about its existence.	2023-10-06 12:15:01 +02:00
Dawid Medrek	ee5a5c1661	db/hints: Capitalize constants This is a common convention. Follow it for readability.	2023-10-06 11:54:15 +02:00
Dawid Medrek	a870eeb2ab	db/hints: Alias segment list in hint_storage.cc Naming the type should improve readability.	2023-09-27 18:49:08 +02:00
Dawid Medrek	aba85c9c98	db/hints: Rename rebalance to rebalance_hints The new name conveys the idea clearly.	2023-09-27 18:49:08 +02:00
Dawid Medrek	64f4b825d3	db/hints: Clean up rebalance() in hint_storage.cc This commit fixes indentation and formatting after recent changes in the file.	2023-09-27 18:49:04 +02:00
Dawid Medrek	b662756256	db/hints: Coroutinize hint_storage.cc	2023-09-27 18:47:38 +02:00
Dawid Medrek	17e763a83a	db/hints: Clean up remove_irrelevant_shards_directories() in hint_storage.cc This commit makes the function abide by the limit of 120 characters per line and stops unnecessarily calling c_str() on seastar::sstring.	2023-09-27 18:45:01 +02:00
Dawid Medrek	73d02cfcef	db/hints: Clean up rebalance_segments() in hint_storage.cc This commit makes the function less compact and turns overly long lines into shorter ones to improve the readability of the code.	2023-09-27 18:45:01 +02:00
Dawid Medrek	479f4d1ad3	db/hints: Clean up rebalance_segments_for() in hint_storage.cc This commit makes the function less compact and abides by the limit of 120 characters per line; that makes the code more readable. We start using fmt::to_string instead of seastar::format("{:d}") to convert integers to strings -- the new way is the preferred one. The changes also name variables in a more descriptive way.	2023-09-27 18:45:01 +02:00
Dawid Medrek	a1df8dbf1c	db/hints: Clean up get_current_hints_segments() in hint_storage.cc This commit makes the function less compact and abides by the limit of 120 characters per line. That makes the code more readable. It also doesn't unnecessarily call c_str() on seastar::sstring.	2023-09-27 18:45:01 +02:00
Dawid Medrek	1fccd34dba	db/hints: Rename scan_for_hints_dirs to scan_shard_hint_directories The new name better conveys which directories the function should scan.	2023-09-27 18:45:01 +02:00
Dawid Medrek	8e94074b85	db/hints: Clean up scan_for_hints_dirs() in hint_storage.cc There is no need to call c_str() on the name of the directory entry. In fact, the overload of std::stoi() used here takes an std::string as its argument. Providing seastar::sstring instead of const char* is more efficient because we can allocate just the right amount of memory and std::memcpy it, i.e. call std::string(const char*, std::size_t). Using the overload std::string(const char*) would need to first traverse the string to find the null byte. This is a small change, all the more because paths don't tend to be long, but it's some gain nonetheless. The commit also inserts a few empty lines to make the code less compact and improve readability as a result.	2023-09-27 18:45:01 +02:00
Dawid Medrek	7c68882578	db/hints: Wrap hint_storage.cc in an anonymous namespace An anonymous namespace is a safer mechanism than the static keyword. When adding a new piece of code, it's easy to forget about adding the static. In that case, that code might undergo external linkage. However, when code is put in an anonymous namespace (when it should not), the linker will immediately detect it (in most cases), and the programmer will be able to spot and fix their mistake right away.	2023-09-27 18:41:41 +02:00
Dawid Medrek	d46437a87b	db/hints: Rename end_point_hints_manager This commit renames `end_point_hints_manager` to `hint_endpoint_manager` to be consistent with other names used in the module (they all start with `hint_`).	2023-09-15 03:46:15 +02:00
Dawid Medrek	6d1eee448b	db/hints: Rename sender to hint_sender We rename the structure to highlight what exactly its purpose is.	2023-09-15 03:46:15 +02:00
Dawid Medrek	4ad0f8907c	db/hints: Move the rebalancing logic to hint_storage This commit continues modularizing manager.hh.	2023-09-15 03:46:15 +02:00
Dawid Medrek	999484466d	db/hints: Move the implementation of sender This commit continues modularizing manager.hh. After moving the declaration of sender to a dedicated header file, these changes move its implementation to a separate source file.	2023-09-15 03:46:15 +02:00
Dawid Medrek	17aabf6b9a	db/hints: Move the declaration of sender to hint_sender.hh This commit is yet another step in modularizing manager.hh. We move the declaration of sender to a dedicated file. Its implementation will follow in a future commit.	2023-09-15 03:46:15 +02:00
Dawid Medrek	1a7262ed6e	db/hints: Move sender::replay_allowed() to the source file The premise of these changes is the fact that we cannot have a cycle of #includes. Because the declaration of `sender` is going to be moved to a separate header file in a future commit, and because that header file is going to be included in the file where `end_point_hints_manager` is declared, we will need to rely on `end_point_hints_manager` being an incomplete type there. A consequence of that is that we cannot access any of `end_point_hints_manager`'s methods. This commit prepares the ground for it by moving the definition of the function to the source file where `end_point_hints_manager` will be a complete type.	2023-09-15 03:46:15 +02:00
Dawid Medrek	ad2a36bd45	db/hints: Put end_point_hints_manager in internal namespace	2023-09-15 03:46:15 +02:00
Dawid Medrek	507054012d	db/hints: Move the implementation of end_point_hints_manager This commit continues moving end_point_hints_manager to its dedicated files. After moving the declaration of the class, these changes move the implementation.	2023-09-15 03:46:15 +02:00
Dawid Medrek	f72c423984	db/hints: Move the declaration of end_point_hints_manager This commit is yet another step in modularizing manager.hh. We move the declaration of the class to a dedicated header file. The implementation will follow in a future commit.	2023-09-15 03:46:15 +02:00
Dawid Medrek	db08a85f5d	db/hints: Introduce hint_storage.hh This commit moves types used by shard hint manager and related to storing hints on disk to another file. It is yet another step in modularizing manager.hh.	2023-09-15 02:28:10 +02:00
Dawid Medrek	4814b3b19a	db/hints: Extract the logger from manager.cc This commit extracts the logger used in manager.cc to prepare the ground for modularization of manager.hh into separate smaller files. We want to preserve the logging behavior (at least for the time being), which means new files should use the same logger. These changes serve that purpose.	2023-09-15 02:24:20 +02:00
Dawid Medrek	efd6d1f57a	db/hints: Extract common types from manager.hh Currently, data structures used in manager.hh use their own aliases for gms::inet_address. It is clear they all should use the same type and having different names for it only reduces readability of the code. This commit introduces a common alias -- endpoint_id -- and gets rid of the other ones. This commit is also the first step in modularizing manager.hh by extracting common types to another file.	2023-09-15 02:23:30 +02:00

48 Commits
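
The recipient rule described in commit 61ac0a336d can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the commit: `node_id`, `topology_view` and `choose_hint_recipients` are hypothetical names, and the real implementation works with ScyllaDB's token metadata and replication strategy rather than plain vectors.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a node identifier; not ScyllaDB's real type.
using node_id = std::string;

// Hypothetical, simplified view of the cluster for a single mutation.
struct topology_view {
    std::vector<node_id> replicas; // current replicas (natural endpoints) of the mutation
    std::vector<node_id> leaving;  // nodes that are currently being decommissioned
};

static bool contains(const std::vector<node_id>& nodes, const node_id& n) {
    return std::find(nodes.begin(), nodes.end(), n) != nodes.end();
}

// Returns the nodes a hint should be sent to.
// - If the original destination is a replica and is not leaving,
//   the hint goes only to that node.
// - Otherwise (not a replica, or a replica that is leaving),
//   the hint goes to all current replicas.
std::vector<node_id> choose_hint_recipients(const topology_view& t,
                                            const node_id& original_destination) {
    const bool is_replica = contains(t.replicas, original_destination);
    const bool is_leaving = contains(t.leaving, original_destination);
    if (is_replica && !is_leaving) {
        return {original_destination};
    }
    return t.replicas;
}
```

With replicas `n1`, `n2`, `n3` and `n3` leaving, a hint originally destined for `n1` is sent to `n1` alone, while a hint destined for `n3` is sent to all three replicas.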