scylladb

Author	SHA1	Message	Date
Dawid Mędrek	0a6137218a	db/hints: Cancel draining when stopping node Draining hints may occur in one of the two scenarios: * a node leaves the cluster and the local node drains all of the hints saved for that node, * the local node is being decommissioned. Draining may take some time and the hint manager won't stop until it finishes. It's not a problem when decommissioning a node, especially because we want the cluster to retain the data stored in the hints. However, it may become a problem when the local node started draining hints saved for another node and now it's being shut down. There are two reasons for that: * Generally, in situations like that, we'd like to be able to shut down nodes as fast as possible. The data stored in the hints won't disappear from the cluster yet since we can restart the local node. * Draining hints may introduce flakiness in tests. Replaying hints doesn't have the highest priority and it's reflected in the scheduling groups we use as well as the explicitly enforced throughput. If there are a large number of hints to be replayed, it might affect our tests. It's already happened, see: scylladb/scylladb#21949. To solve those problems, we change the semantics of draining. It will behave as before when the local node is being decommissioned. However, when the local node is only being stopped, we will immediately cancel all ongoing draining processes and stop the hint manager. To amend for that, when we start a node and it initializes a hint endpoint manager corresponding to a node that's already left the cluster, we will begin the draining process of that endpoint manager right away. That should ensure all data is retained, while possibly speeding up the shutdown process. There's a small trade-off to it, though. If we stop a node, we can then remove it. It won't have a chance to replay hints it might've before these changes, but that's an edge case. We expect this commit to bring more benefit than harm. We also provide tests verifying that the implementation works as intended. Fixes scylladb/scylladb#21949 Closes scylladb/scylladb#22811	2025-03-13 11:55:15 +02:00
Kefu Chai	6e4cb20a69	tree: implement boost::accumulate with std::ranges library Replace boost::accumulate() calls with std::ranges facilities. This change reduces external dependencies and modernizes the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#23062	2025-02-26 23:22:02 +02:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Avi Kivity	9024e4940c	counters.hh: drop unused boost includes Re-add them to source files that need them. Closes scylladb/scylladb#21738	2024-12-05 12:27:41 +02:00
Kefu Chai	f436edfa22	mutation: remove unused "#include"s these unused includes are identified by clang-include-cleaner. after auditing the source files, all of the reports have been confirmed. please note, because `mutation/mutation.hh` does not include `seastar/coroutine/maybe_yield.hh` anymore, and quite a few source files were relying on this header to bring in the declaration of `maybe_yield()`, we have to include this header in the places where this symbol is used. the same applies to `seastar/core/when_all.hh`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-29 14:01:44 +08:00
Kefu Chai	24d14b601b	treewide: s/boost::adaptors::map_values/std::views::values/ now that we are allowed to use C++23. we now have the luxury of using `std::views::values`. in this change, we: - replace `boost::adaptors::map_values` with `std::views::values` - update affected code to work with `std::views::values` - the places where we use `boost::join()` are not changed, because we cannot use `std::views::concat` yet. this helper is only available in C++26. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21265	2024-10-27 21:32:45 +02:00
Kefu Chai	6ead5a4696	treewide: move log.hh into utils/log.hh the log.hh under the root of the tree was created to keep backward compatibility when seastar was extracted into a separate library. so log.hh should belong to the `utils` directory, as it is based solely on seastar, and can be used by all subsystems. in this change, we move log.hh into utils/log.hh so that it is more modularized. this also improves readability: when one sees `#include "utils/log.hh"`, it is obvious that this source file needs the logging system, instead of its own log facility -- please note, we do have two other `log.hh` in the tree. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-10-22 06:54:46 +03:00
Pavel Emelyanov	dd7c7c301d	hints: Const-ify gossiper references and anchor pointers There are two places in hints code that need gossiper: hint_sender calling gossiper::is_alive() and endpoint_downtime_not_bigger_than() helper in manager. Both can live with const gossiper, so the dependency references and anchor pointers can be restricted to const too. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-07-26 16:28:54 +03:00
Dawid Medrek	2446cce272	db/hints: Initialize endpoint managers only for valid hint directories Before these changes, it could happen that Scylla initialized endpoint managers for hint directories representing * host IDs before migrating hinted handoff to using host IDs, * IP addresses after the migration. One scenario looked like this: 1. Start Scylla and upgrade the cluster to using host IDs. 2. Create, by hand, a hint directory representing an IP address. 3. Trigger changing the host filter in hinted handoff; it could be achieved by, for example, restricting the set of data centers Scylla is allowed to save hints for. When changing the host filter, we browse the hint directories and create endpoint managers if we can send hints towards the node corresponding to a given hint directory. We only accepted hint directories representing IP addresses and host IDs. However, we didn't check whether the local node has already been upgraded to host-ID-based hinted handoff or not. As a result, endpoint managers were created for both IP addresses and host IDs, no matter whether we were before or after the migration. These changes make sure that any time we browse the hint directories, we take that into account. Fixes scylladb/scylladb#19172 Closes scylladb/scylladb#19173	2024-06-21 15:59:49 +02:00
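The directory check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not ScyllaDB's code: `dir_kind`, `classify_hint_dir`, `should_init_endpoint_manager` and `example_hint_dir_check` are hypothetical names, host IDs are assumed to be written as canonical 8-4-4-4-12 hexadecimal UUIDs, and IP addresses are validated with POSIX `inet_pton`.

```cpp
#include <arpa/inet.h>   // inet_pton (POSIX)
#include <netinet/in.h>  // in_addr, in6_addr
#include <cassert>
#include <regex>
#include <string>

// Hypothetical classification of a hint directory name.
enum class dir_kind { host_id, ip_address, invalid };

dir_kind classify_hint_dir(const std::string& name) {
    // Assumption: a host ID is a canonical textual UUID.
    static const std::regex uuid_re(
        "[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
        "[0-9a-fA-F]{4}-[0-9a-fA-F]{12}");
    if (std::regex_match(name, uuid_re)) {
        return dir_kind::host_id;
    }
    in_addr v4{};
    in6_addr v6{};
    if (inet_pton(AF_INET, name.c_str(), &v4) == 1 ||
        inet_pton(AF_INET6, name.c_str(), &v6) == 1) {
        return dir_kind::ip_address;
    }
    return dir_kind::invalid;
}

// Before the migration only IP-named directories are accepted;
// after it only host-ID-named ones are. Invalid names are always skipped.
bool should_init_endpoint_manager(const std::string& name, bool migrated_to_host_ids) {
    const dir_kind kind = classify_hint_dir(name);
    return migrated_to_host_ids ? kind == dir_kind::host_id
                                : kind == dir_kind::ip_address;
}

// Usage example: a hand-made IP directory is ignored after the migration.
void example_hint_dir_check() {
    assert(should_init_endpoint_manager("127.0.0.1", false));
    assert(!should_init_endpoint_manager("127.0.0.1", true));
    assert(!should_init_endpoint_manager("not-a-node", true));
}
```

The point of the second function is that the name alone is not enough: the same directory is valid or invalid depending on whether the local node has already switched to host-ID-based hinted handoff.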
Dawid Medrek	a5528a2093	db/hints: Log when ignoring invalid hint directories In `58784cd`, `aa4b06a` and other commits migrating hinted handoff from IPs to host IDs (scylladb/scylladb#15567), we started ignoring hint directories of invalid names, i.e. those that represent neither an IP address, nor a host ID. They remain on disk and are taken into account while computing e.g. the total size of hints, but they're not used in any way. These changes add logs informing the user when Scylla encounters such a directory. Closes scylladb/scylladb#17566	2024-06-07 19:19:15 +02:00
Dawid Medrek	58784cd8db	db/hints: Handle arbitrary directories in resource manager Before these changes, resource manager only handled the case when directories it browsed represented valid host IDs. However, since before migrating hinted handoff to using host IDs we still name directories after IP addresses, that would lead to exceptions that shouldn't happen. We make resource manager handle directories of arbitrary names correctly.	2024-04-27 22:31:07 +02:00
Dawid Medrek	59d49c5219	db/hints: Coroutinize space_watchdog::scan_one_ep_dir()	2024-04-27 22:31:07 +02:00
Dawid Medrek	cfd03fe273	db/hints: Migrate to locator::host_id We change the type of node identifiers used within the module and fix compilation. Directories storing hints to specific nodes are now represented by host IDs instead of IPs.	2024-04-26 22:44:04 +02:00
Avi Kivity	7cb1c10fed	treewide: replace seastar::future::get0() with seastar::future::get() get0() dates back from the days where Seastar futures carried tuples, and get0() was a way to get the first (and usually only) element. Now it's a distraction, and Seastar is likely to deprecate and remove it. Replace with seastar::future::get(), which does the same thing.	2024-02-02 22:12:57 +08:00
Dawid Medrek	1c70a18fc7	db/hints: Use manager as API for hint_endpoint_manager This commit makes with_file_update_mutex() a method of hint_endpoint_manager and introduces db::hints::manager::with_file_update_mutex_for() for accessing it from the outside. This way, hint_endpoint_manager is hidden and no one needs to know about its existence.	2023-10-06 12:15:01 +02:00
Dawid Medrek	18a2831186	db/hints: Use reference for storage proxy This commit makes db::hints::manager store service::storage_proxy as a reference instead of a seastar::shared_ptr. The manager is owned by storage proxy, so it only lives as long as storage proxy does. Hence, it makes little sense to store the latter as a shared pointer; in fact, it's very confusing and may be error-prone. The field never changes, so it's safe to keep it as a reference (especially because copy and move constructors of db::hints::manager are both deleted). What's more, we ensure that the hint manager has access to storage proxy as soon as it's created. The same changes were applied to db::hints::resource_manager. The rationale is the same.	2023-10-06 11:54:15 +02:00
Pavel Emelyanov	bc62ca46d4	lister: Make lister::dir_entry_types an enum_set This type is currently an unordered_set, but only consists of at most two elements. Making it an enum_set renders it into a size_t variable and better describes the intention. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-11-17 19:01:45 +03:00
Benny Halevy	ebbbf1e687	lister: move to utils There's nothing specific to scylla in the lister classes, they could (and maybe should) be part of the seastar library. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-02-28 12:36:03 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Benny Halevy	f4cd535e3d	resource_manager: remove unnecessary include of lister.hh from header file But define namespace fs = std::filesystem in the header since many use sites already depend on it and it's a convention throughout scylla's code. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-01-11 17:04:16 +02:00
Piotr Dulikowski	de1679b1b9	hints: make hints concurrency configurable and reduce the default Previously, hinted handoff had a hardcoded concurrency limit - at most 128 hints could be sent from a single shard at once. This commit makes this limit configurable by adding a new configuration option: `max_hinted_handoff_concurrency_per_shard`. This option can be updated in runtime. Additionally, the default concurrency per shard is made lower and is now 8. The motivation for reducing the concurrency was to mitigate the negative impact hints may have on performance of the receiving node due to them not being properly isolated with respect to I/O. Tests: - unit(dev) - dtest(hintedhandoff_additional_test.py) Refs: #8624 Closes #8646	2021-06-22 15:58:56 +02:00
Pavel Emelyanov	92a4278cd1	hints: Drop storage service from managers The storage service pointer is only used to (un)subscribe to (from) lifecycle events. Now the subscription is gone, so can the storage service pointer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-06-17 15:09:36 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Piotr Dulikowski	60ac68b7a2	hints/resource_manager: add comments to register_manager Adds more comments to resource_manager::register_manager in order to better explain what this function is doing.	2020-11-19 16:34:37 +01:00
Piotr Dulikowski	c0c10b918c	hints/resource_manager: fix indentation Fixes indentation in prepare_per_device_limits.	2020-11-19 16:34:37 +01:00
Piotr Dulikowski	ead6a3f036	hints/resource_manager: improve mutual exclusion This commit causes start, stop and register_manager methods of the resource_manager to be serialized with respect to each other using the _operation_lock. Those functions modify internal state, so it's best if they are protected with a semaphore. Additionally, those functions are not going to be used frequently, therefore it's perfectly fine to protect them in such a coarse manner. Now, space_watchdog has a dedicated lock for serializing its on_timer logic with resource_manager::register_manager. The reason for separate lock is that resource_manager::stop cannot use the same lock as the space_watchdog - otherwise a situation could occur in which space_watchdog waits for semaphore units held by resource_manager::stop(), and resource_manager::stop() waits until the space_watchdog stops its asynchronous event loop.	2020-11-19 16:34:37 +01:00
Piotr Dulikowski	362aebee7b	hints/resource_manager: correct prepare_per_device_limits usage The resource_manager::prepare_per_device_limits function calculates disk quota for registered hints managers, and creates an association map: from a storage device id to those hints managers which store hints on that device (_per_device_limits_map). This function was used with an assumption that it is idempotent - which is a wrong assumption. In resource_manager::register_manager, if the resource_manager is already started, prepare_per_device_limits would be called, and those hints managers which were previously added to the _per_device_limits_map would be added again. This would cause the space used by those managers to be calculated twice, which would artificially lower the limit which we impose on the space hints are allowed to occupy on disk. This patch fixes this problem by changing the prepare_per_device_limits function to operate on a hints manager passed by argument. Now, we make sure that this function is called on each hints manager only once.	2020-11-19 16:34:37 +01:00
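The double-accounting problem described in the commit above can be reproduced with a small stand-alone sketch. It does not use the real resource_manager: `hints_manager_stub`, `per_device_limits`, `prepare_all`, `prepare_one`, `used_on` and `example_double_accounting` are hypothetical stand-ins, and the map is reduced to "device id to list of managers".

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a hints manager: where it stores hints
// and how much space it currently uses.
struct hints_manager_stub {
    std::uint64_t device_id;
    std::uint64_t used_bytes;
};

class per_device_limits {
    std::map<std::uint64_t, std::vector<const hints_manager_stub*>> _managers_by_device;
public:
    // Non-idempotent variant: walks every registered manager on each call,
    // so managers seen in an earlier call are appended again.
    void prepare_all(const std::vector<const hints_manager_stub*>& registered) {
        for (const hints_manager_stub* m : registered) {
            _managers_by_device[m->device_id].push_back(m);
        }
    }

    // Fixed variant: handles exactly the manager passed in, once.
    void prepare_one(const hints_manager_stub& m) {
        _managers_by_device[m.device_id].push_back(&m);
    }

    // Total space accounted to a device.
    std::uint64_t used_on(std::uint64_t device_id) const {
        std::uint64_t total = 0;
        auto it = _managers_by_device.find(device_id);
        if (it != _managers_by_device.end()) {
            for (const hints_manager_stub* m : it->second) {
                total += m->used_bytes;
            }
        }
        return total;
    }
};

// Usage example: two managers on device 1, registered one after another.
void example_double_accounting() {
    hints_manager_stub regular{1, 100};
    hints_manager_stub views{1, 50};

    per_device_limits buggy;
    buggy.prepare_all({&regular});          // first registration
    buggy.prepare_all({&regular, &views});  // second registration re-adds `regular`
    assert(buggy.used_on(1) == 250);        // 100 counted twice

    per_device_limits fixed;
    fixed.prepare_one(regular);
    fixed.prepare_one(views);
    assert(fixed.used_on(1) == 150);
}
```

Because the inflated usage is compared against a fixed quota, counting a manager twice has the same effect as lowering the quota, which matches the symptom the commit message describes.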
Piotr Dulikowski	a4f03d72b3	hints/resource_manager: allow registering managers after start This change modifies db::hints::resource_manager so that it is now possible to add hints::managers after it was started. This change will make it possible to register the regular hints manager later in runtime, if it wasn't enabled at boot time.	2020-11-17 10:15:47 +01:00
Piotr Sarna	180a1505fd	hints: track resource_manager sending queue length The number of tasks waiting for a hint to be sent is now tracked.	2020-08-11 17:43:53 +02:00
Benny Halevy	a96087165a	hints: get_device_id: use seastar file_stat This avoids potential use-after-move, since undefined c++ sequencing order may std::move(f) in the lambda capture before evaluating f.stat(). Also, this makes use of a more generic library function that doesn't require to open and hold on to the file in the application. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20200514152054.162168-1-bhalevy@scylladb.com>	2020-05-15 10:11:45 +02:00
Avi Kivity	88ade3110f	treewide: replace calls to engine().some_api() with some_api() This removes the need to include reactor.hh, a source of compile time bloat. In some places, the call is qualified with seastar:: in order to resolve ambiguities with a local name. Includes are adjusted to make everything compile. We end up having 14 translation units including reactor.hh, primarily for deprecated things like reactor::at_exit(). Ref #1	2020-04-05 12:46:04 +03:00
Avi Kivity	1799cfa88a	logalloc: use namespace-scope seastar::idle_cpu_handler and related rather than reactor scope This allows us to drop a #include <reactor.hh>, reducing compile time. Several translation units that lost access to required declarations are updated with the required includes (this can be an include of reactor.hh itself, in case the translation unit that lost it got it indirectly via logalloc.hh) Ref #1.	2020-04-05 12:45:08 +03:00
Pavel Emelyanov	d1775dd701	utils: Move disk-error-handler into it The disk-error-handler is purely auxiliary thing that helps propagating IO errors to the rest of the code. It well deserves not sitting in the root namespace. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200207112443.18475-1-xemul@scylladb.com>	2020-02-09 17:26:52 +02:00
Piotr Sarna	9c5a5a5ac2	treewide: add names to semaphores By default, semaphore exceptions bring along very little context: either that a semaphore was broken or that it timed out. In order to make debugging easier without introducing significant runtime costs, a notion of named semaphore is added. A named semaphore is simply a semaphore with statically defined name, which is present in its errors, bringing valuable context. A semaphore defined as: auto sem = semaphore(0); will present the following message when it breaks: "Semaphore broken" However, a named semaphore: auto named_sem = named_semaphore(0, named_semaphore_exception_factory{"io_concurrency_sem"}); will present a message with at least some debugging context: "Semaphore broken: io_concurrency_sem" It's not much, but it would really help in pinpointing bugs without having to inspect core dumps. At the same time, it does not incur any costs for normal semaphore operations (except for its creation), but instead only uses more CPU in case an error is actually thrown, which is considered rare and not to be on the hot path. Refs #4999 Tests: unit(dev), manual: hardcoding a failure in view building code	2019-11-26 15:14:21 +02:00
Vlad Zolotarov	d253846c91	hinted handoff: fix a race on a directory removal between space_watchdog and drain_for() The endpoint directories scanned by space_watchdog may get deleted by the manager::drain_for(). If a deleted directory is given to a lister::scan_dir() this will end up in an exception and as a result a space_watchdog will skip this round and hinted handoff is going to be disabled (for all agents including MVs) for the whole space_watchdog round. Let's make sure this doesn't happen by serializing the scanning and deletion using end_point_hints_manager::file_update_mutex. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2019-08-20 11:46:46 -04:00
Vlad Zolotarov	b34c36baa2	hinted handoff: make taking file_update_mutex safe end_point_hints_manager::file_update_mutex is taken by space_watchdog but while space_watchdog is waiting for it the corresponding end_point_hints_manager instance may get destroyed by manager::drain_for() or by manager::stop(). This will end up in a use-after-free event. Let's change the end_point_hints_manager's API in a way that would prevent such an unsafe locking: - Introduce the with_file_update_mutex(). - Make end_point_hints_manager::file_update_mutex() method private. Fixes #4685 Fixes #4836 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2019-08-20 11:26:19 -04:00
Benny Halevy	ff4d8b6e85	treewide: use std::filesystem Rather than {std::experimental,boost,seastar::compat}::filesystem On Sat, 2019-03-23 at 01:44 +0200, Avi Kivity wrote: > The intent for seastar::compat was to allow the application to choose > the C++ dialect and have seastar follow, rather than have seastar choose > the types and have the application follow (as in your patch). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-03-28 14:21:10 +02:00
Benny Halevy	857ff4f59a	database: directly use std::experimental::filesystem::path for lister::path Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2018-12-02 22:02:10 +02:00
Benny Halevy	585ac6e641	database: use std::experimental::filesystem::path for lister::path We would like to get rid of boost::filesystem and gradually replace it with std::experimental::filesystem. TODO: using namespace fs = std::experimental::filesystem, use fs::path directly, rather than lister::path Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2018-12-02 22:02:10 +02:00
Vlad Zolotarov	aca0882a3f	hinted handoff: enable storing hints before starting messaging_service When messaging_service is started we may immediately receive a mutation from another node (e.g. in the MV update context). If hinted handoff is not ready to store hints at that point we may fail some of MV updates. We are going to resolve this by start()ing hints::managers before we start messaging_service and blocking hints replaying until all relevant objects are initialized. Refs #3828 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-10-18 16:49:58 -04:00
Duarte Nunes	6dcb7a39d4	db/hints/manager: Move decision about blocking hints to the manager The space_watchdog enables or disables hints for the managers associated with a particular device. We encapsulate this decision inside the hints::managers by introducing the update_backlog() function. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-10-16 20:35:00 +01:00
Duarte Nunes	207c9c8e38	db/hints/resource_manager: Correctly account resources in space_watchdog A db::hints::resource_manager manages the resources for one or two db::hints::managers. Each of these can be using the same or different devices. The db::hints::space_watchdog periodically checks whether each manager is within their resource allocation, and if not disables it. The watchdog iterates over the managers and accounts for the total size they are using. This is wrong, since it can account in the same variable the size consumed by managers using different devices. We fix this while taking advantage of the fact that on_timer is now called in the context of a seastar::thread, instead of using future combinators. Fixes #3821 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-10-16 20:34:54 +01:00
Duarte Nunes	25d266bdc1	db/hints/resource_manager: Replace timer with seastar::thread Will make on_timer() much simpler to allow fixing a bug in subsequent patches. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-10-16 20:32:16 +01:00
Duarte Nunes	278aa13bb0	db/hints/resource_manager: Ensure managers are correctly registered Registering a manager for a new device used std::unordered_map::emplace(), which may not insert the specified value if one with the same key has already been added. This could happen if both managers were using the same device and the fiber deferred in-between adding them. Found during code reading. Could cause hints to not be disabled for an overloaded manager. Fixes #3822 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-10-16 20:32:16 +01:00
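The `std::unordered_map::emplace()` behaviour this commit refers to is easy to demonstrate in isolation. The sketch below is not the actual patch: the type aliases and the functions `register_with_emplace`, `register_with_try_emplace` and `example_registration` are hypothetical, and managers are represented by plain strings. It shows one way to get "create the entry if missing, then always add the manager" semantics, using `try_emplace()`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

using device_id = unsigned long;
using manager_set = std::unordered_set<std::string>;
using device_map = std::unordered_map<device_id, manager_set>;

// Pitfall: if `dev` is already present, emplace() leaves the existing
// entry untouched and the new manager is silently dropped.
// Returns whether an insertion actually took place.
bool register_with_emplace(device_map& devices, device_id dev, const std::string& mgr) {
    return devices.emplace(dev, manager_set{mgr}).second;
}

// One possible remedy: obtain the entry (creating an empty one if needed),
// then add the manager to it unconditionally.
void register_with_try_emplace(device_map& devices, device_id dev, const std::string& mgr) {
    auto it = devices.try_emplace(dev).first;
    it->second.insert(mgr);
}

// Usage example: two managers sharing device 1.
void example_registration() {
    device_map lossy;
    assert(register_with_emplace(lossy, 1, "regular"));
    assert(!register_with_emplace(lossy, 1, "views"));  // not inserted
    assert(lossy.at(1).size() == 1);

    device_map complete;
    register_with_try_emplace(complete, 1, "regular");
    register_with_try_emplace(complete, 1, "views");
    assert(complete.at(1).size() == 2);
}
```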
Duarte Nunes	9e3b09cf48	db/hints/resource_manager: Fix formatting Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-10-16 20:32:16 +01:00
Piotr Sarna	828497ad19	hints: amend a comment in device limits To make the comment less confusing, 'group of managers' is used instead of 'device'. Refs #3516 Reported-by: Vlad Zolotarov <vladz@scylladb.com> Signed-off-by: Piotr Sarna <sarna@scylladb.com> Message-Id: <60c9ab6b47195570f7ce7dff9556e3739b7ae00f.1529862547.git.sarna@scylladb.com>	2018-06-24 19:14:59 +01:00
Piotr Sarna	8b43ac3a57	hints: reserve more space for dedicated storage Reserving 10% of space for hints managers makes sense if the device is shared with other components (like /data or /commitlog). But, if hints directory is mounted on a dedicated storage, it makes sense to reserve much more - 90% was chosen as a sane limit. Whether storage is 'dedicated' or not is based on a simple check if given hints directory is a mount point. Fixes #3516 Signed-off-by: Piotr Sarna <sarna@scylladb.com>	2018-06-22 10:27:00 +02:00
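The 10% / 90% rule from the commit above, together with a simple mount-point test, might look roughly like the sketch below. It is an assumption-laden illustration rather than the code in the tree: `device_id_of`, `hints_quota` and `example_quota` are hypothetical names, and this `is_mountpoint` is only one plausible way to implement such a check (comparing the POSIX `st_dev` of a directory with that of its parent), which does not detect bind mounts within the same device.

```cpp
#include <sys/stat.h>  // POSIX stat
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Returns the ID of the device that holds `p`; throws if stat() fails.
dev_t device_id_of(const fs::path& p) {
    struct stat st{};
    if (::stat(p.c_str(), &st) != 0) {
        throw std::system_error(errno, std::generic_category(), "stat " + p.string());
    }
    return st.st_dev;
}

// A directory is treated as a mount point if it is the filesystem root or
// lives on a different device than its parent. Assumes `p` exists.
bool is_mountpoint(const fs::path& p) {
    const fs::path resolved = fs::canonical(p);
    const fs::path parent = resolved.parent_path();
    if (parent == resolved) {
        return true;  // "/" is its own parent
    }
    return device_id_of(resolved) != device_id_of(parent);
}

// 90% of the device for dedicated storage, 10% when the device is shared.
std::uint64_t hints_quota(std::uint64_t capacity_bytes, bool dedicated) {
    const std::uint64_t percent = dedicated ? 90 : 10;
    return capacity_bytes / 100 * percent;
}

// Usage example with a 1000-byte "device".
void example_quota() {
    assert(hints_quota(1000, false) == 100);
    assert(hints_quota(1000, true) == 900);
    assert(is_mountpoint("/"));
}
```

A caller would pass `is_mountpoint(hints_dir)` as the `dedicated` argument, so a hints directory mounted on its own storage gets the larger share.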
Piotr Sarna	32f86ca61e	hints: add is_mountpoint function A helper function that checks whether a path is also a mount point is added. Signed-off-by: Piotr Sarna <sarna@scylladb.com>	2018-06-22 10:26:52 +02:00
Piotr Sarna	b6c1b8c5ef	hints: make space_watchdog device-aware Instead of having one static space limit for all directories, space_watchdog now keeps a per-device limit, shared among hints managers residing on the same disks. References #3516 Signed-off-by: Piotr Sarna <sarna@scylladb.com>	2018-06-22 10:26:45 +02:00


54 Commits