scylladb

Author	SHA1	Message	Date
Kefu Chai	3e84d43f93	treewide: use seastar::format() or fmt::format() explicitly before this change, we rely on `using namespace seastar` to use `seastar::format()` without qualifying the `format()` with its namespace. this works fine until we changed the parameter type of the format string of `seastar::format()` from `const char*` to `fmt::format_string<...>`. this change practically invited `seastar::format()` to the club of `std::format()` and `fmt::format()`, where all members accept a templated parameter as its `fmt` parameter. and `seastar::format()` is not the best candidate anymore. although argument-dependent lookup (ADL for short) favors the function which is in the same namespace as its parameter, `using namespace` makes `seastar::format()` more competitive, so both `std::format()` and `seastar::format()` are considered as the candidates. that is what is happening in scylladb in quite a few caller sites of `format()`, hence ADL is not able to tell which function is the winner in the name lookup: ``` /__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous 265 \| return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id()); \| ^~~~~~ /usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 4290 \| format(format_string<_Args...> __fmt, _Args&&... __args) \| ^ /__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>] 143 \| format(fmt::format_string<A...> fmt, A&&... a) { \| ^ ``` in this change, we change all `format()` to either `fmt::format()` or `seastar::format()` with following rules: - if the caller expects an `sstring` or `std::string_view`, change to `seastar::format()` - if the caller expects an `std::string`, change to `fmt::format()`. because, `sstring::operator std::basic_string` would incur a deep copy. we will need another change to enable scylladb to compile with the latest seastar. namely, to pass the format string as a templated parameter down to helper functions which format their parameters. to minimize the scope of this change, let's include that change when bumping up the seastar submodule. as that change will depend on the seastar change. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-09-11 23:21:40 +03:00
Pavel Emelyanov	4b86eede1f	task_manager: Print task ttl on start (for debugging) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-08-22 14:08:21 +03:00
Avi Kivity	aa1270a00c	treewide: change assert() to SCYLLA_ASSERT() assert() is traditionally disabled in release builds, but not in scylladb. This hasn't caused problems so far, but the latest abseil release includes a commit [1] that causes a 1000 insn/op regression when NDEBUG is not defined. Clearly, we must move towards a build system where NDEBUG is defined in release builds. But we can't just define it blindly without vetting all the assert() calls, as some were written with the expectation that they are enabled in release mode. To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT() macro in utils/assert.hh. This macro is always defined and is not conditional on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release mode. [1] `66ef711d68` Closes scylladb/scylladb#20006	2024-08-05 08:23:35 +03:00
Aleksandra Martyniuk	20ba7ceff9	tasks: api: add virtual tasks support to get_tasks task_manager/list_module_tasks/{module} starts supporting virtual tasks, which means that their stats will also be shown for users. Additional task_kind param is added to indicate whether the task is virtual (cluster-wide) or regular (node-wide). Support in other paths will be added in following patches.	2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk	1d85b319e0	tasks: add task_handler to hide task and virtual_task differences from user Contrary to regular tasks, which are per-operation, virtual tasks are associated with the whole group of operations. There may be many operations of each group performed at the same time. Info about each running operation will be shown to a user through the API. For virtual tasks, task manager imitates a regular task covering each operation, but task_manager::tasks aren't actually created in the memory. Instead, information (e.g. status) about the operation is retrieved from associated service and passed to a user. To hide most of the differences from user, task_handler class is created. Task handler performs appropriate actions depending on task's kind. However, users need to stay conscious about the kind of task, because: - get_task_status and wait_task do not unregister virtual tasks; - time for which a virtual task stays in task manager depends on associated service and tasks' implementation; - number of virtual task's children shown by get_tasks doesn't have to be monotonic. API is modified to use task_handler. API-specific classes are moved to task_handler.{cc,hh}.	2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk	abde7ba271	tasks: modify invoke_on_task Modify task_manager::invoke_on_task to also check virtual tasks.	2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk	6029936665	tasks: implement task_manager::virtual_task::impl::get_children Return a vector of task_identity of all children of a virtual task in a cluster.	2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk	9de8d4b5b0	tasks: keep virtual tasks in task manager Virtual tasks are kept in task manager together with regular tasks. All virtual tasks are stored on shard 0. task_manager::module::make_task is modified to consider virtual tasks as possible parents.	2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk	00cfc49d18	tasks: introduce task_manager::virtual_task A virtual task is a new kind of task supported by task manager, which covers cluster-wide operations. From users' perspective virtual tasks behave similarly to task_manager::tasks. The API side of virtual tasks will be covered in the following patches. Contrary to task_manager::task, virtual task does not update its fields proactively. Moreover, no object is kept in memory for each individual virtual task's operation. Instead a service (or services) is queried on API user's demand to learn about the status of running operation. Hence the name. task_manager::virtual_task is responsible for a whole group of virtual tasks, i.e. for tracking and generating statuses of all operations of similar type. To enable tracking of some kind of operations, one needs to override task_manager::virtual_task::impl and provide implementations of the methods returning appropriate information about the operations. task_manager::virtual_task must be kept on shard 0. Similarly to task_manager::tasks, virtual tasks can have child tasks, responsible for tracking suboperations' progress. But virtual tasks cannot have parents - they are always roots in task trees. Some methods and structs will be implemented in later patches.	2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk	50cb797d95	test: add test for abort while a task is being unregistered	2024-06-18 13:41:51 +02:00
Aleksandra Martyniuk	3463f495b1	tasks: fix tasks abort Currently if task_manager::task::impl::abort preempts before children are recursively aborted and then the task gets unregistered, we hit use after free since abort uses children vector which is no longer alive. Modify abort method so that it goes over all tasks in task manager and aborts those with the given parent. Fixes: #19304.	2024-06-18 13:39:29 +02:00
Aleksandra Martyniuk	a82a2f0624	tasks: unregister tasks with parents when they are finished Unregister children that are finished from task manager. They can be examined through their parents.	2024-05-31 10:27:09 +02:00
Aleksandra Martyniuk	e6c50ad2d0	tasks: fold finished tasks into their parents Currently, when a child task is unregistered, it is still kept by its parent. This leads to excessive memory usage, especially when the tasks are configured to be kept in task manager after they are finished (task_ttl_in_seconds). Introduce task_essentials struct which keeps only data necessary for task manager API. When a task which has a parent is finished, a foreign pointer to it in its parent is replaced with respective task_essentials. Once a parent task is finished it is also folded into its parent (if it has one). Children details of a folded task are lost, unless they (or some of their subtrees) failed. That is, when a task is finished, we keep: - a root task (until it is unregistered); - task_essentials of root's direct children; - a path (of task_essentials) from root to each failed task (so that the reason of a failure could be examined).	2024-05-31 10:27:09 +02:00
Aleksandra Martyniuk	319e799089	tasks: make task_manager::task::impl::finish_failed noexcept	2024-05-31 10:27:09 +02:00
Aleksandra Martyniuk	6add9edf8a	tasks: change _children type Keep task children in a map. It's a preparation for further changes.	2024-05-31 10:27:09 +02:00
Kefu Chai	e62b29bab7	tasks: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17125	2024-02-02 15:20:40 +01:00
Aleksandra Martyniuk	6b2b384c83	tasks: don't keep internal root tasks after they complete	2024-01-09 13:13:54 +01:00
Aleksandra Martyniuk	6f13e55187	tasks: call release_resources when task is finished Call task_manager::task::impl::release_resources when task is finished instead of putting the responsibility on user. Closes scylladb/scylladb#16660	2024-01-09 11:41:54 +02:00
Aleksandra Martyniuk	9b9ea1193c	tasks: keep task's children in list If std::vector is resized its iterators and references may get invalidated. While task_manager::task::impl::_children's iterators are avoided throughout the code, references to its elements are being used. Since children vector does not need random access to its elements, change its type to std::list<foreign_task_ptr>, whose iterators and references aren't invalidated on element insertion. Fixes: #16380. Closes scylladb/scylladb#16381	2023-12-13 10:47:27 +02:00
Aleksandra Martyniuk	c74b3ec596	tasks: fail if a task was aborted run() method of task_manager::task::impl does not have to throw when a task is aborted with task manager api. Thus, a user will see that the task finished successfully which makes it inconsistent. Finish a task with a failure if it was aborted with task manager api.	2023-11-24 15:45:00 +01:00
Botond Dénes	0ae1335daa	Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk" This reverts commit `11cafd2fc8`, reversing changes made to `2bae14f743`. Reverting because this series causes frequent CI failures, and the proposed quickfix causes other failures of its own. Fixes: #16113	2023-11-22 17:44:07 +02:00
Aleksandra Martyniuk	2a9ee59cc4	tasks: fail if a task was aborted run() method of task_manager::task::impl does not have to throw when a task is aborted with task manager api. Thus, a user will see that the task finished successfully which makes it inconsistent. Finish a task with a failure if it was aborted with task manager api.	2023-11-13 16:06:20 +01:00
Botond Dénes	1cccc86813	Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk" This reverts commit `2860d43309`, reversing changes made to `a3621dbd3e`. Reverting because rest_api.test_compaction_task started failing after this was merged. Fixes: #16005	2023-11-09 10:43:11 +01:00
Botond Dénes	2860d43309	Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk Compaction tasks which do not have a parent are abortable through task manager. Their children are aborted recursively. Compaction tasks of the lowest level are aborted using existing compaction task executors stopping mechanism. Closes scylladb/scylladb#15083 * github.com:scylladb/scylladb: test: test abort of compaction task that isn't started yet test: test running compaction task abort tasks: fail if a task was aborted compaction: abort task manager compaction tasks	2023-11-08 08:45:16 +02:00
Pavel Emelyanov	7fa7a9495d	task_manager: Don't leave task_ttl uninitialized When task_manager is constructed without config (tests) its task_ttl is left uninitialized (i.e. -- random number gets in there). This results in tasks hanging around being registered for infinite amount of time making long-living task manager look hanged. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#15859	2023-10-30 20:15:05 +02:00
Aleksandra Martyniuk	b91064bd2a	tasks: fail if a task was aborted run() method of task_manager::task::impl does not have to throw when a task is aborted with task manager api. Thus, a user will see that the task finished successfully which makes it inconsistent. Finish a task with a failure if it was aborted with task manager api.	2023-10-19 10:47:20 +02:00
Aleksandra Martyniuk	198119f737	compaction: add get_progress method to compaction_task_impl compaction_task_impl::get_progress is used by the lowest level compaction tasks whose progress can be taken from compaction_progress_monitor.	2023-10-12 17:16:05 +02:00
Aleksandra Martyniuk	f42be12f43	repair: release resources of shard_repair_task_impl Before integration with task manager the state of one shard repair was kept in repair_info. repair_info object was destroyed immediately after shard repair was finished. In an integration process repair_info's fields were moved to shard_repair_task_impl as the two served similar purposes. However, shard_repair_task_impl isn't immediately destroyed, but is kept in task manager for task_ttl seconds after it's complete. Thus, some of repair_info's fields have their lifetime prolonged, which makes the repair state change delayed. Release shard_repair_task_impl resources immediately after shard repair is finished. Fixes: #15505. Closes scylladb/scylladb#15506	2023-09-26 17:09:47 +03:00
Aleksandra Martyniuk	d799adc536	tasks: change task_manager::task::impl::is_internal() Most of the time only the roots of tasks tree should be non internal. Change default implementation of is_internal and delete overrides consistent with it. Closes scylladb/scylladb#15353	2023-09-26 14:49:49 +03:00
Benny Halevy	a5b7f1a275	task_manager: task: start: fixup indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-05 09:17:25 +03:00
Benny Halevy	f9a7635390	task_manager: module: make_task: enter gate when the task is created Passing the gate_closed_exception to the task promise in start() ends up with abandoned exception since no-one is waiting for it. Instead, enter the gate when the task is made so it will fail make_task if the gate is already closed. Fixes scylladb/scylladb#15211 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-05 09:17:25 +03:00
Benny Halevy	51792d2292	task_manager: module: stop: request abort Have a private abort_source for every module and request abort on stop() to signal all outstanding tasks to abort (especially when they are sleeping for the task_ttl). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-05 09:17:25 +03:00
Benny Halevy	d7205db863	task_manager: task::impl: subscribe to module abort_source Rather than to the top-level task_manager abort_source, to provide separation between task_manager modules so each one can be aborted and stopped independently of the others (in the next patch). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-09-05 09:17:25 +03:00
Aleksandra Martyniuk	5e31ca7d20	tasks: api: show tasks' scopes To make manual analysis of task manager tasks easier, task_status and task_stats contain operation scope (e.g. shard, table). Closes #15172	2023-08-29 11:32:16 +03:00
Kefu Chai	63b32cbdb4	tasks: s/stoppping/stopping/ fix a typo Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #15103	2023-08-21 22:28:38 +03:00
Aleksandra Martyniuk	d624be4e6b	tasks: modify task_manager::task::impl::get_progress method Modify task_manager::task::impl::get_progress method so that, whenever relevant, progress is calculated based on children's progress. Otherwise progress indicates only whether the task is finished or not.	2023-06-29 11:30:26 +02:00
Aleksandra Martyniuk	0278b21e76	tasks: add is_complete method Add is_complete method to task_manager::task::impl and task_manager::task.	2023-06-29 11:02:14 +02:00
Kefu Chai	0cb842797a	treewide: do not define/capture unused variables these warnings are found by Clang-17 after removing `-Wno-unused-lambda-capture` and '-Wno-unused-variable' from the list of disabled warnings in `configure.py`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-02-15 22:57:18 +02:00
Aleksandra Martyniuk	6233823cc7	tasks: add task_manager constructor without arguments Sometimes, e.g. for tests, we may need to create task_manager without main-specific arguments.	2023-02-03 13:52:30 +01:00
Botond Dénes	2612f98a6c	Merge 'Abort repair tasks' from Aleksandra Martyniuk Aborting of repair operation is fully managed by task manager. Repair tasks are aborted: - on shutdown; top level repair tasks subscribe to global abort source. On shutdown all tasks are aborted recursively - through node operations (applies to data_sync_repair_task_impls and their descendants only); data_sync_repair_task_impl subscribes to node_ops_info abort source - with task manager api (top level tasks are abortable) - with storage_service api and on failure; these cases were modified to be aborted the same way as the ones from above are. Closes #12085 * github.com:scylladb/scylladb: repair: make top level repair tasks abortable repair: unify a way of aborting repair operations repair: delete sharded abort source from node_ops_info repair: delete unused node_ops_info from data_sync_repair_task_impl repair: delete redundant abort subscription from shard_repair_task_impl repair: add abort subscription to data sync task tasks: abort tasks on system shutdown	2023-01-05 15:21:35 +01:00
Botond Dénes	2ef71e9c70	Merge 'Improve verbosity of task manager api' from Aleksandra Martyniuk The PR introduces changes to task manager api: - extends tasks' list returned with get_tasks with task type, keyspace, table, entity, and sequence number - extends status returned with get_task_status and wait_task with a list of children's ids Closes #12338 * github.com:scylladb/scylladb: api: extend status in task manager api api: extend get_tasks in task manager api	2023-01-03 10:39:41 +02:00
Michał Chojnowski	5e79d6b30b	tasks: task_manager: move invoke_on_task<> to .hh invoke_on_task is used in translation units where its definition is not visible, yet it has no explicit instantiations. If the compiler always decides to inline the definition, not to instantiate it implicitly, linking invoke_on_task will fail. (It happened to me when I turned up inline-threshold). Fix that. Closes #12387	2022-12-28 10:55:43 +02:00
Aleksandra Martyniuk	ee13a5dde8	api: extend status in task manager api Status of tasks returned with get_task_status and wait_task is extended with the list of ids of child tasks.	2022-12-21 10:54:56 +01:00
Aleksandra Martyniuk	2b35d7df1b	tasks: abort tasks on system shutdown When the system shuts down, all task manager's top level tasks are aborted. Responsibility for aborting child tasks is on their parents.	2022-12-19 15:57:35 +01:00
Aleksandra Martyniuk	5bc09daa7a	tasks: repair: api: remove type attribute from task_manager::task::status	2022-12-15 10:49:09 +01:00
Aleksandra Martyniuk	8d5377932d	tasks: add type() method to task_manager::task::impl	2022-12-15 10:41:58 +01:00
Aleksandra Martyniuk	8bc0af9e34	repair: fix double start of data sync repair task Currently, each data sync repair task is started (and hence run) twice. Thus, when two running operations happen within a time frame long enough, the following situation may occur: - the first run finishes - after some time (ttl) the task is unregistered from the task manager - the second run finishes and attempts to finish the task which does not exist anymore - memory access causes a segfault. The second call to start is deleted. A check is added to the start method to ensure that each task is started at most once. Fixes: #12089 Closes #12090	2022-11-29 00:00:10 +02:00
Aleksandra Martyniuk	9a3d114349	tasks: move methods from task_manager to source file Methods from tasks::task_manager and nested classes are moved to source file. Closes #12064	2022-11-27 15:09:28 +02:00
Aleksandra Martyniuk	ec86410094	task_manager: test api layer implementation The implementation of a test api that helps testing task manager api. It provides methods to simulate the operations that can happen on modules and their tasks. Through the api user can: register and unregister the test module and the tasks belonging to the module, and finish the tasks with success or custom error.	2022-09-09 14:29:28 +02:00
Aleksandra Martyniuk	2439e55974	task_manager: create task manager object Implementation of a task manager that allows tracking and managing asynchronous tasks. The tasks are represented by task_manager::task class providing members common to all types of tasks. The methods that differ among tasks of different module can be overridden in a class inheriting from task_manager::task::impl class. Each task stores its status containing parameters like id, sequence number, begin and end time, state etc. After the task finishes, it is kept in memory for configurable time or until it is unregistered. Tasks need to be created with make_task method. Each module is represented by task_manager::module type and should have an access to task manager through task_manager::module methods. That allows to easily separate and collectively manage data belonging to each module.	2022-09-09 14:29:28 +02:00
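
The ambiguity described in commit 3e84d43f93 can be reproduced in miniature without Seastar or {fmt}. The sketch below is only an illustration: `lib_a`, `lib_b`, `name_view`, `describe_a` and `describe_b` are hypothetical stand-ins, and the toy `format()` functions just tag the format string instead of formatting anything. It shows why an unqualified call becomes ambiguous once two namespaces offer equally good templated overloads, and how explicit qualification resolves it.

```cpp
#include <string>
#include <string_view>

// Hypothetical library A: owns the argument type, so argument-dependent
// lookup (ADL) makes lib_a::format() a candidate for unqualified calls.
namespace lib_a {
struct name_view { std::string_view text; };

template <typename... Args>
std::string format(std::string_view fmt, Args&&...) {
    return "lib_a:" + std::string(fmt);
}
} // namespace lib_a

// Hypothetical library B: offers a format() with an identical signature.
namespace lib_b {
template <typename... Args>
std::string format(std::string_view fmt, Args&&...) {
    return "lib_b:" + std::string(fmt);
}
} // namespace lib_b

// Mirrors `using namespace seastar`: lib_b::format() is now found by
// ordinary unqualified lookup as well.
using namespace lib_b;

std::string describe_a(const lib_a::name_view& n) {
    // return format("{}", n);   // would not compile: call to 'format' is ambiguous
    return lib_a::format("{}", n);   // explicit qualification picks one overload
}

std::string describe_b(const lib_a::name_view& n) {
    return lib_b::format("{}", n);   // qualified names also suppress ADL
}
```

With the qualified calls, `describe_a` always resolves to `lib_a::format()` and `describe_b` to `lib_b::format()`, regardless of which using-directives are in effect at the call site.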

50 Commits
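
Commit 9b9ea1193c relies on a standard-library guarantee: inserting into a `std::list` does not invalidate references to existing elements, whereas growing a `std::vector` may. The following is a minimal sketch of that guarantee, not the task manager code; `child_entry` and `list_reference_survives_growth` are hypothetical names, and `child_entry` merely stands in for the real `foreign_task_ptr` element type.

```cpp
#include <list>

// Hypothetical stand-in for a child task entry.
struct child_entry { int id; };

// Takes a reference to the first element, then inserts many more elements.
// Returns true if the reference still refers to the same, unchanged element.
// std::list guarantees this; std::vector would not once it reallocates.
bool list_reference_survives_growth() {
    std::list<child_entry> children;
    children.push_back({0});
    child_entry& first = children.front();

    for (int i = 1; i < 1000; ++i) {
        children.push_back({i});
    }
    return &first == &children.front() && first.id == 0;
}
```

The trade-off is the loss of random access, which the commit message notes the children container does not need.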