scylladb

Author	SHA1	Message	Date
Aleksandra Martyniuk	2d16083ba6	tasks: fix indentation	2026-03-18 15:37:24 +01:00
Aleksandra Martyniuk	1fbf3a4ba1	tasks: do not fail the wait request if rpc fails During decommission, we first mark a topology request as done, then shut down a node and in the following steps we remove node from the topology. Thus, finished request does not imply that a node is removed from the topology. Due to that, in node_ops_virtual_task::wait, while gathering children from the whole cluster, we may hit the connection exception - because a node is still in topology, even though it is down. Modify the get_children method to ignore the exception and warn about the failure instead.	2026-03-18 15:37:24 +01:00
Aleksandra Martyniuk	d4fdeb4839	tasks: pass token_metadata_ptr to task_manager::virtual_task::impl::get_children In get_children we get the vector of alive nodes with get_nodes. Yet, between this and sending rpc to those nodes there might be a preemption. Currently, the liveness of a node is checked once again before the rpcs (only with gossiper not in topology - unlike get_nodes). Modify get_children, so that it keeps a token_metadata_ptr, preventing topology from changing between get_nodes and rpcs. Remove test_get_children as it checked if the get_children method won't fail if a node is down after get_nodes - which cannot happen currently.	2026-03-18 15:37:24 +01:00
Aleksandra Martyniuk	100ccd61f8	tasks: increase tasks_vt_get_children timeout test_node_ops_tasks.py::test_get_children fails due to timeout of tasks_vt_get_children injection in debug mode. Compared to a successful run, no clear root cause stands out. Extend the message timeout of tasks_vt_get_children from 10s to 60s. Fixes: #28295. Closes scylladb/scylladb#28599	2026-02-18 11:39:19 +03:00
Tomasz Grabiec	7446eb7e8d	tasks, topology: Make pending node operations abortable We want to be able to cancel decommission when it's still in the tablet draining phase. Such a request is in a pending and paused state, and can be safely canceled. We set the node's "draining" flag back to false.	2026-01-18 15:36:05 +01:00
Aleksandra Martyniuk	9039dfa4a5	tasks: service: add global_topology_request_virtual_task Add a service::topo::global_topology_request_virtual_task, which covers the replication factor changes. Currently, the global_topology_request_virtual_task can be aborted only if it is paused. The progress of the rf change isn't counted.	2025-12-16 13:31:22 +01:00
Botond Dénes	30a3f61fa0	Merge 'compaction: handle exception in expected_total_workload' from Aleksandra Martyniuk expected_total_workload methods of scrub compaction tasks create a vector of table_info based on table names. If any table was already dropped, then the exception is thrown. It leaves table_info in corrupted state and node crashes with `free(): invalid size`. Return std::nullopt if an exception was thrown to indicate that total workload cannot be found. Fixes: #25941. No release branches affected Closes scylladb/scylladb#25944 * github.com:scylladb/scylladb: tasks: get progress of failed task based on children compaction: handle exception in expected_total_workload	2025-09-17 15:10:19 +03:00
Aleksandra Martyniuk	3324f08e9c	tasks: get progress of failed task based on children Currently, for failed tasks task_manager::task::impl::get_progress attempts to find expected_total_workload. However, if the task has finished a long time ago, the state might have totally changed, e.g. some tables might have been dropped or have changed their sizes. Due to that, the result of expected_total_workload might be irrelevant. Count the progress of a finished task based on children only, regardless of whether the task has succeeded or failed.	2025-09-16 17:15:01 +02:00
Aleksandra Martyniuk	55fde70f8d	api: tasks: task_manager: keep children identities in chunked_{array,vector} task_status contains a vector of children identities. If the number of children is large, we may hit oversized allocation. Change all types of children-related containers to chunked_vector. Modify the children type returned from task manager API. Fixes: scylladb#25795. Closes scylladb/scylladb#25923	2025-09-15 08:44:16 +03:00
Botond Dénes	6116f9e11b	Merge 'Compaction tasks progress' from Aleksandra Martyniuk Determine the progress of compaction tasks that have children. The progress of a compaction task is calculated using the default get_progress method. If the expected_total_workload method is implemented, the default progress is computed as: (sum of child task progresses) / (expected total workload) If expected_total_workload is not defined, progress is estimated based on children progresses. However, in this case, the total progress may increase over time as the task executes. All compaction tasks, except for reshape tasks, implement the expected_children_number method. To compute expected_total_workload, iterate over all SSTables covered by the task and sum their sizes. Note that expected_total_workload is just an approximation and the real workload may differ if SSTables set for the keyspace/table/compaction group changes. Reshape tasks are an exception, as their scope is determined during execution. Hence, for these tasks expected_total_workload isn't defined and their progress (both total and completed) is determined based on currently created children. Fixes: https://github.com/scylladb/scylladb/issues/8392. Fixes: https://github.com/scylladb/scylladb/issues/6406. Fixes: https://github.com/scylladb/scylladb/issues/7845. New feature, no backport needed Closes scylladb/scylladb#15158 * github.com:scylladb/scylladb: test: add compaction task progress test compaction: set progress unit for compaction tasks compaction: find expected workload for reshard tasks compaction: find expected workload for global cleanup compaction tasks compaction: find expected workload for global major compaction tasks compaction: find expected workload for keyspace compaction tasks compaction: find expected workload for shard compaction tasks compaction: find expected workload for table compaction tasks compaction: return empty progress when compaction_size isn't set compaction: update compaction_data::compaction_size at once tasks: do not check expected workload for done task	2025-09-03 13:23:42 +03:00
Aleksandra Martyniuk	7fe1ad1f63	tasks: return task::impl from make_and_start_task Currently, make_and_start_task returns a pointer to task_manager::task that hides the implementation details. If we need to access the implementation (e.g. because we want a task to "return" a value), we need to make and start task step by step openly. Return task_manager::task::impl from make_and_start_task. Use it where possible. Fixes: https://github.com/scylladb/scylladb/issues/22146.	2025-08-29 17:12:07 +02:00
Aleksandra Martyniuk	bd28c50d84	compaction: return empty progress when compaction_size isn't set Currently, progress of compaction task executors is reported in bytes. However, if compaction_size isn't set for compaction task executor, the executor's progress is shown as 1/1 (if it has finished) or 0/1 (otherwise). In the following patches, the progress of executors' parent task will be found based on its children. Hence, to avoid mixing different progress units, the binary progress is no longer used. Return empty progress when compaction_size isn't set. Drop task_manager::task::impl::get_binary_progress as it's no longer used.	2025-08-27 17:51:21 +02:00
Aleksandra Martyniuk	836159b0c3	tasks: do not check expected workload for done task task_manager::task::impl::get_progress checks the expected total workload of a task to find its progress. If a task has finished successfully then its workload is equal to the sum of total progresses of its children. Do not call expected_total_workload for tasks that have finished successfully.	2025-08-27 17:48:25 +02:00
Aleksandra Martyniuk	a7ee2bbbd8	tasks: do not use binary progress for task manager tasks Currently, progress of a parent task depends on expected_total_workload, expected_children_number, and children progresses. Basically, if total workload is known or all children have already been created, progresses of children are summed up. Otherwise binary progress is returned. As a result, two tasks of the same type may return progress in different units. If they are children of the same task and this parent gathers the progress - it becomes meaningless. Drop expected_children_number as we can't assume that children are able to show their progresses. Modify get_progress method - progress is calculated based on children progresses. If expected_total_workload isn't specified, the total progress of a task may grow. If expected_total_workload isn't specified and no children are created, empty progress (0/0) is returned. Fixes: https://github.com/scylladb/scylladb/issues/24650. Closes scylladb/scylladb#25113	2025-07-25 10:45:32 +03:00
Aleksandra Martyniuk	b5026edf49	tasks: change _finished_children type Parent task keeps a vector of statuses (task_essentials) of its finished children. When the children number is large - for example because we have many tables and a child task is created for each table - we may hit oversized allocation while adding a new child's essentials to the vector. Keep task_essentials of children in chunked_vector. Fixes: #25040. Closes scylladb/scylladb#25064	2025-07-22 12:39:00 +02:00
Aleksandra Martyniuk	e178bd7847	test: add test for getting tasks children Add test that checks whether the children of a virtual task will be properly gathered if a node is down.	2025-04-17 13:48:44 +02:00
Aleksandra Martyniuk	53e0f79947	tasks: check whether a node is alive before rpc Check whether a node is alive before making an rpc that gathers children infos from the whole cluster in virtual_task::impl::get_children.	2025-04-17 12:51:22 +02:00
Benny Halevy	bfdd8a98ca	task_manager: module: use named gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-04-12 11:29:48 +03:00
Kefu Chai	4c1f1baab4	tasks: make release_resources() a coroutine Convert tasks::task_manager::task::impl::release_resources() to a coroutine to prepare for upcoming changes that will implement asynchronous resource release. This is a preparatory refactoring that enables future coroutine-based implementation of resource cleanup logic. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-02-14 11:13:58 +08:00
Aleksandra Martyniuk	fe02555c46	tasks: drop task_manager::config::broadcast_address as it's unused	2025-02-05 10:11:54 +01:00
Aleksandra Martyniuk	e16b413568	tasks: replace ip with host_id in task_identity Replace ip with host_id in task_identity. Translate host_id to ip in task manager api handlers. Use host_id in send_tasks_get_children.	2025-02-05 10:11:52 +01:00
Aleksandra Martyniuk	4470c2f6d3	tasks: keep host_id in task_manager Keep host_id of a node in task manager. If host_id wasn't resolved yet, task manager will keep an empty id. It's a preparation for the following changes.	2025-02-05 10:10:29 +01:00
Aleksandra Martyniuk	7969e98b4e	tasks: move tasks_get_children to IDL	2025-02-05 10:10:29 +01:00
Aleksandra Martyniuk	683176d3db	tasks: add shard, start_time, and end_time to task_stats task_stats contains short info about a task. To get a list of task_stats in the module, one needs to request /task_manager/list_module_tasks/{module}. To make identification and navigation between tasks easier, extend task_stats to contain shard, start_time, and end_time. Closes scylladb/scylladb#22351	2025-02-04 12:11:24 +02:00
Aleksandra Martyniuk	18cc79176a	api: task_manager: do not unregister tasks on get_status Currently, /task_manager/task_status_recursive/{task_id} and /task_manager/task_status/{task_id} unregister the queried task if it has already finished. The status should not disappear after being queried. Do not unregister a finished task when its status or recursive status is queried.	2025-01-27 11:23:45 +01:00
Aleksandra Martyniuk	14dcaecc29	tasks: children of virtual tasks aren't internal by default Currently, streaming_task_impl is the only existing child of any virtual task. It overrides the is_internal definition so that it is non-internal even though it has a parent. This should apply to all children of all virtual tasks. Modify task_manager::task::impl::is_internal so that children of virtual tasks aren't internal by default.	2025-01-10 10:03:08 +01:00
Aleksandra Martyniuk	5a948d3fac	tasks: initialize shard in task_info ctor Initialize shard in task_info constructor. All current usages do not care about the shard of an empty task_info. In the following patches we may need that for setting info about virtual task parent.	2025-01-10 10:03:08 +01:00
Aleksandra Martyniuk	24bbd161fd	tasks: add suspended task state Add suspended task state. It will be used for revoke resize requests.	2025-01-10 10:03:08 +01:00
Aleksandra Martyniuk	3f6b932362	tasks: add task_manager::get_nodes Move an implementation of node_ops::task_manager_module::get_nodes to task_manager::get_nodes, so that it can be reused by other modules.	2025-01-10 10:03:07 +01:00
Aleksandra Martyniuk	5dfac9290c	tasks: drop noexcept from module::get_nodes	2025-01-10 10:03:07 +01:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Avi Kivity	fe9fcdfe30	task_manager.hh: replace boost ranges with std ranges Standardize on one range library to reduce dependency load. Unfortunately, std::views::concat (the replacement for boost::join), is C++26 only. We use two separate inserts to the result vector to compensate, and rationalize it by saying that boost::join() is likely slow due to the need for type-erasure. Closes scylladb/scylladb#21834	2024-12-16 13:08:02 +02:00
Aleksandra Martyniuk	215a15d103	service: tasks: make get_table_id a method of virtual_task_hint	2024-12-11 15:17:08 +01:00
Aleksandra Martyniuk	0caffd67f8	service: tasks: extend virtual_task_hint Extend virtual_task_hint to contain task_type and tablet_id. These fields would be used by tablet_virtual_task in the following patches.	2024-12-11 15:15:28 +01:00
Kefu Chai	bab12e3a98	treewide: migrate from boost::adaptors::transformed to std::views::transform now that we are allowed to use C++23. we now have the luxury of using `std::views::transform`. in this change, we: - replace `boost::adaptors::transformed` with `std::views::transform` - use `fmt::join()` when appropriate where `boost::algorithm::join()` is not applicable to a range view returned by `std::view::transform`. - use `std::ranges::fold_left()` to accumulate the range returned by `std::view::transform` - use `std::ranges::fold_left()` to get the maximum element in the range returned by `std::view::transform` - use `std::ranges::min()` to get the minimal element in the range returned by `std::view::transform` - use `std::ranges::equal()` to compare the range views returned by `std::view::transform` - remove unused `#include <boost/range/adaptor/transformed.hpp>` - use `std::ranges::subrange()` instead of `boost::make_iterator_range()`, to feed `std::views::transform()` a view range. to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. limitations: there are still a couple places where we are still using `boost::adaptors::transformed` due to the lack of a C++23 alternative for `boost::join()` and `boost::adaptors::uniqued`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21700	2024-12-03 09:41:32 +02:00
Aleksandra Martyniuk	409ed508cc	service: add tablet_virtual_task Add tablet_virtual_task, which covers tablet repair.	2024-11-28 11:42:38 +01:00
Aleksandra Martyniuk	898c8f4e24	tasks: utilize preliminary virtual task lookup When API user requests status of a virtual task, we first need to find which virtual_task instance tracks given operation. While doing this we gather some info regarding the task, but we don't utilize it. Add virtual_task_hint that keeps info that was gathered during virtual task lookup and pass it to virtual_task's methods so the info doesn't need to be retrieved twice.	2024-11-28 11:27:16 +01:00
Botond Dénes	ccb433d767	Merge 'tasks: add api_task_ttl for tasks started with API' from Aleksandra Martyniuk When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long. Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed. Fixes: #21499. Refs: #21425. No backport needed - previous versions may use task_ttl Closes scylladb/scylladb#21505 * github.com:scylladb/scylladb: test: add test to check user_task_ttl tasks: api: move make_task method docs: nodetool: update backup and restore commands docs docs: update task manager docs nodetool: add nodetool tasks user-ttl command node_ops: use user task ttl for node ops virtual task tasks: use user_task_ttl for tasks started by user api: task_manager: add /task_manager/user_ttl to get and set user task ttl tasks: add task_manager::task::is_user_task method tasks: keep updateable_value of task_ttl in task manager db: config: add user_task_ttl_seconds named value	2024-11-27 09:57:57 +02:00
Kefu Chai	a5ee0c896b	treewide: migrate from boost::adaptors::filtered to std::views::filter Modernize the codebase by replacing Boost range adaptors with C++23 standard library views, reducing external dependencies and leveraging modern C++ language features. Key Changes: - Replace `boost::adaptors::filtered` with `std::views::filter` - Remove `#include <boost/range/adaptor/filtered.hpp>` - Utilize standard library range views Motivation: - Reduce project's external dependency footprint - Leverage standard library's range and view capabilities - Improve long-term code maintainability - Align with modern C++ best practices Implementation Challenges and Considerations: 1. Range Conversion and Move Semantics - `std::ranges::to` adaptor requires rvalue references - Necessitated updates to variable and parameter constness - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const` from `common` to enable efficient range conversion 2. Range Iteration and Mutation - Range views may mutate internal state during iteration - Cannot pass ranges by const reference in some scenarios - Solution: Pass ranges by rvalue reference to explicitly indicate state invalidation Limitations: - One instance of `boost::adaptors::filtered` temporarily preserved due to lack of a C++23 alternative for `boost::join()` - A comprehensive replacement will be addressed in a follow-up change This change is part of our ongoing effort to modernize the codebase, reducing external dependencies and adopting modern C++ practices. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21648	2024-11-26 14:26:50 +02:00
Aleksandra Martyniuk	ac6a07117a	test: add test to check user_task_ttl	2024-11-26 09:57:42 +01:00
Aleksandra Martyniuk	1712c93261	tasks: api: move make_task method task_manager::module::make_task method template is used only for test_task_impl. Move it to api/task_manager_test.cc and modify it to be test_task_impl-specific.	2024-11-26 09:57:42 +01:00
Aleksandra Martyniuk	6241d49b64	tasks: use user_task_ttl for tasks started by user	2024-11-25 14:21:53 +01:00
Aleksandra Martyniuk	292d00463a	tasks: add task_manager::task::is_user_task method	2024-11-25 14:21:53 +01:00
Aleksandra Martyniuk	16e204dfdb	tasks: keep updateable_value of task_ttl in task manager Drop task_ttl observer from task manager and use updateable_value.	2024-11-25 14:20:43 +01:00
Aleksandra Martyniuk	1bf073704c	db: config: add user_task_ttl_seconds named value Add user_task_ttl_seconds config option and keep the value in task manager. In the following patches tasks started by user will be kept in task manager for user_task_ttl_seconds after they are finished.	2024-11-25 14:16:06 +01:00
Botond Dénes	4bafaee523	Merge 'tasks: improve task_manager::lookup_virtual_task' from Aleksandra Martyniuk Currently, to find the operation with given id, all operations tracked by a virtual task are listed. This isn't necessary, since we only need info regarding one particular operation. Add a method to check whether a virtual task tracks the operation with the given id. No backport needed Closes scylladb/scylladb#20769 * github.com:scylladb/scylladb: tasks: delete virtual_task::get_ids method as it is unused tasks: improve task_manager::lookup_virtual_task	2024-11-01 13:44:04 +02:00
Pavel Emelyanov	7d8cc3ccc2	treewide,error_injection: Use inject(wait_for_message) overload Many places want to inject a handler that waits for external kick. Now there's convenience inject() method overload for this. It will result in extra messages in logs, but so far no code/test cares about it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-10-30 16:53:33 +03:00
Aleksandra Martyniuk	bc5b1f9a5d	tasks: delete virtual_task::get_ids method as it is unused	2024-10-30 12:25:47 +01:00
Aleksandra Martyniuk	9b5d69ae96	tasks: improve task_manager::lookup_virtual_task Currently, lookup_virtual_task gets the list of ids of all operations tracked by a virtual task and checks whether it contains given id. The list of all ids isn't required and the check whether one particular operation id is tracked by the virtual task may be quicker than listing all operations. Add virtual_task::contains method and use it in lookup_virtual_task.	2024-10-30 12:24:38 +01:00
Botond Dénes	31342ecb5d	Merge 'tasks: fix virtual tasks children' from Aleksandra Martyniuk Fix how regular tasks that have a virtual parent are created in task_manager::module::make_task: set sequence number of a task and subscribe to module's abort source. Fixes: #21278. Needs backport to 6.2 Closes scylladb/scylladb#21280 * github.com:scylladb/scylladb: tasks: fix sequence number assignment tasks: fix abort source subscription of virtual task's child	2024-10-28 08:59:40 +02:00

144 Commits