In get_children we get the vector of alive nodes with get_nodes.
Yet, between this and sending rpc to those nodes there might be
a preemption. Currently, the liveness of a node is checked once
again before the rpcs (only with gossiper not in topology - unlike
get_nodes).
Modify get_children, so that it keeps a token_metadata_ptr,
preventing topology from changing between get_nodes and rpcs.
Remove test_get_children as it checked if the get_children method
won't fail if a node is down after get_nodes - which cannot happen
currently.
Add a service::topo::global_topology_request_virtual_task, which
covers the replication factor changes.
Currently, the global_topology_request_virtual_task can be aborted
only if it is paused.
The progress of the rf change isn't counted.
task_status contains a vector of children identities. If the number
of children is large, we may hit oversized allocation.
Change all types of children-related containers to chunked_vector.
Modify the children type returned from task manager API.
Fixes: scylladb#25795.
Closesscylladb/scylladb#25923
Determine the progress of compaction tasks that have
children.
The progress of a compaction task is calculated using the default
get_progress method. If the expected_total_workload method is
implemented, the default progress is computed as:
(sum of child task progresses) / (expected total workload)
If expected_total_workload is not defined, progress is estimated based
on children progresses. However, in this case, the total progress may
increase over time as the task executes.
All compaction tasks, except for reshape tasks, implement the
expected_children_number method. To compute expected_total_workload,
iterate over all SSTables covered by the task and sum their sizes. Note
that expected_total_workload is just an approximation and the real workload
may differ if SStables set for the keyspace/table/compaction group changes.
Reshape tasks are an exception, as their scope is determined during
execution. Hence, for these tasks expected_total_workload isn't defined
and their progress (both total and completed) is determined based
on currently created children.
Fixes: https://github.com/scylladb/scylladb/issues/8392.
Fixes: https://github.com/scylladb/scylladb/issues/6406.
Fixes: https://github.com/scylladb/scylladb/issues/7845.
New feature, no backport needed
Closesscylladb/scylladb#15158
* github.com:scylladb/scylladb:
test: add compaction task progress test
compaction: set progress unit for compaction tasks
compaction: find expected workload for reshard tasks
compaction: find expected workload for global cleanup compaction tasks
compaction: find expected workload for global major compaction tasks
compaction: find expected workload for keyspace compaction tasks
compaction: find expected workload for shard compaction tasks
compaction: find expected workload for table compaction tasks
compaction: return empty progress when compaction_size isn't set
compaction: update compaction_data::compaction_size at once
tasks: do not check expected workload for done task
Currently, make_and_start_task returns a pointer to task_manager::task
that hides the implementation details. If we need to access
the implementation (e.g. because we want a task to "return" a value),
we need to make and start task step by step openly.
Return task_manager::task::impl from make_and_start_task. Use it
where possible.
Fixes: https://github.com/scylladb/scylladb/issues/22146.
Currently, progress of compaction task executors is reported in bytes.
However, if compaction_size isn't set for compaction task executor,
the executor's progress is shown as 1/1 (if it has finished) or 0/1
(otherwise).
In the following patches, the progress of executors' parent task will
be found based on its children. Hence, to avoid mixing different progress
units, the binary progress is no longer used.
Return empty progress when compaction_size isn't set.
Drop task_manager::task::impl::get_binary_progress as it's no longer
used.
Currently, progress of a parent task depends on expected_total_workload,
expected_children_number, and children progresses. Basically, if total
workload is known or all children have already been created, progresses
of children are summed up. Otherwise binary progress is returned.
As a result, two tasks of the same type may return progress in different
units. If they are children of the same task and this parent gathers the
progress - it becomes meaningless.
Drop expected_children_number as we can't assume that children are able
to show their progresses.
Modify get_progress method - progress is calculated based on children
progresses. If expected_total_workload isn't specified, the total
progress of a task may grow. If expected_total_workload isn't specified
and no children are created, empty progress (0/0) is returned.
Fixes: https://github.com/scylladb/scylladb/issues/24650.
Closesscylladb/scylladb#25113
Parent task keeps a vector of statuses (task_essentials) of its finished
children. When the children number is large - for example because we
have many tables and a child task is created for each table - we may hit
oversize allocation while adding a new child essentials to the vector.
Keep task_essentails of children in chunked_vector.
Fixes: #25040.
Closesscylladb/scylladb#25064
Convert tasks::task_manager::task::impl::release_resources() to a coroutine
to prepare for upcoming changes that will implement asynchronous resource
release.
This is a preparatory refactoring that enables future coroutine-based
implementation of resource cleanup logic.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Keep host_id of a node in task manager. If host_id wasn't resolved
yet, task manager will keep an empty id.
It's a preparation for the following changes.
Standardize on one range library to reduce dependency load.
Unfortunately, std::views::concat (the replacement for boost::join),
is C++26 only. We use two separate inserts to the result vector to
compensate, and rationalize it by saying that boost::join() is likely
slow due to the need for type-erasure.
Closesscylladb/scylladb#21834
When API user requests status of a virtual task, we first need to find
which virtual_task instance tracks given operation. While doing this we
gather some info regarding the task, but we don't utilize it.
Add virtual_task_hint that keeps info that was gathered during virtual
task lookup and pass it to virtual_task's methods so the info doesn't
need to be retrieved twice.
When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long.
Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed.
Fixes: #21499.
Refs: #21425.
No backport needed - previous versions may use task_ttl
Closesscylladb/scylladb#21505
* github.com:scylladb/scylladb:
test: add test to check user_task_ttl
tasks: api: move make_task method
docs: nodetool: update backup and restore commands docs
docs: update task manager docs
nodetool: add nodetool tasks user-ttl command
node_ops: use user task ttl for node ops virtual task
tasks: use user_task_ttl for tasks started by user
api: task_manager: add /task_manager/user_ttl to get and set user task ttl
tasks: add task_manager::task::is_user_task method
tasks: keep updateable_value of task_ttl in task manager
db: config: add user_task_ttl_seconds named value
Modernize the codebase by replacing Boost range adaptors with C++23 standard library views,
reducing external dependencies and leveraging modern C++ language features.
Key Changes:
- Replace `boost::adaptors::filtered` with `std::views::filter`
- Remove `#include <boost/range/adaptor/filtered.hpp>`
- Utilize standard library range views
Motivation:
- Reduce project's external dependency footprint
- Leverage standard library's range and view capabilities
- Improve long-term code maintainability
- Align with modern C++ best practices
Implementation Challenges and Considerations:
1. Range Conversion and Move Semantics
- `std::ranges::to` adaptor requires rvalue references
- Necessitated updates to variable and parameter constness
- Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const`
from `common` to enable efficient range conversion
2. Range Iteration and Mutation
- Range views may mutate internal state during iteration
- Cannot pass ranges by const reference in some scenarios
- Solution: Pass ranges by rvalue reference to explicitly indicate
state invalidation
Limitations:
- One instance of `boost::adaptors::filtered` temporarily preserved
due to lack of a C++23 alternative for `boost::join()`
- A comprehensive replacement will be addressed in a follow-up change
This change is part of our ongoing effort to modernize the codebase,
reducing external dependencies and adopting modern C++ practices.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21648
task_manager::module::make_task method template is used only for
test_task_impl. Move it to api/task_manager_test.cc and modify it
to be test_task_impl-specific.
Add user_task_ttl_seconds config option and keep the value in task manager.
In the following patches tasks started by user will be kept in task
manager for user_task_ttl_seconds after they are finished.
Currently, lookup_virtual_task gets the list of ids of all operations
tracked by a virtual task and checks whether it contains given id.
The list of all ids isn't required and the check whether one particular
operation id is tracked by the virtual task may be quicker than listing
all operations.
Add virtual_task::contains method and use it in lookup_virtual_task.
the log.hh under the root of the tree was created keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used all subsystems.
in this change, we move log.hh into utils/log.hh to that it is more
modularized. and this also improves the readability, when one see
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
This includes way too much, including <boost/regex.hpp>, which is huge.
Drop includes of adaptors.hpp and replace by what is needed.
Closesscylladb/scylladb#21187
when building scylla with the standard library from GCC-14.2, shipped by
fedora 41, we have following build failure:
```
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=x86-64-v3 -mpclmul -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -MD -MT CMakeFiles/scylla-main.dir/Debug/init.cc.o -MF CMakeFiles/scylla-main.dir/Debug/init.cc.o.d -o CMakeFiles/scylla-main.dir/Debug/init.cc.o -c /home/kefu/dev/scylladb/init.cc
In file included from /home/kefu/dev/scylladb/init.cc:12:
In file included from /home/kefu/dev/scylladb/db/config.hh:20:
In file included from /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:26:
/home/kefu/dev/scylladb/locator/tablets.hh:410:30: error: unexpected type name 'size_t': expected expression
410 | return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
| ^
/home/kefu/dev/scylladb/locator/tablets.hh:410:23: error: no member named 'irange' in namespace 'boost'
410 | return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
| ~~~~~~~^
/home/kefu/dev/scylladb/locator/tablets.hh:410:38: error: left operand of comma operator has no effect [-Werror,-Wunused-value]
410 | return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
| ^
3 errors generated.
[16/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/keys.cc.o
[17/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/counters.cc.o
[18/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/partition_slice_builder.cc.o
[19/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o
FAILED: CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=x86-64-v3 -mpclmul -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -MD -MT CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o -MF CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o.d -o CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o -c /home/kefu/dev/scylladb/mutation_query.cc
In file included from /home/kefu/dev/scylladb/mutation_query.cc:12:
In file included from /home/kefu/dev/scylladb/schema/schema_registry.hh:17:
In file included from /home/kefu/dev/scylladb/replica/database.hh:11:
In file included from /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:26:
/home/kefu/dev/scylladb/locator/tablets.hh:410:30: error: unexpected type name 'size_t': expected expression
410 | return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
| ^
/home/kefu/dev/scylladb/locator/tablets.hh:410:23: error: no member named 'irange' in namespace 'boost'
410 | return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
| ~~~~~~~^
/home/kefu/dev/scylladb/locator/tablets.hh:410:38: error: left operand of comma operator has no effect [-Werror,-Wunused-value]
410 | return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
| ^
In file included from /home/kefu/dev/scylladb/mutation_query.cc:12:
In file included from /home/kefu/dev/scylladb/schema/schema_registry.hh:17:
In file included from /home/kefu/dev/scylladb/replica/database.hh:37:
In file included from /home/kefu/dev/scylladb/db/snapshot-ctl.hh:20:
/home/kefu/dev/scylladb/tasks/task_manager.hh:403:54: error: no member named 'irange' in namespace 'boost'
403 | co_await coroutine::parallel_for_each(boost::irange(0u, smp::count), [&tm, id, &res, &func] (unsigned shard) -> future<> {
| ~~~~~~~^
4 errors generated.
```
so let's take the opportunity to switch from `boost::irange` to
`std::views::iota`.
in this change, we:
- switch from boost::irange to std::views::iota for better standard library compatibility
- retain boost::irange where step parameter is used, as std::views::iota doesn't support it
- this change partially modernizes our range usage while maintaining
- existing functionality
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#20924
system_keyspace::get_topology_request_entries returns entries for
requests which are running or have finished after specified time.
In task manager node ops task set the time so that they are shown
for task_ttl seconds after they have finished.
task_manager/list_module_tasks/{module} starts supporting virtual tasks,
which means that their stats will also be shown for users.
Additional task_kind param is added to indicate whether the task is
virutal (cluster-wide) or regular (node-wide).
Support in other paths will be added in following patches.
Contrary to regular tasks, which are per-operation, virtual tasks
are associated with the whole group of operations. There may be many
operations of each group performed at the same time. Info about each
running operation will be shown to a user through the API.
For virtual tasks, task manager imitates a regular task covering
each operation, but task_manager::tasks aren't actually created
in the memory. Instead, information (e.g. status) about the operation
is retrieved from associated service and passed to a user.
To hide most of the differences from user, task_handler class is created.
Task handler performs appropriate actions depending on task's kind.
However, users need to stay conscious about the kind of task, because:
- get_task_status and wait_task do not unregister virtual tasks;
- time for which a virtual tasks stays in task manager depends
on associated service and tasks' implementation;
- number of virtual task's children shown by get_tasks doesn't have
to be monotonous.
API is modified to use task_handler.
API-specific classes are moved to task_handler.{cc,hh}.
Virtual tasks are kept in task manager together with regular tasks.
All virtual tasks are stored on shard 0.
task_manager::module::make_task is modified to consider virtual
tasks as possible parents.
A virtual task is a new kind of task supported by task manager,
which covers cluster-wide operations.
From users' perspective virtual tasks behave similarly
to task_manager::tasks. The API side of virtual tasks will be
covered in the following patches.
Contrary to task_manager::task, virtual task does not update
its fields proactively. Moreover, no object is kept in memory
for each individual virtual task's operation. Instead a service
(or services) is queried on API user's demand to learn about
the status of running operation. Hence the name.
task_manager::virtual_task is responsible for a whole group
of virtual tasks, i.e. for tracking and generating statuses
of all operations of similar type.
To enable tracking of some kind of operations, one needs to
override task_manager::virtual_task::impl and provide implementations
of the methods returning appropriate information about the operations.
task_manager::virtual_task must be kept on shard 0.
Similarly to task_manager::tasks, virtual tasks can have child tasks,
responsible for tracking suboperations' progress. But virtual tasks
cannot have parents - they are always roots in task trees.
Some methods and structs will be implemented in later patches.
this change was created in the same spirit of ebff5f5d.
despite that we include Seastar as a submodule, Seastar is not a
part of scylla project. so we'd better include its headers using
brackets.
ebff5f5d addressed this cosmetic issue a while back. but probably
clangd's header-insertion helped some of contributor to insert
the missing headers with `"`. so this style of `include` returned
to the tree with these new changes.
unfortunately, clangd does not allow us to configure the style
of `include` at the time of writing.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19406
Currently if task_manager::task::impl::abort preempts before children
are recursively aborted and then the task gets unregistered, we hit
use after free since abort uses children vector which is no
longer alive.
Modify abort method so that it goes over all tasks in task manager
and aborts those with the given parent.
Fixes: #19304.
Currently, when a child task is unregistered, it is still kept by its parent. This leads
to excessive memory usage, especially when the tasks are configured to be kept in task
manager after they are finished (task_ttl_in_seconds).
Introduce task_essentials struct which keeps only data necesarry for task manager API.
When a task which has a parent is finished, a foreign pointer to it in its parent is replaced
with respective task_essentials. Once a parent task is finished it is also folded into
its parent (if it has one). Children details of a folded task are lost, unless they
(or some of their subtrees) failed. That is, when a task is finished, we keep:
- a root task (until it is unregistered);
- task_essentials of root's direct children;
- a path (of task_essentials) from root to each failed task (so that the reason
of a failure could be examined).
when compiling with Clang-18 + libstdc++-13, the tree fails to build:
```
/home/kefu/dev/scylladb/tasks/task_manager.hh:45:36: error: no template named 'list' in namespace 'std'
45 | using foreign_task_list = std::list<foreign_task_ptr>;
| ~~~~~^
```
so let's include the used header
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16433
If std::vector is resized its iterators and references may
get invalidated. While task_manager::task::impl::_children's
iterators are avoided throughout the code, references to its
elements are being used.
Since children vector does not need random access to its
elements, change its type to std::list<foreign_task_ptr>, which
iterators and references aren't invalidated on element insertion.
Fixes: #16380.
Closesscylladb/scylladb#16381