61 Commits

Author SHA1 Message Date
Aleksandra Martyniuk
d4fdeb4839 tasks: pass token_metadata_ptr to task_manager::virtual_task::impl::get_children
In get_children we get the vector of alive nodes with get_nodes.
Yet, between this and sending rpc to those nodes there might be
a preemption. Currently, the liveness of a node is checked once
again before the rpcs (only with gossiper not in topology - unlike
get_nodes).

Modify get_children, so that it keeps a token_metadata_ptr,
preventing topology from changing between get_nodes and rpcs.

Remove test_get_children as it checked if the get_children method
won't fail if a node is down after get_nodes - which cannot happen
currently.
2026-03-18 15:37:24 +01:00
Gleb Natapov
6173ea476b node_ops: remove topology over node ops code
The code is no longer called.
2026-02-25 10:08:32 +02:00
Tomasz Grabiec
d9e1a6006f node_ops: task_manager_module: Populate entity field also for active requests 2026-01-18 15:36:06 +01:00
Tomasz Grabiec
bbd293d440 tasks: node_ops: Put node id in the entity field
If we have many node requests active at a time, it's useful to know
which requets works on which node.

Fixes #27208
2026-01-18 15:36:06 +01:00
Tomasz Grabiec
576ebcdd30 tasks, node_ops: Unify setting of task_stats in get_status() and get_stats()
They should return the same, so extract the common logic.
2026-01-18 15:36:05 +01:00
Tomasz Grabiec
7446eb7e8d tasks, topology: Make pending node operations abortable
We want to be able to cancel decommission when it's still in the
tablet draining phase. Such a request is in a pending and paused
state, and can be safely canceled. We set the node's "draining" flag
back to false.
2026-01-18 15:36:05 +01:00
Tomasz Grabiec
71e6ef90f4 tasks, system_keyspace: Introduce get_topology_request_entry_opt()
It's a cleanup. Better to return std::nullopt than faking an entry
with an id when require_entry == false.
2025-12-16 13:25:34 +01:00
Tomasz Grabiec
902803babd node_ops: Drop get_pending_ids()
Every pending request should also have an entry in
system.topology_requests so it's redundant.

And problematic, because we cannot build a full request entry from
just an id alone, so if we would return those requests, they would
have blank information, and logic which needs more information would
not work.
2025-12-16 13:25:11 +01:00
Tomasz Grabiec
4ed17c9e88 node_ops: Drop redundant get_status_helper() 2025-12-16 11:09:11 +01:00
Radosław Cybulski
d589e68642 Add precompiled headers to CMakeLists.txt
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.

New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.

The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.

The feature should be disabled, when trying to check headers - otherwise
you might get false negatives on missing includes from seastar / abseil and so on.

Note: following configuration needs to be added to ccache.conf:

    sloppiness = pch_defines,time_macros,include_file_mtime,include_file_ctime

Closes scylladb/scylladb#26617
2025-11-21 12:27:41 +02:00
Gleb Natapov
d3badf7406 storage_service: change node_ops_info::ignore_nodes to host id
It drop useless translation from id to ip during removenode through
topology coordinator.

Closes scylladb/scylladb#25958
2025-09-15 10:18:24 +02:00
Aleksandra Martyniuk
55fde70f8d api: tasks: task_manager: keep children identities in chunked_{array,vector}
task_status contains a vector of children identities. If the number
of children is large, we may hit oversized allocation.

Change all types of children-related containers to chunked_vector.
Modify the children type returned from task manager API.

Fixes: scylladb#25795.

Closes scylladb/scylladb#25923
2025-09-15 08:44:16 +03:00
Radosław Cybulski
c242234552 Revert "build: add precompiled headers to CMakeLists.txt"
This reverts commit 01bb7b629a.

Closes scylladb/scylladb#25735
2025-09-03 09:46:00 +03:00
Radosław Cybulski
01bb7b629a build: add precompiled headers to CMakeLists.txt
Add precompiled header support to CMakeLists.txt and configure.py -
it improves compilation time by approximately 10%.

New header `stdafx.hh` is added, don't include it manually -
the compiler will include it for you. The header contains includes from
external libraries used by Scylla - seastar, standard library,
linux headers and zlib.

The feature is enabled by default, use CMake option `Scylla_USE_PRECOMPILED_HEADER`
or configure.py --disable-precompiled-header to disable.

The feature should be disabled, when trying to check headers - otherwise
you might get false negatives on missing includes from seastar / abseil and so on.

Note: following configuration needs to be added to ccache.conf:

    sloppiness = pch_defines,time_macros

Closes #25182
2025-08-27 21:37:54 +03:00
Gleb Natapov
00fd427be0 topology request: make it possible to hold global request types in request_type field
topology_request table has a filed to hold a request type, but
currently it can hold only per node requests. This patch makes it
possible to store global request types there as well.
2025-06-09 13:38:49 +03:00
Aleksandra Martyniuk
53e0f79947 tasks: check whether a node is alive before rpc
Check whether a node is alive before making an rpc that gathers children
infos from the whole cluster in virtual_task::impl::get_children.
2025-04-17 12:51:22 +02:00
Kefu Chai
5be39740a8 tree: migrate from boost::find to std::ranges algorithms
Replace boost::find() calls with std::ranges::find() and std::ranges::contains()
to leverage modern C++ standard library features. This change reduces external
dependencies and modernizes the codebase.

The following changes were made:
- Replaced boost::find() with std::ranges::find() where index/iterator is needed
- Used std::ranges::contains() for simple element presence checks

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22920
2025-02-20 09:28:57 +03:00
Botond Dénes
4a7a75dfcb Merge 'tasks: use host_id in task manager' from Aleksandra Martyniuk
Use host_id in a children list of a task in task manager to indicate
a node on which the child was created.

Move TASKS_CHILDREN_REQUEST to IDL. Send it by host_id.

Fixes: https://github.com/scylladb/scylladb/issues/22284.

Ip to host_id transition; backport isn't needed.

Closes scylladb/scylladb#22487

* github.com:scylladb/scylladb:
  tasks: drop task_manager::config::broadcast_address as it's unused
  tasks: replace ip with host_id in task_identity
  api: task_manager: pass gossiper to api::set_task_manager
  tasks: keep host_id in task_manager
  tasks: move tasks_get_children to IDL
2025-02-11 11:32:27 +02:00
Avi Kivity
d4c531307d replica: database.hh: drop dependency on boost ranges
Reduces dependency load.

Closes scylladb/scylladb#22749
2025-02-10 13:29:55 +01:00
Aleksandra Martyniuk
e16b413568 tasks: replace ip with host_id in task_identity
Replace ip with host_id in task_identity. Translate host_id to ip
in task manager api handlers.

Use host_id in send_tasks_get_children.
2025-02-05 10:11:52 +01:00
Aleksandra Martyniuk
683176d3db tasks: add shard, start_time, and end_time to task_stats
task_stats contains short info about a task. To get a list of task_stats
in the module, one needs to request /task_manager/list_module_tasks/{module}.

To make identification and navigation between tasks easier, extend
task_stats to contain shard, start_time, and end_time.

Closes scylladb/scylladb#22351
2025-02-04 12:11:24 +02:00
Botond Dénes
47989b1503 Merge 'tasks: add tablet resize virtual task' from Aleksandra Martyniuk
In this change, tablet_virtual_task starts supporting tablet
resize (i.e. split and merge).

Users can see running resize tasks - finished tasks are not
presented with the task manager API.

A new task state "suspended" is added. If a resize was revoked,
it will appear to users as suspended. We assume that the resize was revoked
when the tablet number didn't change.

Fixes: #21366.
Fixes: #21367.

No backport, new feature

Closes scylladb/scylladb#21891

* github.com:scylladb/scylladb:
  test: boost: check resize_task_info in tablet_test.cc
  test: add tests to check revoked resize virtual tasks
  test: add tests to check the list of resize virtual tasks
  test: add tests to check spilt and merge virtual tasks status
  test: test_tablet_tasks: generalize functions
  replica: service: add split virtual task's children
  replica: service: pass parent info down to storage_group::split
  tasks: children of virtual tasks aren't internal by default
  tasks: initialize shard in task_info ctor
  service: extend tablet_virtual_task::abort
  service: retrun status_helper struct from tablet_virtual_task::get_status_helper
  service: extend tablet_virtual_task::wait
  tasks: add suspended task state
  service: extend tablet_virtual_task::get_status
  service: extend tablet_virtual_task::contains
  service: extend tablet_virtual_task::get_stats
  service: add service::task_manager_module::get_nodes
  tasks: add task_manager::get_nodes
  tasks: drop noexcept from module::get_nodes
  replica: service: add resize_task_info static column to system.tablets
  locator: extend tablet_task_info to cover resize tasks
2025-01-17 14:24:07 +02:00
Gleb Natapov
593308a051 node_ops, cdc: drop remaining token_metadata::get_endpoint_for_host_id() usage
Use address map to translate id to ip instead. We want to drop ips from token_metadata.
2025-01-16 16:37:07 +02:00
Aleksandra Martyniuk
3f6b932362 tasks: add task_manager::get_nodes
Move an implementation of node_ops::task_manager_module::get_nodes
to task_manager::get_nodes, so that it can be reused by other modules.
2025-01-10 10:03:07 +01:00
Aleksandra Martyniuk
5dfac9290c tasks: drop noexcept from module::get_nodes 2025-01-10 10:03:07 +01:00
Aleksandra Martyniuk
ee4bd287fd node_ops: rename a method that get node ops entries 2024-12-20 12:25:48 +01:00
Aleksandra Martyniuk
a7fc566c7e node_ops: filter topology_requests entries
Currently node_ops_virtual_task shows stats of all system.topology_request
entries. However, the table also contains info about non-node_ops requests,
e.g. truncate.

Filter the entries used by node_ops_virtual_task by their type.
With this change bootstrap of the first node will not be visible.

Update the test accordingly.
2024-12-20 12:20:42 +01:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Gleb Natapov
03c8ffa45c storage_service: move node_ops code to use host ids instead of host ips 2024-12-15 11:31:11 +02:00
Avi Kivity
ecd78c88bf Merge "move more verbs to idl" from Gleb
"
The series moves node ops, repair and streaming verbs to IDL. Also
contains IDL related cleanups.

In addition to the CI tested manually by bootstrapping a node with the
series into a cluster of old nodes with repair and streaming both in
gossiper and raft mode. This exercises repair, streaming and node_ops
paths.

"

* 'gleb/move-more-rpcs-to-idl-v3' of github.com:scylladb/scylla-dev:
  repair: repair_flush_hints_batchlog_request::target_nodes is not used any more, so mark it as such
  streaming: move streaming verbs to IDL
  messaging_service: move repair verbs to IDL
  node_ops: move node_ops_cmd to IDL
  idl: rename partition_checksum.dist.hh to repair.dist.hh
  idl: move node_ops related stuff from the repair related IDL
2024-12-12 17:19:43 +02:00
Gleb Natapov
5f6007f6ec node_ops: move node_ops_cmd to IDL 2024-12-09 14:50:52 +02:00
Kefu Chai
6a18db0aea node_ops: switch from boost::join() to std::ranges::join_view()
Replace boost::join() with std::ranges::join_view() as an interim solution
before C++26's std::views::concat becomes available. This change:

- Reduces dependencies on the Boost Ranges library
- Moves closer to standard library implementations
- Improves code maintainability and future compatibility

This is part of ongoing efforts to modernize our codebase and minimize
external dependencies.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21786
2024-12-09 13:46:44 +03:00
Kefu Chai
bab12e3a98 treewide: migrate from boost::adaptors::transformed to std::views::transform
now that we are allowed to use C++23. we now have the luxury of using
`std::views::transform`.

in this change, we:

- replace `boost::adaptors::transformed` with `std::views::transform`
- use `fmt::join()` when appropriate where `boost::algorithm::join()`
  is not applicable to a range view returned by `std::view::transform`.
- use `std::ranges::fold_left()` to accumulate the range returned by
  `std::view::transform`
- use `std::ranges::fold_left()` to get the maximum element in the
  range returned by `std::view::transform`
- use `std::ranges::min()` to get the minimal element in the range
  returned by `std::view::transform`
- use `std::ranges::equal()` to compare the range views returned
  by `std::view::transform`
- remove unused `#include <boost/range/adaptor/transformed.hpp>`
- use `std::ranges::subrange()` instead of `boost::make_iterator_range()`,
  to feed `std::views::transform()` a view range.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

limitations:

there are still a couple places where we are still using
`boost::adaptors::transformed` due to the lack of a C++23 alternative
for `boost::join()` and `boost::adaptors::uniqued`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21700
2024-12-03 09:41:32 +02:00
Aleksandra Martyniuk
898c8f4e24 tasks: utilize preliminary virtual task lookup
When API user requests status of a virtual task, we first need to find
which virtual_task instance tracks given operation. While doing this we
gather some info regarding the task, but we don't utilize it.

Add virtual_task_hint that keeps info that was gathered during virtual
task lookup and pass it to virtual_task's methods so the info doesn't
need to be retrieved twice.
2024-11-28 11:27:16 +01:00
Botond Dénes
ccb433d767 Merge 'tasks: add api_task_ttl for tasks started with API' from Aleksandra Martyniuk
When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long.

Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed.

Fixes: #21499.
Refs: #21425.

No backport needed - previous versions may use task_ttl

Closes scylladb/scylladb#21505

* github.com:scylladb/scylladb:
  test: add test to check user_task_ttl
  tasks: api: move make_task method
  docs: nodetool: update backup and restore commands docs
  docs: update task manager docs
  nodetool: add nodetool tasks user-ttl command
  node_ops: use user task ttl for node ops virtual task
  tasks: use user_task_ttl for tasks started by user
  api: task_manager: add /task_manager/user_ttl to get and set user task ttl
  tasks: add task_manager::task::is_user_task method
  tasks: keep updateable_value of task_ttl in task manager
  db: config: add user_task_ttl_seconds named value
2024-11-27 09:57:57 +02:00
Kefu Chai
a5ee0c896b treewide: migrate from boost::adaptors::filtered to std::views::filter
Modernize the codebase by replacing Boost range adaptors with C++23 standard library views,
reducing external dependencies and leveraging modern C++ language features.

Key Changes:
- Replace `boost::adaptors::filtered` with `std::views::filter`
- Remove `#include <boost/range/adaptor/filtered.hpp>`
- Utilize standard library range views

Motivation:
- Reduce project's external dependency footprint
- Leverage standard library's range and view capabilities
- Improve long-term code maintainability
- Align with modern C++ best practices

Implementation Challenges and Considerations:
1. Range Conversion and Move Semantics
   - `std::ranges::to` adaptor requires rvalue references
   - Necessitated updates to variable and parameter constness
   - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const`
     from `common` to enable efficient range conversion

2. Range Iteration and Mutation
   - Range views may mutate internal state during iteration
   - Cannot pass ranges by const reference in some scenarios
   - Solution: Pass ranges by rvalue reference to explicitly indicate
     state invalidation

Limitations:
- One instance of `boost::adaptors::filtered` temporarily preserved
  due to lack of a C++23 alternative for `boost::join()`
- A comprehensive replacement will be addressed in a follow-up change

This change is part of our ongoing effort to modernize the codebase,
reducing external dependencies and adopting modern C++ practices.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21648
2024-11-26 14:26:50 +02:00
Aleksandra Martyniuk
e703ba08f8 node_ops: use user task ttl for node ops virtual task
Use user task ttl for node ops virtual task. Modify the test accordingly.
2024-11-25 14:21:53 +01:00
Avi Kivity
f5489ba4a1 locator: tablet_metadata_guard: forward declare database
No need to bring in a heavy databas.hh dependency.

Closes scylladb/scylladb#21447
2024-11-07 10:24:35 +03:00
Aleksandra Martyniuk
bc5b1f9a5d tasks: delete virtual_task::get_ids method as it is unused 2024-10-30 12:25:47 +01:00
Aleksandra Martyniuk
9b5d69ae96 tasks: improve task_manager::lookup_virtual_task
Currently, lookup_virtual_task gets the list of ids of all operations
tracked by a virtual task and checks whether it contains given id.
The list of all ids isn't required and the check whether one particular
operation id is tracked by the virtual task may be quicker than listing
all operations.

Add virtual_task::contains method and use it in lookup_virtual_task.
2024-10-30 12:24:38 +01:00
Kefu Chai
5cd619a60c treewide: s/boost::adaptors::map_keys/std::views::keys/
now that we are allowed to use C++23. we now have the luxury of using
`std::views::keys`.

in this change, we:

- replace `boost::adaptors::map_keys` with `std::views::keys`
- update affected code to work with `std::views::keys`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21198
2024-10-21 12:47:52 +03:00
Laszlo Ersek
934b42c6a8 cmake/check_headers: correct typos
Commit efd65aebb2 ("build: cmake: add check-header target", 2023-11-13)
introduced three typos:

- In "cmake/check_headers.cmake", it checked whether the
  "parsed_args_GLOB_RECURSE" argument was defined, but then it referenced
  the same under the wrong name "parsed_args_RECURSIVE".

- The above error masked two further typos; namely the duplicate use of
  "api" and "streaming" each, as targets. With "parsed_args_GLOB_RECURSE"
  above fixed, CMake now reports these conflicting arguments (target
  names). They should have been "node_ops" and "sstables", respectively.

Correct the typos.

Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>

Closes scylladb/scylladb#20992
2024-10-08 09:38:16 +03:00
Aleksandra Martyniuk
efc7ad8547 node_ops: fix task_manager_module::get_nodes()
Currently, node ops virtual task gathers its children from all nodes contained
in a sum of service::topology::normal_nodes and service::topology::transition_nodes.
The maps may contain nodes that are down but weren't removed yet. So, if a user
requests the status of a node ops virtual task, the task's attempt to retrieve
its children list may fail with seastar::rpc::closed_error.

Filter out the tasks that are down in node_ops::task_manager_module::get_nodes.

Fixes: #20843.

Closes scylladb/scylladb#20856
2024-09-30 12:32:23 +03:00
Aleksandra Martyniuk
3195ebd04e node_ops: make node_ops tasks type more human-friendly
Currently, node ops tasks type is retrieved from topology_request
without any change. Use respective node operation name instead.

Closes scylladb/scylladb#20671
2024-09-25 08:49:34 +03:00
Patryk Jędrzejczak
ed55261650 treewide: distinguish all nodes from all token owners
In one of the following patches, we introduce support for zero-token
nodes. From that point, getting all nodes and getting all token
owners isn't equivalent. In this patch, we ensure that we consider
only token owners when we want to consider only token owners (for
example, in the replication logic), and we consider all nodes when
we want to consider all nodes (for example, in the topology logic).

The main purpose of this patch is to make the PR introducing
zero-token nodes easier to review. The patch that introduces
zero-token nodes is already complicated. We don't want trivial
changes from this patch to make noise there.

This patch introduces changes needed for zero-token nodes only in the
Raft-based topology and in the recovery mode. Zero-token nodes are
unsupported in the gossip-based topology outside recovery.

Some functions added to `token_metadata` and `topology` are
inefficient because they compute a new data structure in every call.
They are never called in the hot path, so it's not a serious problem.
Nevertheless, we should improve it somehow. Note that it's not
obvious how to do it because we don't want to make `token_metadata`
store topology-related data. Similarly, we don't want to make
`topology` store token-related data. We can think of an improvement
in a follow-up.

We don't remove unused `topology::get_datacenter_rack_nodes` and
`topology::get_datacenter_nodes`. These function can be useful in the
future. Also, `topology::_dc_nodes` is used internally in `topology`.
2024-08-29 10:37:07 +02:00
Aleksandra Martyniuk
c64cb98bcf db: node_ops: filter topology request entries
system_keyspace::get_topology_request_entries returns entries for
requests which are running or have finished after specified time.

In task manager node ops task set the time so that they are shown
for task_ttl seconds after they have finished.
2024-07-23 13:35:02 +02:00
Aleksandra Martyniuk
36b77c0592 test: add a topology suite for testing tasks
Add topology_tasks test suite for testing task manager's node ops
tasks. Add TaskManagerClient to topology_tasks for an easy usage
of task manager rest api.

Write a test for bootstrap, replace, rebuild, decommission and remove
top level tasks using the above.
2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk
a903971a74 node_ops: service: create streaming tasks
Create tasks which cover streaming part of topology changes. These
tasks are children of respective node_ops_virtual_task.
2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk
b97a348361 node_ops: implement node_ops_virtual_task methods 2024-07-23 13:35:01 +02:00
Aleksandra Martyniuk
91fbfbf98a node_ops: add task manager module and node_ops_virtual_task
Add task manager node ops module and node_ops_virtual_task.

Some methods will be implemented in later patches.
2024-07-23 13:35:01 +02:00