Commit Graph

1040 Commits

Author SHA1 Message Date
Kamil Braun
101c1d50f0 Merge 'fix nodetool status to show zero-token nodes' from Abhinav Kumar Jha
In the current scenario, the nodetool status doesn’t display information regarding zero token nodes. For example, if 5 nodes are spun by the administrator, out of which, 2 nodes are zero token nodes, then nodetool status only shows information regarding the 3 non-zero token nodes.

This commit intends to fix this issue by leveraging the “/storage_service/host_id ” API  and adding appropriate logic in scylla-nodetool.cc to support zero token nodes.

A test is also added in nodetool/test_status.py to verify this logic. This test fails without this commit’s zero token node support logic, hence verifying the behavior.

This PR fixes a bug. Hence we need to backport it. Backporting needs to be done only
to 6.2 version, since earlier versions don't support zero token nodes.

Fixes: scylladb/scylladb#19849
Fixes: scylladb/scylladb#17857

Closes scylladb/scylladb#20909

* github.com:scylladb/scylladb:
  fix nodetool status to show zero-token nodes
  test: move `wait_for_first_completed` to pylib/util.py
  token_metadata: rename endpoint_to_host_id_map getter and add support for joining nodes
2024-10-28 12:19:36 +01:00
Avi Kivity
3124711fc4 Merge 'Report rows_merged in compaction_history rest api and nodetool' from Łukasz Paszkowski
Currently, running the `nodetool compactionhistory` command or using the rest api `curl -X GET --header "Accept: application/json" "http://localhost:10000/compaction_manager/compaction_history"` return compaction history without the `row_merged` field.

The series computes rows merged during compaction and provides this information to users via both the nodetool command and the rest api. The `rows_merged` field contains information on merged clustering keys across multiple sstable files. For instance, compacting two sstables of a table consisting of 7 rows where two rows are part of the both sstables, the output would have the following format: {1: 5, 2: 2}.

No backport is required. It extends the existing compaction history output.

Fixes https://github.com/scylladb/scylladb/issues/666

Closes scylladb/scylladb#20481

* github.com:scylladb/scylladb:
  test/rest_api: Add tests for compactionhistory
  nodetool: Add rows merged stats into compactionhistory output
  compaction: Update compaction history with collected histogram
  compaction: Remove const qualifier from methods creating sstable readers
  sstable_set: Add optional statistics to make_local_shard_sstable_reader
  make_combined_reader: Add optional parameter, combined_reader_statistics
  reader_selector: Extend with maximum reader count
  mutation_fragment_merger: Create histogram while consuming mutation fragment batches
2024-10-27 21:26:11 +02:00
Abhinav
72f3c95a63 token_metadata: rename endpoint_to_host_id_map getter and add support for joining nodes
Rename host_id map getter, 'get_endpoint_to_host_id_map_for_reading' to 'get_endpoint_to_host_id_map_'
Also modify the getter to return information regarding joining nodes as well.

This getter will later be used for retrieving the nodes in nodetool status, hence it needs to show all nodes,
including joining ones.

The function name suffix `_for_reading` suggests that the function was used
in some other places in the past, and indeed if we need endpoints
"for reading" then we cannot show joining endpoints. But it was confirmed
that this function is currently only used by "/storage_service/host_id" endpoint,
hence it can be modified as required.

Fixes: scylladb/scylladb#17857
2024-10-25 13:20:27 +05:30
Łukasz Paszkowski
c01a38f3cf compaction: Update compaction history with collected histogram
A new field has been added to the compaction_stats structure to hold
collected combined reader statistics. The struct is than used to update
the compaction_history table.
2024-10-22 08:15:02 +02:00
Kefu Chai
6ead5a4696 treewide: move log.hh into utils/log.hh
the log.hh under the root of the tree was created keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used all subsystems.

in this change, we move log.hh into utils/log.hh to that it is more
modularized. and this also improves the readability, when one see
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-22 06:54:46 +03:00
Kefu Chai
5cd619a60c treewide: s/boost::adaptors::map_keys/std::views::keys/
now that we are allowed to use C++23. we now have the luxury of using
`std::views::keys`.

in this change, we:

- replace `boost::adaptors::map_keys` with `std::views::keys`
- update affected code to work with `std::views::keys`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21198
2024-10-21 12:47:52 +03:00
Avi Kivity
c3be2489ce treewide: drop includes of <boost/range/adaptors.hpp>
This includes way too much, including <boost/regex.hpp>, which is huge.
Drop includes of adaptors.hpp and replace by what is needed.

Closes scylladb/scylladb#21187
2024-10-20 17:17:11 +03:00
Botond Dénes
b6da82dba3 Merge 'build: build seastar as an external project' from Kefu Chai
before this change, scylla's CMake-based system consumes Seastar
library by including it directly. but this failed to address the needs
of linking against Seastar shared libraries in Debug and Dev builds, while
linking against the static libraries in other builds. because Seastar
uses `BUILD_SHARED_LIBS` CMake variable to determine if it builds
shared libraries. and we cannot assign different values to this
CMake variable based on current configure type -- CMake does not
support. see https://gitlab.kitware.com/cmake/cmake/-/issues/19467

in order to address this problem, we have a couple possible
solutions:

- to enable Seastar to build both shared and static libraries in a
  pass. without sacrificing the performance, we have to build
  all object files twice: once with -fPIC, once without. in order
  to accompolish this goal, we need to develop a machinary to
  populate the same settings to these two builds. this would
  complicate the design of Seastar's building system further.
- to build Seastar libraries twice in scylla, we could use
  the ExternalProject module to implement this. but it'd be
  complicate to extract the compile options, and link options
  previously populated by Seastar's targets with CMake --
  we would have to replicate all of them in scylla. this is
  out of the question.
- to build Seastar libraries twice before building scylla,
  and let scylla to consume them using CMake config files or
  .pc files. this is a compromise. it enables scylla to
  drive the build of Seastar libraries and to consume
  the compile options and link options. the downside is:

  * the generated compilation database (compile_commands.json)
    does not include the commands building Seastar anymore.
  * the building system of scylla does not have finer graind
    control on the building process of seastar. for instance,
    we cannot specify the build dependency to a certain seastar
    library, and just build it instead of building the whole
    seastar project.

turns out the last approach is the best one we can have
at this moment. this is also the approach used by the existing
`configure.py`.

in this change, we

- add FindSeastar.cmake to

  * detect the preconfigured Seastar builds, and
  * extract the build options from .pc files
  * expose library targets to be consumed by parent project
- add Seastar as an external project, so we can build it from
  the parent project.

  this is atypical compared to standard ExternalProject usage:
  - Seastar's build system should already be configured at this point.
  - We maintain separate project variants for each configuration type.

  Benefits of this approach:
  - Allows the parent project to consume the compile options exposed by
    .pc file. as the compile options vary from one config to another.
  - Allows application of config-specific settings
  - Enables building Seastar within the parent project's build system
  - Facilitates linking of artifacts with the external project target,
    establishing proper dependencies between them

we will update `configure.py` to merge the compilation database
of scylla and seastar.

Refs scylladb/scylladb#2717

---

this is a CMake-related change, hence no need to backport.

Closes scylladb/scylladb#21131

* github.com:scylladb/scylladb:
  build: cmake: use GENERATOR_IS_MULTI_CONFIG property to detect mult-config
  build: cmake: consume Seastar using its .pc files
  build: do not use `mode` as the index into `modes`
  build: cmake: detect and link against GnuTLS library
  build: cmake: detect and link against yaml-cpp
  build: cmake: link Seastar with Seastar::<COMPONENT>
  build: cmake: define CMake generate helper funcs in scylla
2024-10-18 09:42:59 +03:00
Botond Dénes
6811411288 Merge 'Sanitize commitlog API endpoints' from Pavel Emelyanov
Endpoints are registered next to the service they use, and the unregistration deferred action is created right after it. When registered, the service in question is passed as argument and then captured by enpoints lambdas. This makes sure that service is not used by endpoints after being stopped.

That's not so for commitlog endpoints. These are registered in several places, and /commitlog "function" is not unregistered on stop. This patch fixes some of this misbehavior, in particular:

 -  adds unregistration of commitlog API function
 -  uses sharded<database>& argument in endpoints instead of ctx.db
 -  moves some endpoints from storage_service.cc to commitlog.cc

Closes scylladb/scylladb#21053

* github.com:scylladb/scylladb:
  api: Use captured database, not the one from ctx
  api: Pass sharded<database> to commitlog endpoints registration
  api: Move commitlog-related from storage_service.cc
  api: Unset commitlog API endpoints
  api: Extract set_server_commitlog() from set_server_done()
2024-10-18 08:56:13 +03:00
Kefu Chai
b2dc261841 build: cmake: define CMake generate helper funcs in scylla
before this change, we assume that scylla's CMake script includes
Seastar's CMake script.

but we are going to consume Seastar using its .pc files or its CMake
config files instead of including it directly. more over these helper
functions are not part of Seastar's public interface.

actually the same applies to the `check_headers()` helper, which was
adapted from seastar's CheckHeaders.cmake.

so to be prepared for this change, let's define these generate helper
functions in scylla.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-18 08:36:52 +08:00
Botond Dénes
f93abebbb9 Merge 'Sanitize compaction manager API endpoints' from Pavel Emelyanov
Endpoints are registered next to the service they use, and the unregistration deferred action is created right after it. When registered, the service in question is passed as argument and then captured by enpoints lambdas. This makes sure that service is not used by endpoints after being stopped.

That's not quite the case for compaction manager. Its endpoints can be registered in several places, and compaction_manager "function" is not unregistered on stop. This patch fixes some of this misbehavior, in particular:
- adds unregistration of compaction_manager API function
- uses sharded<compaction_manager>& argument in endpoints instead of ctx.db.local().get_compaction_manager() chain
- moves some endpoints from storage_service.cc to compaction_manager.cc

Closes scylladb/scylladb#20962

* github.com:scylladb/scylladb:
  api: Use captured compaction_manager in get_cm_stats() helper
  api: Use captured compaction_manager in endpoints
  api: Add sharded<compaction_manager> argument to compaction_manager API reg/unreg
  api: Move some endpoints from storage_service.cc to compaction_manager.cc
  api: Unset compaction_manager endpoints
  api: Use shorter registration method for compaction_manager function
2024-10-15 10:20:58 +03:00
Pavel Emelyanov
551da72492 api: Use captured database, not the one from ctx
Continuation of the previous patch -- not commitlog-related endpoints
can use provided database reference, that was captured from main.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-10 17:53:30 +03:00
Pavel Emelyanov
74f7071db8 api: Pass sharded<database> to commitlog endpoints registration
This is to make registered enpoints with with the database without
grabbing one from ctx.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-10 17:52:56 +03:00
Pavel Emelyanov
14ab6d2615 api: Move commitlog-related from storage_service.cc
It registers itself in /storage_service function, but works with
commitlog, so should be located next to commitlog endpoints.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-10 17:52:25 +03:00
Pavel Emelyanov
ba73704774 api: Unset commitlog API endpoints
Most of other set_...()-s has the unset_...() scheduled right
afterwards, so here's one for set_server_commitlog().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-10 17:50:37 +03:00
Pavel Emelyanov
44ec6d36f3 api: Extract set_server_commitlog() from set_server_done()
The latter collects a bunch of endpoints including commitlog ones.
Extract it as snandalone call in main. It's currently not located next
to "commitlog server" as it should, because there's no standalone
commitlog service in main. It will be addressed as a followup together
with other endpoints that work with sharded<database>.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-10 17:49:10 +03:00
Kefu Chai
cd05f61607 api/storage_service: use ranges when handlging restore API
this change is a follow up of 787ea4b1, to modernize the code base.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20972
2024-10-07 10:54:37 +03:00
Pavel Emelyanov
6b480589fe Merge 'treewide: accept list of sstables in "restore" API ' from Kefu Chai
before this change, we enumerate the sstables tracked by the
system.sstables table, and restore them when serving
requests to "storage_service/restore" API. this works fine with
"storage_service/backup" API. but this "restore" API cannot be
used as a drop-in replacement of the rclone based API currently
used by scylla-manager.

in order to fill the gap, in this change:

* add the "prefix" parameter for specifying the shared prefix of
  sstables
* add the "sstables" parameter for specifying the list of  TOC
  components of sstables
* remove the "snapshot" parameter, as we don't encode the prefix
  on scylla's end anymore.
* make the "table" parameter mandatory.

Fixes https://github.com/scylladb/scylladb/issues/20461

----

this change is a part of the efforts to bring the native backup/restore to scylla, no need to backprt.

Closes scylladb/scylladb#20685

* github.com:scylladb/scylladb:
  treewide: accept list of sstables in "restore" API
  sstable: pass get_storage_option to sstable_directory::load_sstable()
  test/nodetool: add body parameter to `expected_request`
  tools/scylla-nodetool: enable nodetool to write HTTP body
2024-10-04 12:38:08 +03:00
Pavel Emelyanov
0f6e76f92f api: Use captured compaction_manager in get_cm_stats() helper
This is continuation of the previous patch that also need to touch the
helper function argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-04 12:28:15 +03:00
Pavel Emelyanov
f99d8e07ae api: Use captured compaction_manager in endpoints
Instead of getting via ctx -> database chain.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-04 12:28:15 +03:00
Pavel Emelyanov
05b4a8e710 api: Add sharded<compaction_manager> argument to compaction_manager API reg/unreg
To be used by next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-04 12:28:15 +03:00
Pavel Emelyanov
58c4c21581 api: Move some endpoints from storage_service.cc to compaction_manager.cc
Those setting and getting bandiwdth need compaction manager to work with
and thus should sit next to other enpoints working with it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-04 11:36:07 +03:00
Pavel Emelyanov
43fe482204 api: Unset compaction_manager endpoints
Similarly to other .cc files, compaction manager should have its
endpoints unset. For now, no batch unsetting exists, so need to do it
one-by-one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-04 11:35:19 +03:00
Pavel Emelyanov
aa13be15b0 api: Use shorter registration method for compaction_manager function
The register_api() helper does exatly what's needed here -- registers
function and calls a method to set routes.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-04 11:34:33 +03:00
Kamil Braun
e67016540c Merge 'Node replace and remove operations: Add deprecate IP addresses usage warning.' from Sergey Zolotukhin
- As part of deprecation of IP address usage, warning messages were added when IP addresses specified in the `ignore-dead-nodes` and `--ignore-dead-nodes-for-replace` options for scylla and nodetool.
- Slight optimizations for `utils::split_comma_separated_list`, ` host_id_or_endpoint lists` and `storage_service` remove node operations, replacing `std::list` usage with `std::vector`.

Fixes scylladb/scylladb#19218

Backport: 6.2 as it's not yet released.

Closes scylladb/scylladb#20756

* github.com:scylladb/scylladb:
  config: Add a warning about use of IP address for join topology and replace operations.
  nodetool: Add IP address usage warning for 'ignore-dead-nodes'.
  tests: Fix incorrect UUIDs in test_nodeops
  utils: Optimizations for utils::split_comma_separated_list and usage of host_id_or_endpoint lists
2024-10-03 11:08:28 +02:00
Kefu Chai
f9091066b7 treewide: replace boost::irange with std::views::iota where possible
when building scylla with the standard library from GCC-14.2, shipped by
fedora 41, we have following build failure:

```
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=x86-64-v3 -mpclmul -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -MD -MT CMakeFiles/scylla-main.dir/Debug/init.cc.o -MF CMakeFiles/scylla-main.dir/Debug/init.cc.o.d -o CMakeFiles/scylla-main.dir/Debug/init.cc.o -c /home/kefu/dev/scylladb/init.cc
In file included from /home/kefu/dev/scylladb/init.cc:12:
In file included from /home/kefu/dev/scylladb/db/config.hh:20:
In file included from /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:26:
/home/kefu/dev/scylladb/locator/tablets.hh:410:30: error: unexpected type name 'size_t': expected expression
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                              ^
/home/kefu/dev/scylladb/locator/tablets.hh:410:23: error: no member named 'irange' in namespace 'boost'
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                ~~~~~~~^
/home/kefu/dev/scylladb/locator/tablets.hh:410:38: error: left operand of comma operator has no effect [-Werror,-Wunused-value]
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                                      ^
3 errors generated.
[16/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/keys.cc.o
[17/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/counters.cc.o
[18/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/partition_slice_builder.cc.o
[19/782] Building CXX object CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o
FAILED: CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o
/home/kefu/.local/bin/clang++ -DDEBUG -DDEBUG_LSA_SANITIZER -DFMT_SHARED -DSANITIZE -DSCYLLA_BUILD_MODE=debug -DSCYLLA_ENABLE_ERROR_INJECTION -DSEASTAR_API_LEVEL=7 -DSEASTAR_DEBUG -DSEASTAR_DEBUG_PROMISE -DSEASTAR_DEBUG_SHARED_PTR -DSEASTAR_DEFAULT_ALLOCATOR -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SHUFFLE_TASK_QUEUE -DSEASTAR_SSTRING -DSEASTAR_TYPE_ERASE_MORE -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"Debug\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -isystem /home/kefu/dev/scylladb/abseil -g -Og -g -gz -std=gnu++23 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb/build=. -march=x86-64-v3 -mpclmul -Xclang -fexperimental-assignment-tracking=disabled -Werror=unused-result -fstack-clash-protection -fsanitize=address -fsanitize=undefined -MD -MT CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o -MF CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o.d -o CMakeFiles/scylla-main.dir/Debug/mutation_query.cc.o -c /home/kefu/dev/scylladb/mutation_query.cc
In file included from /home/kefu/dev/scylladb/mutation_query.cc:12:
In file included from /home/kefu/dev/scylladb/schema/schema_registry.hh:17:
In file included from /home/kefu/dev/scylladb/replica/database.hh:11:
In file included from /home/kefu/dev/scylladb/locator/abstract_replication_strategy.hh:26:
/home/kefu/dev/scylladb/locator/tablets.hh:410:30: error: unexpected type name 'size_t': expected expression
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                              ^
/home/kefu/dev/scylladb/locator/tablets.hh:410:23: error: no member named 'irange' in namespace 'boost'
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                ~~~~~~~^
/home/kefu/dev/scylladb/locator/tablets.hh:410:38: error: left operand of comma operator has no effect [-Werror,-Wunused-value]
  410 |         return boost::irange<size_t>(0, tablet_count()) | boost::adaptors::transformed([] (size_t i) {
      |                                      ^
In file included from /home/kefu/dev/scylladb/mutation_query.cc:12:
In file included from /home/kefu/dev/scylladb/schema/schema_registry.hh:17:
In file included from /home/kefu/dev/scylladb/replica/database.hh:37:
In file included from /home/kefu/dev/scylladb/db/snapshot-ctl.hh:20:
/home/kefu/dev/scylladb/tasks/task_manager.hh:403:54: error: no member named 'irange' in namespace 'boost'
  403 |         co_await coroutine::parallel_for_each(boost::irange(0u, smp::count), [&tm, id, &res, &func] (unsigned shard) -> future<> {
      |                                               ~~~~~~~^
4 errors generated.
```

so let's take the opportunity to switch from `boost::irange` to
`std::views::iota`.

in this change, we:

- switch from boost::irange to std::views::iota for better standard library compatibility
- retain boost::irange where step parameter is used, as std::views::iota doesn't support it
- this change partially modernizes our range usage while maintaining
- existing functionality

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#20924
2024-10-03 10:33:33 +03:00
Sergey Zolotukhin
3b9033423d utils: Optimizations for utils::split_comma_separated_list and usage of host_id_or_endpoint lists
- utils::split_comma_separated_list now accepts a reference to sstring instead
  of a copy to avoid extra memory allocations. Additionally, the results of
  trimming are moved to the resulting vector instead of being copied.
- service/storage_service removenode, raft_removenode, find_raft_nodes_from_hoeps,
  parse_node_list and api/storage_service::set_storage_service were changed to use
  std::vector<host_id_or_endpoint> instead of std::list<host_id_or_endpoint> as
  std::vector is a more cache-friendly structure,  resulting in better performance.
2024-10-02 11:56:59 +02:00
Kefu Chai
787ea4b1d4 treewide: accept list of sstables in "restore" API
before this change, we enumerate the sstables tracked by the
system.sstables table, and restore them when serving
requests to "storage_service/restore" API. this works fine with
"storage_service/backup" API. but this "restore" API cannot be
used as a drop-in replacement of the rclone based API currently
used by scylla-manager.

in order to fill the gap, in this change:

* add the "prefix" parameter for specifying the shared prefix of
  sstables
* add the "sstables" parameter for specifying the list of  TOC
  components of sstables
* remove the "snapshot" parameter, as we don't encode the prefix
  on scylla's end anymore.
* make the "table" parameter mandatory.

Fixes scylladb/scylladb#20461
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-01 23:24:56 +08:00
Gleb Natapov' via ScyllaDB development
22368b13f2 api: introduce raft stepdown REST API
Also provide test.py util function to trigger it. Can be useful for
testing.
2024-10-01 12:18:49 +02:00
Kefu Chai
d663b6c13b treewide: add "table" parameter to "backup" API
with this parameter, "backup" API can backup the given table, this
enables it to be a drop-in replacement of existing rclone API used by
scylla manager.

in this change:

* api/storage_service: add "table" parameter to "backup" API.
* snapshot_ctl: compose the full path of the snapshot directory in
  `snapshot_ctl::start_backup`. since we have all the information
  for composing the snapshot directory, and what the `backup_task_impl`
  class is interested is but the snapshot directory, we just pass
  the path to it instead the individual components of the directory.
* backup_task_impl: instead of scan the whole keyspace recursively,
  only scan the specified snapshot directory.

Fixes scylladb/scylladb#20636
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-25 09:11:26 +08:00
Ernest Zaslavsky
924325fd25 treewide: add "prefix" parameter to backup API
Allow the caller to pass the prefix when performing backup and restore

Fixes scylladb/scylladb#20335

Closes scylladb/scylladb#20413
2024-09-18 08:25:00 +03:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Lakshmi Narayanan Sreethar
84d06a13c7 api: compaction: add consider_only_existing_data option
Added a new parameter `consider_only_existing_data` to major compaction
API endpoints. When enabled, major compaction will:

- Force-flush all tables.
- Force a new active segment in the commit log.
- Compact all existing SSTables and garbage-collect tombstones by only
  checking the SSTables being compacted. Memtables, commit logs, and
  other SSTables not part of the compaction will not be checked, as they
  will only contain newer data that arrived after the compaction
  started.

The `consider_only_existing_data` is passed down to the compaction
descriptor's `gc_check_only_compacting_sstables` option to ensure that
only the existing data is considered for garbage collection.

The option is also passed to the `maybe_flush_commitlog` method to make
sure all the tables are flushed and a new active segment is created in
the commit log.

Fixes #19728

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2024-09-05 17:25:45 +05:30
Kamil Braun
a4d1065628 api: move reload_raft_topology_state implementation inside storage_service
In later commit we'll want to access more `storage_service` internals
in the API's implementation (namely, `_abort_source`)

Also moving the implementation there allows making
`service::topology_transition()` private again (it was made public in
992f1327d3 only for this API
implementation)
2024-09-03 15:52:03 +02:00
Botond Dénes
9f9346fc59 Merge 'nodetool: tasks: add nodetool commands to track task manager tasks' from Aleksandra Martyniuk
Add nodetool commands to manage task manager tasks:
- tasks abort - aborts the task
- tasks list - lists all tasks in the module
- tasks modules - lists all modules
- tasks set-ttl - sets task ttl
- tasks status - gets status of the task
- tasks tree - gets statuses of the task and all its desendent's
- tasks ttl - gets task ttl
- tasks wait - waits for the task and gets its status

Fixes: https://github.com/scylladb/scylladb/issues/19201.

Closes scylladb/scylladb#19614

* github.com:scylladb/scylladb:
  test: nodetool: add tests for tasks commands
  nodetool: tasks: add nodetool commands to track task manager tasks
  api: task_manager: return status 403 if a task is not abortable
  api: task_manager: return none instead of empty task id
  api: task_manager: add timeout to wait_task
  api: task_manager: add operation to get ttl
  nodetool: add suboperations support
  nodetool: change operations_with_func type
  nodetool: prepare operation related classes for suboperations
2024-08-30 07:37:37 +03:00
Aleksandra Martyniuk
627fc46ca7 api: task_manager: return status 403 if a task is not abortable 2024-08-29 13:53:40 +02:00
Aleksandra Martyniuk
10ab60f32b api: task_manager: return none instead of empty task id
If a user requests a status of a task that does not have a parent,
show "none" instead of an empty parent_id.
2024-08-29 13:53:40 +02:00
Aleksandra Martyniuk
5bcff4d544 api: task_manager: add timeout to wait_task 2024-08-29 13:53:40 +02:00
Aleksandra Martyniuk
3d78172328 api: task_manager: add operation to get ttl 2024-08-29 13:53:39 +02:00
Pavel Emelyanov
11a04bfb66 code: Introduce restore API method
The method starts a task that uses sstables_loader load-and-stream
functionality to bring new sstables into the cluster. The existing
load-and-stream picks up sstables from upload/ directory, the newly
introduced task collects them from S3 bucket and given prefix (that
correspond to the path where backup API method put them).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-28 15:42:49 +03:00
Avi Kivity
72a85e3812 Merge 'Integrated backup' from Pavel Emelyanov
This adds minimal implementation of the start-backup API call.

The method starts a task that uploads all files from the given keyspace's snapshot to the requested endpoint/bucket. Arguments are:
- endpoint -- the ID in object_store.yaml config file
- bucket -- the target bucket to put objects into
- keyspace -- the keyspace to work on
- snapshot -- the method assumes that the snapshot had been already taken and only copies sstables from it

The task runs in the background, its task_id is returned from the method once it's spawned and it should be used via /task_manager API to track the task execution and completion (hint: it's good to have non-zero TTL value to make sure fast backups don't finish before the caller manages to call wait_task API).

Sstables components are scanned for all tables in the keyspace and are uploaded into the /bucket/${cf_name}/${snapshot_name}/ path.

refs: #18391

Closes scylladb/scylladb#19890

* github.com:scylladb/scylladb:
  tools/scylla-nodetool: add backup integration
  docs: Document the new backup method
  test/object_store: Test that backup task is abortable
  test/object_store: Add simple backup test
  test/object_store: Move format_tuples()
  test/pylib: Add more methods to rest client
  backup-task: Make it abortable (almost)
  code: Introduce backup API method
  database: Export parse_table_directory_name() helper
  database: Introduce format_table_directory_name() helper
  snapshot-ctl: Add config to snapshot_ctl
  snapshot-ctl: Add sstables::storage_manager dependency
  snapshot-ctl: Maintain task manager module
  snapshot-ctl: Add "snapshots" logger
  snapshot-ctl: Outline stop() method and constructor
  snapshot-ctl: Inline run_snapshot_list<>
  test/cql_test_env: Export task manager from cql test env
  task_manager: Print task ttl on start (for debugging)
  docs: Update object_storage.md with AWS_ environment
  docs: Restructure object_storage.md
2024-08-25 20:19:10 +03:00
Pavel Emelyanov
d1ac58f088 api: Get compaction througput via compaction manager
Now the endpoint hanler gets the value from db::config which is not nice
from several perspectives. First, it gets config (ab)using database.
Second, it's compaction manager that "knows" its throughput, global
config is the initial source of that information.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#20173
2024-08-23 10:33:03 +03:00
Pavel Emelyanov
a812f13ddd code: Introduce backup API method
The method starts a task that uploads all files from the given
keyspace's snapshot to the requested endpoint/bucket. The task runs in
the background, its task_id is returned from the method once it's
spawned and it should be used via /task_manager API to track the task
execution and completion (hint: it's good to have non-zero TTL value to
make sure fast backups don't finish before the caller manages to call
wait_task API).

If snapshot doesn't exist, nothing happens (FIXME, need to return back
an error in that case).

If endpoint is not configured locally, the API call resolves with
bad-request instantly.

Sstables components are scanned for all tables in the keyspace and are
uploaded into the /bucket/${cf_name}/${snapshot_name}/ path.

Task is not abortable (FIXME -- to be added) and doesn't really report
its progress other than running/done state (FIXME -- to be added too).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-08-22 19:47:06 +03:00
Benny Halevy
0419b1d522 nodetool: rebuild: add force option
To be used to force usage of source_dc, even
when it is unsafe for rebuild.

Update docs and add test/nodetool/test_rebuild.py

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-19 17:20:12 +03:00
Benny Halevy
8b1877f3ca Add and use utils::optional_param to pass source_dc
Clearly indicate if a source_dc is provided,
and if so, was it explicitly given by the user,
or was implicitly selected by scylla.

This will become useful in the next patches
that will use that to either reject the operation
if it's unsafe to use the source_dc and the dc was
explicitly given by the user, or whether
to fallback to using all nodes otherwise.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2024-08-19 17:13:54 +03:00
Michał Jadwiszczak
870bdaa6b1 api/cql_server_test: add CQL server testing API
Add a CQL server testing API with and endpoint to dump
service level parameters of all CQL connections.

This endpoint will be later used to test functionality of
automated updating CQL connections parameters.
2024-08-08 10:42:09 +02:00
Pavel Emelyanov
2fd60b0adc api: Move config-related endpoints from storage_service.cc
The get_all_data_file_locations and get_saved_caches_location get the
returned data from db::config and should be next other endpoints working
with config data.

refs: #2737

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#19958
2024-08-07 10:18:29 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Pavel Emelyanov
456dbc122b api: Unset cache_service endpoints on stop
They currently stay registered long after the dependent services get
stopped. There's a need for batch unsetting (scylladb/seastar#1620), so
currently only this explicit listing :(

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-07-24 18:51:32 +03:00
Pavel Emelyanov
6ae09cc6bf api: Register token-metadata API next to token-metadata itsels
Right now API registration happens quite late because it waits storage
service to register its "function" first. This can be done beforeheand
and the t.m. API can be moved to where it should be.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-07-24 18:51:32 +03:00