1080 Commits

Author SHA1 Message Date
Piotr Dulikowski
49f5fc0e70 api: introduce service levels specific API
Introduces two endpoints with operations specific to service levels:

- switch_tenants: updates the scheduling group of all connections to be
  aligned with the service level specific to the logged in user. This is
  mostly legacy API, as with service levels on raft this is done
  automatically.
- count_connections: for each user and for each scheduling group, counts
  how many connections are assigned to that user and scheduling group.
  This API is used in tests.
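
The per-user, per-scheduling-group tally that count_connections describes can be sketched as follows. This is only an illustration: the connection records, user names and scheduling group names are hypothetical, and the real endpoint's response format is not shown in this commit message.

```python
from collections import Counter

# Hypothetical connection records: one (user, scheduling_group) pair per
# open connection. The names are made up for illustration.
connections = [
    ("alice", "sl:default"),
    ("alice", "sl:default"),
    ("bob", "sl:analytics"),
    ("alice", "sl:analytics"),
]


def count_connections(conns):
    """Count how many connections fall under each (user, scheduling group) pair."""
    return Counter(conns)


counts = count_connections(connections)
for (user, group), n in sorted(counts.items()):
    print(f"{user} {group} {n}")
```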
2025-01-02 07:13:34 +01:00
Piotr Dulikowski
a65c0c3735 api/cql_server_test: add information about scheduling group
Now, information about connections' scheduling group is included in the
HTTP API for querying information about connections' parameters.
2025-01-02 07:13:34 +01:00
Avi Kivity
3ffe93b6ae Merge 'Enhance load-and-stream with "scope"' from Pavel Emelyanov
The main purpose of this change is to enhance the restore from object storage usage.

Currently, restore uses the load-and-stream facility. When triggered, the restoring task opens the provided list of sstables directory from the remote bucket and then feeds the list of sstables to load_and_stream() method. The method, in turn, iterates over this list, reads mutations and for each mutation decides where to send one by checking the replication map (it's pretty much the same for both vnodes and tablets, but for tablets that are "fully contained" by a range there's the plan to stream faster).

As described above, restore is governed by a single node and this single node reads all sstables from the object store, which can be very slow. This PR allows speeding things up. For that, the load-and-stream code is equipped with the "scope" filter which limits where mutations can be streamed to. There are four options for that -- all, dc, rack and node. The "all" is how things work currently, "dc" and "rack" filter out target nodes that don't belong to this node's dc/rack respectively. The "node" scope only streams mutations to local node.

With the "node" scope it's possible to make all nodes in the cluster load mutations that belong to them in parallel, without re-sending them to peers. The last patch in this PR is the test that shows how it can be possible.
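
A minimal sketch of how the four scopes narrow the set of streaming targets, assuming a hypothetical `Node` record and a hypothetical `filter_targets` helper (neither is taken from the sstables_loader code). Comparing the dc as well as the rack for the "rack" scope is an assumption made here because rack names may repeat across datacenters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical node description, not a ScyllaDB type.
    name: str
    dc: str
    rack: str


def filter_targets(local, replicas, scope):
    """Return the replicas that a mutation may be streamed to under the given scope."""
    if scope == "all":
        return list(replicas)
    if scope == "dc":
        return [n for n in replicas if n.dc == local.dc]
    if scope == "rack":
        # Assumption: a rack is identified by the (dc, rack) pair.
        return [n for n in replicas if n.dc == local.dc and n.rack == local.rack]
    if scope == "node":
        return [n for n in replicas if n == local]
    raise ValueError(f"unknown scope: {scope}")


local = Node("n1", "dc1", "r1")
replicas = [local, Node("n2", "dc1", "r2"), Node("n3", "dc2", "r1")]
for scope in ("all", "dc", "rack", "node"):
    print(scope, [n.name for n in filter_targets(local, replicas, scope)])
```

With the "node" scope every node keeps only the mutations it owns, which is what lets all nodes load in parallel without re-sending data to peers.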

Closes scylladb/scylladb#21169

* github.com:scylladb/scylladb:
  test: Add scope-streaming test (for restore from backup)
  api: New "scope" API param to load-and-stream calls
  sstables_loader: Propagate scope from API down
  sstables_loader: Filter tablets based on scope
  streamer: Disable scoped streaming of primary replica only
  sstables_loader: Introduce streaming scope
  sstables_loader: Wrap get_endpoints()
2024-12-25 13:52:51 +02:00
Pavel Emelyanov
5eb3278d9e api: Use built_views table in get_built_indexes API
Somehow system."IndexInfo" table and column_family/built_indexes REST
API endpoint declare an index "built" at slightly different times:

The former is a virtual table which declares an index completely built
when it appears on the system.built_views table.

The latter uses different data -- it takes the list of indexes in
the schema and eliminates indexes which are still listed in the
system.scylla_views_builds_in_progress table.

The mentioned system. tables are updated at different times, so API
notices the change a bit later. It's worth improving the consistency
of these two APIs by making the REST API endpoint piggy-back the
load_built_views() instead of load_view_build_progress(). With that
change the filtering of indexes should be negated.
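
The "negated" filtering can be illustrated with a small sketch. The function names and index names below are hypothetical, and plain Python sets stand in for the contents of the two system tables; the sketch models the window in which an index is already listed as built but its in-progress entry has not been removed yet.

```python
def built_indexes_old(schema_indexes, in_progress):
    """Old logic: an index counts as built once it is no longer listed as in progress."""
    return [i for i in schema_indexes if i not in in_progress]


def built_indexes_new(schema_indexes, built):
    """New logic: an index counts as built once it is listed among the built views."""
    return [i for i in schema_indexes if i in built]


# Hypothetical snapshot: idx_b is already recorded as built, but its
# in-progress entry is still present.
schema_indexes = ["idx_a", "idx_b"]
in_progress = {"idx_b"}
built = {"idx_a", "idx_b"}

print(built_indexes_old(schema_indexes, in_progress))
print(built_indexes_new(schema_indexes, built))
```

In this snapshot the old logic still hides idx_b, while the new logic agrees with what the virtual table would report.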

Fixes #21587

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-24 16:18:00 +03:00
Pavel Emelyanov
a24dc02255 api: New "scope" API param to load-and-stream calls
There are two of those -- the POST /storage_service/keyspace that loads
and streams new sstables from /upload and POST /storage_service/restore
that does the same, but gets sstables from object store.

The new optional parameter allows users to tune the streaming phase
behavior. The test/pylib client part is also updated here.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Pavel Emelyanov
960041d4b4 sstables_loader: Propagate scope from API down
Semi-mechanical change that adds newly introduced "scope" parameter to
all the functions between API methods and the low-level streamer object.
No real functional changes. API methods set it to "all" to keep existing
behavior.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-12-23 19:28:05 +03:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Avi Kivity
9024e4940c counters.hh: drop unused boost includes
Re-add them to source files that need them.

Closes scylladb/scylladb#21738
2024-12-05 12:27:41 +02:00
Avi Kivity
841481c202 Merge "move storage proxy and adjacent services to identify hosts by ids" from Gleb
"
This rather large patch series moves storage proxy and some adjacent
services (like migration manager) to use host ids to identify nodes rather
than ips. Messaging service gains a capability to address nodes by host
ids (which allows dropping translations from topology coordinator code
that worked on host ids already) and also makes sure that a node with
incorrect host id will reject a message (can happen during address
changes).

The series gets rid of the raft address map completely and replaces it with
the gossiper address map which is managed by the gossiper since translation
is now done in the layer below raft.

Fixes: scylladb/scylladb#6403

perf-simple-query -- smp 1 -m 1G output

Before:

enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...
64336.82 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41291 insns/op,   24485 cycles/op,        0 errors)
62669.58 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41277 insns/op,   24695 cycles/op,        0 errors)
69172.12 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   41326 insns/op,   24463 cycles/op,        0 errors)
56706.60 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41143 insns/op,   24513 cycles/op,        0 errors)
56416.65 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41186 insns/op,   24851 cycles/op,        0 errors)

         throughput: mean=61860.35 standard-deviation=5395.48 median=62669.58 median-absolute-deviation=5153.75 maximum=69172.12 minimum=56416.65
instructions_per_op: mean=41244.62 standard-deviation=76.90 median=41276.94 median-absolute-deviation=58.55 maximum=41326.19 minimum=41142.80
  cpu_cycles_per_op: mean=24601.35 standard-deviation=167.39 median=24512.64 median-absolute-deviation=116.65 maximum=24851.45 minimum=24462.70

After:

enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...
65237.35 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   40733 insns/op,   23145 cycles/op,        0 errors)
59283.09 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40624 insns/op,   23948 cycles/op,        0 errors)
70851.03 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40625 insns/op,   23027 cycles/op,        0 errors)
70549.61 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40650 insns/op,   23266 cycles/op,        0 errors)
68634.96 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40622 insns/op,   22935 cycles/op,        0 errors)

         throughput: mean=66911.21 standard-deviation=4814.60 median=68634.96 median-absolute-deviation=3638.40 maximum=70851.03 minimum=59283.09
instructions_per_op: mean=40650.89 standard-deviation=47.55 median=40624.60 median-absolute-deviation=27.11 maximum=40733.37 minimum=40622.33
  cpu_cycles_per_op: mean=23264.16 standard-deviation=402.12 median=23145.29 median-absolute-deviation=237.63 maximum=23947.96 minimum=22934.59

CI: https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/13531/
SCT (longevity-100gb-4h with nemesis_selector: ['topology_changes']): https://jenkins.scylladb.com/view/staging/job/scylla-staging/job/gleb/job/move-to-host-id/3/

Tested mixed cluster manually.
"

* 'gleb/move-to-host-id-v2' of github.com:scylladb/scylla-dev: (55 commits)
  group0: drop unused field from replace_info struct
  test: rename raft_address_map_test to address_map_test and move if from raft tests
  raft_address_map: remove raft address map
  topology coordinator: do not modify expire state for left/new nodes any more in raft address map
  topology coordinator: drop expiring entries in gossiper address map on error injections since raft one is no longer used
  group0: drop raft address map dependency from raft_rpc
  group0: move raft_ticker_type definition from raft_address_map.hh
  storage_service: do not update raft address map on gossiper events
  group0: drop raft address map dependency from raft_server_with_timeouts
  group0: move group0 upgrade code to host ids
  repair: drop raft address map dependency
  group0: remove unused raft address map getter from raft_group0
  group0: drop raft address map from group0_state_machine dependency since it is not used there any more
  group0: remove dependency on raft address map from group0_state_id_handler
  gossiper: add get_application_state_ptr that searches by host_id
  gossiper: change get_live_token_owners to return host ids
  view: move view building to host id
  hints: use host id to send hints
  storage_proxy: remove id_vector_to_addr since it is no longer used
  db: consistency_level: change is_sufficient_live_nodes to work on host ids
  ...
2024-12-03 18:18:48 +02:00
Kefu Chai
bab12e3a98 treewide: migrate from boost::adaptors::transformed to std::views::transform
now that we are allowed to use C++23. we now have the luxury of using
`std::views::transform`.

in this change, we:

- replace `boost::adaptors::transformed` with `std::views::transform`
- use `fmt::join()` when appropriate where `boost::algorithm::join()`
  is not applicable to a range view returned by `std::views::transform`.
- use `std::ranges::fold_left()` to accumulate the range returned by
  `std::views::transform`
- use `std::ranges::fold_left()` to get the maximum element in the
  range returned by `std::views::transform`
- use `std::ranges::min()` to get the minimal element in the range
  returned by `std::views::transform`
- use `std::ranges::equal()` to compare the range views returned
  by `std::views::transform`
- remove unused `#include <boost/range/adaptor/transformed.hpp>`
- use `std::ranges::subrange()` instead of `boost::make_iterator_range()`,
  to feed `std::views::transform()` a view range.

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

limitations:

there are still a couple places where we are still using
`boost::adaptors::transformed` due to the lack of a C++23 alternative
for `boost::join()` and `boost::adaptors::uniqued`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21700
2024-12-03 09:41:32 +02:00
Botond Dénes
b87fb94a5e Merge 'tasks: add tablet repair virtual task' from Aleksandra Martyniuk
Add tablet task manager module and keep it in storage_service.
Introduce tablet_virtual_task that covers tablet repair.

Thanks to a repair virtual task, a user can check the list of pending
repairs, get the status of a specific repair, or abort it using the task
manager API.

Fixes: #21368.

No backport, new feature

Closes scylladb/scylladb#21624

* github.com:scylladb/scylladb:
  test: add test to check tablet repair tasks
  test: topology_tasks: enable tablets
  service: keep tablets module in storage_service
  service: rename storage_service::_task_manager_module
  service: add tablet_virtual_task
  tasks: utilize preliminary virtual task lookup
2024-12-02 17:22:44 +02:00
Gleb Natapov
96309224ff raft_address_map: remove raft address map
It is no longer used.
2024-12-02 10:31:14 +02:00
Gleb Natapov
4ddb925997 repair: drop raft address map dependency
Replace it with gossiper address map, but make dependency localized.
Only functions that actually use address map get it now.
2024-12-02 10:31:13 +02:00
Kefu Chai
2c9c654798 build: cmake: Enforce explicit library linkage visibility
This change improves dependency management by explicitly specifying
library linkage visibility in CMake targets.

Previously, some ScyllaDB targets used `target_link_libraries()`
without `PUBLIC` or `PRIVATE` keywords, which resulted in transitive
library dependencies by default. This unintentionally exposed
non-public dependencies to downstream targets.

Changes:
- Always use explicit `PRIVATE` or `PUBLIC` keywords with
  `target_link_libraries()`
- Tighten build dependency tree
- Enforce a more modular linkage model

See: [CMake documentation on library dependencies](https://cmake.org/cmake/help/latest/command/target_link_libraries.html)

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21686
2024-11-28 18:15:23 +02:00
Aleksandra Martyniuk
898c8f4e24 tasks: utilize preliminary virtual task lookup
When API user requests status of a virtual task, we first need to find
which virtual_task instance tracks given operation. While doing this we
gather some info regarding the task, but we don't utilize it.

Add virtual_task_hint that keeps info that was gathered during virtual
task lookup and pass it to virtual_task's methods so the info doesn't
need to be retrieved twice.
2024-11-28 11:27:16 +01:00
Botond Dénes
ccb433d767 Merge 'tasks: add api_task_ttl for tasks started with API' from Aleksandra Martyniuk
When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long.

Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed.

Fixes: #21499.
Refs: #21425.

No backport needed - previous versions may use task_ttl

Closes scylladb/scylladb#21505

* github.com:scylladb/scylladb:
  test: add test to check user_task_ttl
  tasks: api: move make_task method
  docs: nodetool: update backup and restore commands docs
  docs: update task manager docs
  nodetool: add nodetool tasks user-ttl command
  node_ops: use user task ttl for node ops virtual task
  tasks: use user_task_ttl for tasks started by user
  api: task_manager: add /task_manager/user_ttl to get and set user task ttl
  tasks: add task_manager::task::is_user_task method
  tasks: keep updateable_value of task_ttl in task manager
  db: config: add user_task_ttl_seconds named value
2024-11-27 09:57:57 +02:00
Ernest Zaslavsky
793f2c95d1 snapshots: Stop taking snapshots of MVs
Stop taking snapshots of MVs and allow taking snapshots of individual tables: one can now take a snapshot of any base table, any view or any index. Also add tests to cover the new cases, both a boost test (using cc code) and a pytest (using the API).
Also, update documentation to reflect the change

fixes: #21339
fixes: #20760

Closes scylladb/scylladb#21433
2024-11-26 15:27:30 +02:00
Kefu Chai
a5ee0c896b treewide: migrate from boost::adaptors::filtered to std::views::filter
Modernize the codebase by replacing Boost range adaptors with C++23 standard library views,
reducing external dependencies and leveraging modern C++ language features.

Key Changes:
- Replace `boost::adaptors::filtered` with `std::views::filter`
- Remove `#include <boost/range/adaptor/filtered.hpp>`
- Utilize standard library range views

Motivation:
- Reduce project's external dependency footprint
- Leverage standard library's range and view capabilities
- Improve long-term code maintainability
- Align with modern C++ best practices

Implementation Challenges and Considerations:
1. Range Conversion and Move Semantics
   - `std::ranges::to` adaptor requires rvalue references
   - Necessitated updates to variable and parameter constness
   - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const`
     from `common` to enable efficient range conversion

2. Range Iteration and Mutation
   - Range views may mutate internal state during iteration
   - Cannot pass ranges by const reference in some scenarios
   - Solution: Pass ranges by rvalue reference to explicitly indicate
     state invalidation

Limitations:
- One instance of `boost::adaptors::filtered` temporarily preserved
  due to lack of a C++23 alternative for `boost::join()`
- A comprehensive replacement will be addressed in a follow-up change

This change is part of our ongoing effort to modernize the codebase,
reducing external dependencies and adopting modern C++ practices.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21648
2024-11-26 14:26:50 +02:00
Aleksandra Martyniuk
ac6a07117a test: add test to check user_task_ttl
2024-11-26 09:57:42 +01:00
Aleksandra Martyniuk
1712c93261 tasks: api: move make_task method
task_manager::module::make_task method template is used only for
test_task_impl. Move it to api/task_manager_test.cc and modify it
to be test_task_impl-specific.
2024-11-26 09:57:42 +01:00
Aleksandra Martyniuk
19a90e3697 api: task_manager: add /task_manager/user_ttl to get and set user task ttl
2024-11-25 14:21:53 +01:00
Aleksandra Martyniuk
292d00463a tasks: add task_manager::task::is_user_task method
2024-11-25 14:21:53 +01:00
Asias He
844129227e repair: Add restful API for tablet repair
It allows the user to add and delete a tablet repair request. The request is
executed by the tablet repair scheduler.
2024-11-20 09:42:41 +08:00
Pavel Emelyanov
b158ca7346 api: Remove param field from req_param
The req_param class is used to help parsing http request parameters from
strings into exact types (typically some simple types like strings,
integrals or boolean). On it there are three fields:

- name -- the parameter name
- param -- the parameter string value
- value -- the parameter value of desired type

The `param` thing is not really needed, it's only used by a few places
that print it into logs, but they may as well just print the `value`
thing itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#21502
2024-11-11 17:47:55 +02:00
Pavel Emelyanov
87ec2af6f0 api: Remove dead if-branch that collects all tables from ks
After calling api::parse_tables() the resulting vector of table names
cannot be empty, because in case parameter is missing, the parse_tables
function returns all tables from keyspace anyway.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#21501
2024-11-11 17:46:38 +02:00
Piotr Dulikowski
7021efd6b0 Merge 'main,cql_test_env: start group0_service before view_builder' from Michał Jadwiszczak
In scylladb/scylladb#19745, view_builder was migrated to group0 and since then it is dependent on group0_service.
Because of this, group0_service should be initialized/destroyed before/after view_builder.

This patch also adds error injection to `raft_server_with_timeouts::read_barrier`, which does 1s sleep before doing the read barrier. There is a new test which reproduces the use after free bug using the error injection.

Fixes scylladb/scylladb#20772

scylladb/scylladb#19745 is present in 6.2, so this fix should be backported to it.

Closes scylladb/scylladb#21471

* github.com:scylladb/scylladb:
  test/boost/secondary_index_test: add test for use after free
  api/raft: use `get_server_with_timeouts().read_barrier()` in coroutines
  main,cql_test_env: start group0_service before view_builder
2024-11-08 20:27:09 +01:00
Michał Jadwiszczak
de7b58e8d4 api/raft: use get_server_with_timeouts().read_barrier() in coroutines
It is unsafe to do `get_server_with_timeouts().read_barrier()` in
continuations because `get_server_with_timeouts()` returns
raft server by value and it may be deallocated when `read_barrier()` yields,
causing use-after-return.

Simple workaround is to use the read barrier in coroutine and co_await
it. Then the raft server is kept on stack until the read barrier is
finished.

I've checked the whole codebase and it looks like the only place where
`group0_with_timeouts().read_barrier()` is used in a continuation is
api/raft.cc.

Co-authored-by: Piotr Dulikowski <piodul@scylladb.com>
2024-11-08 14:15:13 +01:00
Avi Kivity
f5489ba4a1 locator: tablet_metadata_guard: forward declare database
No need to bring in a heavy database.hh dependency.

Closes scylladb/scylladb#21447
2024-11-07 10:24:35 +03:00
Kefu Chai
ba021f72a6 api: s/mulformatted/malformatted
mulformatted was a typo, let's fix it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21442
2024-11-07 10:07:11 +03:00
Avi Kivity
2531dc2d80 schema_registry: stop including replica/database.hh
database.hh is a hotspot that changes often (or its dependencies
do). Avoid including it to reduce recompilations.

Closes scylladb/scylladb#21407
2024-11-04 13:16:27 +01:00
Avi Kivity
704ea9d3b4 Merge 'api: Remove foreach_column_family() helper' from Pavel Emelyanov
There's a whole lot of helpers and wrappers in api/ that help handlers manipulate keyspaces and tables. One of those is foreach_column_family which calls the provided callable on a table on each shard. There's exactly the same (but a bit more flexible) helper nearby. While at it, this helper gets a better name.

Closes scylladb/scylladb#21398

* github.com:scylladb/scylladb:
  api: Rename set_tables -> for_tables_on_all_shards
  api: Remove foreach_column_family() helper
2024-11-03 15:46:27 +02:00
Pavel Emelyanov
d6169630a4 api: Rename set_tables -> for_tables_on_all_shards
The former name is not extremely descriptive, hopefully the latter one
is better in this sense.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-11-01 12:15:01 +03:00
Pavel Emelyanov
822758dffd api: Remove foreach_column_family() helper
There's a whole lot of helpers and wrappers in api/ that help handlers
manipulate keyspaces and tables. One of those is foreach_column_family
which calls the provided callable on a table on each shard. There's
exactly the same (but a bit more flexible) set_table() helper nearby.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-11-01 12:13:35 +03:00
Kefu Chai
64122b3df3 treewide: s/boost::transform/std::ranges::transform/
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::transform`.

in this change, we:

- replace `boost::transform` with `std::ranges::transform`
- update affected code to work with `std::ranges::transform`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21318
2024-11-01 08:15:14 +02:00
Avi Kivity
907da210b6 compound_compat: replace use of boost ranges with std ranges
To reduce the dependency load, replace use of boost ranges
with the std equivalent.

Files that lost the indirect boost dependency have it added as a
direct dependency.
2024-10-30 19:58:07 +02:00
Kefu Chai
54d438168a build: cmake: explicitly mark convenience libraries as STATIC
before this change, these
[convenience libraries](https://www.gnu.org/software/automake/manual/html_node/Libtool-Convenience-Libraries.html)
were implicitly built as static libraries by default,
but weren't explicitly marked as STATIC in CMake. While this worked
with default settings, it could cause issues if `BUILD_SHARED_LIBS` is
enabled.

So before we are ready for building these components as shared
libraries, let's mark all convenience libraries as STATIC for
consistency and to prevent potential issues before we properly support
shared library builds.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21274
2024-10-29 10:22:19 +01:00
Avi Kivity
49d3e281d6 Merge 'Sanitize /system/highest_supported_sstable_version API endpoint' from Pavel Emelyanov
Its handler dereferences a long chain of objects to get to the value it needs. There's a shorter way.
Also, the endpoint in question is not unregistered on stop.

Closes scylladb/scylladb#21279

* github.com:scylladb/scylladb:
  api: Make get_highest_supported_sstable_version use proper service
  api: Move system::get_highest_supported_sstable_version set/unset
  api: Scaffold for sstables-format-selector
2024-10-28 21:42:41 +02:00
Kamil Braun
101c1d50f0 Merge 'fix nodetool status to show zero-token nodes' from Abhinav Kumar Jha
In the current scenario, the nodetool status doesn’t display information regarding zero token nodes. For example, if 5 nodes are spun by the administrator, out of which, 2 nodes are zero token nodes, then nodetool status only shows information regarding the 3 non-zero token nodes.

This commit intends to fix this issue by leveraging the “/storage_service/host_id” API and adding appropriate logic in scylla-nodetool.cc to support zero token nodes.

A test is also added in nodetool/test_status.py to verify this logic. This test fails without this commit’s zero token node support logic, hence verifying the behavior.

This PR fixes a bug. Hence we need to backport it. Backporting needs to be done only
to 6.2 version, since earlier versions don't support zero token nodes.

Fixes: scylladb/scylladb#19849
Fixes: scylladb/scylladb#17857

Closes scylladb/scylladb#20909

* github.com:scylladb/scylladb:
  fix nodetool status to show zero-token nodes
  test: move `wait_for_first_completed` to pylib/util.py
  token_metadata: rename endpoint_to_host_id_map getter and add support for joining nodes
2024-10-28 12:19:36 +01:00
Pavel Emelyanov
420baf5035 api: Make get_highest_supported_sstable_version use proper service
This endpoint now grabs one via database -> table -> sstables manager
chain, but there's a shorter route, namely via the sstables format selector.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-28 10:18:57 +03:00
Pavel Emelyanov
61c8b571e5 api: Move system::get_highest_supported_sstable_version set/unset
It's currently registered with all other system endpoints and is not
unregistered. Its correct place is in the sstables-format-selector
set/unset functions.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-28 10:18:23 +03:00
Pavel Emelyanov
f090bdabbb api: Scaffold for sstables-format-selector
This "service" will have its own endpoint soon

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-10-28 10:17:38 +03:00
Avi Kivity
3124711fc4 Merge 'Report rows_merged in compaction_history rest api and nodetool' from Łukasz Paszkowski
Currently, running the `nodetool compactionhistory` command or using the rest api `curl -X GET --header "Accept: application/json" "http://localhost:10000/compaction_manager/compaction_history"` returns compaction history without the `rows_merged` field.

The series computes rows merged during compaction and provides this information to users via both the nodetool command and the rest api. The `rows_merged` field contains information on merged clustering keys across multiple sstable files. For instance, compacting two sstables of a table consisting of 7 rows where two rows are part of the both sstables, the output would have the following format: {1: 5, 2: 2}.
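
The {1: 5, 2: 2} example can be reproduced with a short sketch. It treats each sstable as a plain set of clustering keys, which is a simplification made for illustration; the `rows_merged` helper below is hypothetical and is not the histogram code added to the combined reader.

```python
from collections import Counter


def rows_merged(sstables):
    """Map "number of sstables a row appeared in" to "number of such rows"."""
    appearances = Counter()
    for sstable in sstables:
        for key in set(sstable):
            appearances[key] += 1
    histogram = Counter(appearances.values())
    return dict(sorted(histogram.items()))


# Seven distinct rows in total; rows 4 and 5 are present in both sstables.
sstable_a = {1, 2, 3, 4, 5}
sstable_b = {4, 5, 6, 7}

print(rows_merged([sstable_a, sstable_b]))
```

Five rows come from a single sstable and two rows are merged from both, which matches the format quoted above.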

No backport is required. It extends the existing compaction history output.

Fixes https://github.com/scylladb/scylladb/issues/666

Closes scylladb/scylladb#20481

* github.com:scylladb/scylladb:
  test/rest_api: Add tests for compactionhistory
  nodetool: Add rows merged stats into compactionhistory output
  compaction: Update compaction history with collected histogram
  compaction: Remove const qualifier from methods creating sstable readers
  sstable_set: Add optional statistics to make_local_shard_sstable_reader
  make_combined_reader: Add optional parameter, combined_reader_statistics
  reader_selector: Extend with maximum reader count
  mutation_fragment_merger: Create histogram while consuming mutation fragment batches
2024-10-27 21:26:11 +02:00
Abhinav
72f3c95a63 token_metadata: rename endpoint_to_host_id_map getter and add support for joining nodes
Rename host_id map getter, 'get_endpoint_to_host_id_map_for_reading' to 'get_endpoint_to_host_id_map_'
Also modify the getter to return information regarding joining nodes as well.

This getter will later be used for retrieving the nodes in nodetool status, hence it needs to show all nodes,
including joining ones.

The function name suffix `_for_reading` suggests that the function was used
in some other places in the past, and indeed if we need endpoints
"for reading" then we cannot show joining endpoints. But it was confirmed
that this function is currently only used by "/storage_service/host_id" endpoint,
hence it can be modified as required.

Fixes: scylladb/scylladb#17857
2024-10-25 13:20:27 +05:30
Łukasz Paszkowski
c01a38f3cf compaction: Update compaction history with collected histogram
A new field has been added to the compaction_stats structure to hold
collected combined reader statistics. The struct is then used to update
the compaction_history table.
2024-10-22 08:15:02 +02:00
Kefu Chai
6ead5a4696 treewide: move log.hh into utils/log.hh
the log.hh under the root of the tree was created to keep the backward
compatibility when seastar was extracted into a separate library.
so log.hh should belong to `utils` directory, as it is based solely
on seastar, and can be used by all subsystems.

in this change, we move log.hh into utils/log.hh so that it is more
modularized. and this also improves the readability, when one sees
`#include "utils/log.hh"`, it is obvious that this source file
needs the logging system, instead of its own log facility -- please
note, we do have two other `log.hh` in the tree.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-22 06:54:46 +03:00
Kefu Chai
5cd619a60c treewide: s/boost::adaptors::map_keys/std::views::keys/
now that we are allowed to use C++23. we now have the luxury of using
`std::views::keys`.

in this change, we:

- replace `boost::adaptors::map_keys` with `std::views::keys`
- update affected code to work with `std::views::keys`

to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.

this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21198
2024-10-21 12:47:52 +03:00
Avi Kivity
c3be2489ce treewide: drop includes of <boost/range/adaptors.hpp>
This includes way too much, including <boost/regex.hpp>, which is huge.
Drop includes of adaptors.hpp and replace by what is needed.

Closes scylladb/scylladb#21187
2024-10-20 17:17:11 +03:00
Botond Dénes
b6da82dba3 Merge 'build: build seastar as an external project' from Kefu Chai
before this change, scylla's CMake-based system consumes Seastar
library by including it directly. but this failed to address the needs
of linking against Seastar shared libraries in Debug and Dev builds, while
linking against the static libraries in other builds, because Seastar
uses the `BUILD_SHARED_LIBS` CMake variable to determine if it builds
shared libraries, and we cannot assign different values to this
CMake variable based on the current configuration type -- CMake does not
support this. see https://gitlab.kitware.com/cmake/cmake/-/issues/19467

in order to address this problem, we have a couple of possible
solutions:

- to enable Seastar to build both shared and static libraries in a
  single pass. without sacrificing the performance, we have to build
  all object files twice: once with -fPIC, once without. in order
  to accomplish this goal, we need to develop machinery to
  propagate the same settings to these two builds. this would
  complicate the design of Seastar's building system further.
- to build Seastar libraries twice in scylla, we could use
  the ExternalProject module to implement this. but it'd be
  complicated to extract the compile options, and link options
  previously populated by Seastar's targets with CMake --
  we would have to replicate all of them in scylla. this is
  out of the question.
- to build Seastar libraries twice before building scylla,
  and let scylla consume them using CMake config files or
  .pc files. this is a compromise. it enables scylla to
  drive the build of Seastar libraries and to consume
  the compile options and link options. the downside is:

  * the generated compilation database (compile_commands.json)
    does not include the commands building Seastar anymore.
  * the building system of scylla does not have finer-grained
    control over the building process of seastar. for instance,
    we cannot specify the build dependency to a certain seastar
    library, and just build it instead of building the whole
    seastar project.

turns out the last approach is the best one we can have
at this moment. this is also the approach used by the existing
`configure.py`.

in this change, we

- add FindSeastar.cmake to

  * detect the preconfigured Seastar builds, and
  * extract the build options from .pc files
  * expose library targets to be consumed by parent project
- add Seastar as an external project, so we can build it from
  the parent project.

  this is atypical compared to standard ExternalProject usage:
  - Seastar's build system should already be configured at this point.
  - We maintain separate project variants for each configuration type.

  Benefits of this approach:
  - Allows the parent project to consume the compile options exposed by
    the .pc file, as the compile options vary from one config to another.
  - Allows application of config-specific settings
  - Enables building Seastar within the parent project's build system
  - Facilitates linking of artifacts with the external project target,
    establishing proper dependencies between them

we will update `configure.py` to merge the compilation database
of scylla and seastar.

Refs scylladb/scylladb#2717

---

this is a CMake-related change, hence no need to backport.

Closes scylladb/scylladb#21131

* github.com:scylladb/scylladb:
  build: cmake: use GENERATOR_IS_MULTI_CONFIG property to detect mult-config
  build: cmake: consume Seastar using its .pc files
  build: do not use `mode` as the index into `modes`
  build: cmake: detect and link against GnuTLS library
  build: cmake: detect and link against yaml-cpp
  build: cmake: link Seastar with Seastar::<COMPONENT>
  build: cmake: define CMake generate helper funcs in scylla
2024-10-18 09:42:59 +03:00
Botond Dénes
6811411288 Merge 'Sanitize commitlog API endpoints' from Pavel Emelyanov
Endpoints are registered next to the service they use, and the unregistration deferred action is created right after it. When registered, the service in question is passed as an argument and then captured by the endpoints' lambdas. This makes sure that the service is not used by endpoints after being stopped.

That's not so for commitlog endpoints. These are registered in several places, and /commitlog "function" is not unregistered on stop. This patch fixes some of this misbehavior, in particular:

 -  adds unregistration of commitlog API function
 -  uses sharded<database>& argument in endpoints instead of ctx.db
 -  moves some endpoints from storage_service.cc to commitlog.cc

Closes scylladb/scylladb#21053

* github.com:scylladb/scylladb:
  api: Use captured database, not the one from ctx
  api: Pass sharded<database> to commitlog endpoints registration
  api: Move commitlog-related from storage_service.cc
  api: Unset commitlog API endpoints
  api: Extract set_server_commitlog() from set_server_done()
2024-10-18 08:56:13 +03:00
Kefu Chai
b2dc261841 build: cmake: define CMake generate helper funcs in scylla
before this change, we assume that scylla's CMake script includes
Seastar's CMake script.

but we are going to consume Seastar using its .pc files or its CMake
config files instead of including it directly. moreover, these helper
functions are not part of Seastar's public interface.

actually the same applies to the `check_headers()` helper, which was
adapted from seastar's CheckHeaders.cmake.

so to be prepared for this change, let's define these generate helper
functions in scylla.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-10-18 08:36:52 +08:00