Commit Graph

73 Commits

Author SHA1 Message Date
Michał Jadwiszczak
4e236f3392 cql3/statements/create_service_level: forbid creating SL starting with $
Tenant names starting with `$` are reserved for internal ones.
Forbid creating new service level which name starts with `$`
and log a warning for existing service levels with `$` prefix.

(cherry picked from commit d729d1b272)

Closes scylladb/scylladb#20198
2024-08-19 16:20:48 +03:00
Emil Maskovsky
0770069dda raft: use the abort source reference in raft group0 client interface
Most callers of the raft group0 client interface are passing a real
source instance, so we can use the abort source reference in the client
interface. This change makes the code simpler and more consistent.

(cherry picked from commit 2dbe9ef2f2)
2024-08-01 19:36:00 +02:00
Michał Jadwiszczak
6d655e6766 service/qos/raft_sl_dda: reject changes to service levels in recovery
mode

When a cluster goes into recovery mode and service levels were migrated
to raft, service levels become temporarily read-only.

This commit adds a proper error message in case a user tries to do any
changes.

(cherry picked from commit ee08d7fdad)
2024-05-27 18:20:36 +00:00
Michał Jadwiszczak
54b9fdab03 service/qos/raft_sl_dda: extract raft_sl_dda steps to common function
When setting/dropping a service level using raft data accessor, the same
validation steps are executed (this_shard_id = 0 and guard is present).
To not duplicate the calls in both functions, they can be extracted to a
helper function.

(cherry picked from commit 2b56158d13)
2024-05-27 18:20:36 +00:00
Pavel Emelyanov
a30337e719 service_level_controller_test: Use topology::is_me() helper
The on_leave_cluster() callback needs to check if the leaving node is
the local one. It currently compares endpoint with the my_address()
obtained via pretty long dependency chain of

  auth_service->query_processor->storage_proxy->database->token_metadata

This patch makes the whole thing _much_ shorter.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-14 15:47:12 +03:00
Pavel Emelyanov
634c066c43 service_level_controller: Add dependency on shared_token_metadata
The controller needs to access topology, so it needs the token metadata
at hand.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2024-05-14 15:43:01 +03:00
Wojciech Mitros
8472c46c8a service_level_controller: coroutinize notify_service_level_removed
To avoid conflicts arising from the discrepancy between different
versions of the repository, use coroutines instead of continuations
in service_level_controller::notify_service_level_removed().

Closes scylladb/scylladb#18525
2024-05-06 14:20:49 +03:00
Piotr Dulikowski
64ba620dc2 Merge 'hinted handoff: Use host IDs instead of IPs in the module' from Dawid Mędrek
This pull request introduces host ID in the Hinted Handoff module. Nodes are now identified by their host IDs instead of their IPs. The conversion occurs on the boundary between the module and `storage_proxy.hh`, but aside from that, IPs have been erased.

The changes take into considerations that there might still be old hints, still identified by IPs, on disk – at start-up, we map them to host IDs if it's possible so that they're not lost.

Refs scylladb/scylladb#6403
Fixes scylladb/scylladb#12278

Closes scylladb/scylladb#15567

* github.com:scylladb/scylladb:
  docs: Update Hinted Handoff documentation
  db/hints: Add endpoint_downtime_not_bigger_than()
  db/hints: Migrate hinted handoff when cluster feature is enabled
  db/hints: Handle arbitrary directories in resource manager
  db/hints: Start using hint_directory_manager
  db/hints: Enforce providing IP in get_ep_manager()
  db/hints: Introduce hint_directory_manager
  db/hints/resource_manager: Update function description
  db/hints: Coroutinize space_watchdog::scan_one_ep_dir()
  db/hints: Expose update lock of space watchdog
  db/hints: Add function for migrating hint directories to host ID
  db/hints: Take both IP and host ID when storing hints
  db/hints: Prepare initializing endpoint managers for migrating from IP to host ID
  db/hints: Migrate to locator::host_id
  db/hints: Remove noexcept in do_send_one_mutation()
  service: Add locator::host_id to on_leave_cluster
  service: Fix indentation
  db/hints: Fix indentation
2024-05-06 09:58:18 +02:00
Benny Halevy
ebff5f5d70 everywhere: include seastar headers using angle brackets
seastar is an external library therefore it should
use the system-include syntax.

Closes scylladb/scylladb#18513
2024-05-06 10:00:31 +03:00
Dawid Medrek
54ae9797b9 service: Add locator::host_id to on_leave_cluster
We extend the function
endpoint_lifecycle_subscriber::on_leave_cluster
by another argument -- locator::host_id.
It's more convenient to have a consistent
pair of IP and host ID.
2024-04-26 22:44:03 +02:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Kefu Chai
e2f3fed373 service: qos: fix a typo
s/accesor/accessor/

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18124
2024-04-03 10:33:54 +02:00
Marcin Maliszkiewicz
ff17a29b54 service: qos: create separate function for reloading data accessor
Scylla's main is already too long, it's better to contain this logic inside qos service.
2024-03-26 17:26:19 +01:00
Michał Jadwiszczak
a08918a320 main: create raft dda if sl data was migrated
Create `raft_service_levels_distributed_data_accessor` if service levels
were migrated to v2 table.
This supports raft recovery mode, as service levels will be read from v2
table in the mode.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
dab909b1d1 service:qos: store information about sl data migration
Save information whether service levels data was migrated to v2 table.
The information is stored in `system.scylla_local` table. It's
written with raft command and included in raft snapshot.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
2917ec5d51 service:qos: service levels migration
Migrate data from `system_distributes.service_levels` to
`system.service_levels_v2` during raft topology upgrade.

Migration process reads data from old table with CL ALL
and inserts the data to the new table via raft.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
159a6a2169 service:qos: fix is_v2() method 2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
fd32f5162a service:qos: add a method to upgrade data accessor 2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
d5fa0747d7 service:qos: add abort_source for group0 operations
Add mechanism to abort ongoing group0 operations while draining
service_level_controller or leaving the cluster.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
7e61bbb0d5 service:qos: raft service level distributed data accessor
`raft_service_level_distributed_data_accessor` works this way:
- on read path it reads service levels from `SYSTEM.SERVICE_LEVELS_V2`
  table with CL = LOCAL_ONE
- on write path it starts group0 operation and it makes the change
  using raft command
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
71c07addb5 service:qos: use group0_guard in data accessor
Adjust service_level_controller and
service_level_controller::service_level_distributed_data_accessor
interfaces to take `group0_guard` while adding/altering/dropping a
service level.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
da82c5f0b0 cql3:statements: run service level statements on shard0 with raft guard
To migrate service levels to be raft managed, obtain `group0_guard` to
be able to pass it to service_level_controller's methods.

Using this mechanism also automatically provides retries in case of
concurrent group0 operation.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
c0e22fcb9c service:qos: fix indentation 2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
1f3c6b2813 service:qos: coroutinize some of the methods
Functions:
- `service_level_controller::set_distributed_service_level()`
- `service_level_controller::drop_distributed_service_level()`
- `service_level_controller::drain()`

Coroutines increase readability of those functions.
2024-03-21 23:14:57 +01:00
Michał Jadwiszczak
990c5e7dd0 service:qos: extract common service levels' table functions
Getting a service level(s) will be done the same way in raft-based
service levels as it's done in standard service levels, so those
funtions are extracted to reused it.
2024-03-21 23:14:57 +01:00
Kefu Chai
6244a2ae00 service:qos: add fmt::formatter for service_level_options::workload_type
this change prepares for the fmt::formatter based formatter used by
tests, which will use {fmt} to print the elements in a container,
so we need to define the formatter using fmt::formatter for these
element. the operator<< for service_level_options::workload_type is
preserved, as the tests are still using it.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17837
2024-03-17 16:52:57 +02:00
Marcin Maliszkiewicz
9f172f1843 auth: coroutinize functions in standard_role_manager
Affected functions are: find_record, create_default_role_if_missing,
create_or_replace, drop, modify_membership, query_all, get_attribute,
set_attribute, remove_attribute
2024-03-01 16:25:14 +01:00
Avi Kivity
7cb1c10fed treewide: replace seastar::future::get0() with seastar::future::get()
get0() dates back from the days where Seastar futures carried tuples, and
get0() was a way to get the first (and usually only) element. Now
it's a distraction, and Seastar is likely to deprecate and remove it.

Replace with seastar::future::get(), which does the same thing.
2024-02-02 22:12:57 +08:00
Kefu Chai
b931d93668 treewide: fix misspellings in code comments
these misspellings are identified by codespell.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17004
2024-01-31 09:16:10 +02:00
Botond Dénes
f22fc88a64 Merge 'Configure service levels interval' from Michał Jadwiszczak
Service level controller updates itself in interval. However the interval time is hardcoded in main to 10 seconds and it leads to long sleeps in some of the tests.

This patch moves this value to `service_levels_interval_ms` command line option and sets this value to 0.5s in cql-pytest.

Closes scylladb/scylladb#16394

* github.com:scylladb/scylladb:
  test:cql-pytest: change service levels intervals in tests
  configure service levels interval
2024-01-17 12:24:49 +02:00
Kefu Chai
ece2bd2f6e service: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16764
2024-01-15 13:29:33 +02:00
Michał Jadwiszczak
f6a464ad81 configure service levels interval
So far the service levels interval, responsible for updating SL configuration,
was hardcoded in main.
Now it's extracted to `service_levels_interval_ms` option.
2024-01-12 10:28:24 +01:00
Benny Halevy
0b310c471c service_level_controller: use locator::topology rather than fb_utilities
Expose cql3::query_processor in auth::service
to get to the topology via storage_proxy.replica::database

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2023-12-05 10:17:47 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Michał Jadwiszczak
1b08338fe7 service:qos: add option to include effective names to SLO
Allow to include `slo_effective_names` in `service_level_options`
to be able to determine from which service level the specific option value comes from.
2023-11-30 13:07:20 +01:00
Pavel Emelyanov
66e43912d6 code: Switch to seastar API level 7
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).

So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command

The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields

Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)

Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile

The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13963
2023-06-06 13:29:16 +03:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, compiler generate operator!=() if the corresponding
operator==() is already defined, the language now understands
that the comparison is symmetric in the new standard.

fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generated the
operator==() if the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all memeber variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and mark the function as
`default`. moreover, if the class happen to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.

please note, because in C++20, both operator= and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of the another variant. if they were
not removed, compiler would, for instance, find ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Kefu Chai
0cb842797a treewide: do not define/capture unused variables
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-02-15 22:57:18 +02:00
Avi Kivity
afc06f0017 messaging: forward-declare types in messaging_service.hh
messaging_service.hh is a switchboard - it includes many things,
and many things include it. Therefore, changes in the things it
includes affect many translation units.

Reduce the dependencies by forward-declaring as much as possible.
This isn't pretty, but it reduces compile time and recompilations.

Other headers adjusted as needed so everything (including
`ninja dev-headers`) still compile.

Closes #10755
2022-06-09 15:52:12 +03:00
MaciekCisowski
439001b8c2 service_level_controller: fix small typo in exception message
Closes #10136
2022-02-26 22:23:26 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Solodovnikov
b958e85c54 utils: atomic_vector: rename for_each to thread_for_each
To emphasize that the function requires `seastar::thread`
context to function properly.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-01-11 09:29:12 +03:00
Piotr Sarna
4bfaa7d9fc Merge 'Service levels: fix undefined behaviours' from Eliran Sinvani
This mini series contains two fixes that are bundled together since the
second one assumes that the first one exists (or it will not fix
anything really...), the two problems were:
1. When certain operations are called on a service level controller
   which doesn't have it's data accessor set, it can lead to a crash
since some operations will still try to dereference the accessor
pointer.
2. The cql environment test initialized the accessor with a
   sharded<system_distributed_data>& however this sharded class as
itself is not initialized (sharded::start wasn't called), so for the
same that were unsafe for null dereference the accessor will now crash
for trying to access uninitialized sharded instance.

Closes #9468

* github.com:scylladb/scylla:
  CQL test environment: Fix bad initialization order
  Service Level Controller: Fix possible dereference of a null pointer
2021-10-18 08:53:53 +02:00
Eliran Sinvani
6d3e8055f9 Service Level Controller: Fix possible dereference of a null pointer
If the service level controller don't have his data accessor set,
calls for getting of distributed information might dereference this
unset pointer for the accessor. Here we add code that will return
a result as if there is no data available to the accessor (a behaviour
which is roughly equivalent to a null data accessor).

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2021-10-12 13:27:50 +03:00
Avi Kivity
f6d59c33ff service: service_level_controller: drop unused variable sl_compare
Reported by gcc 11.
2021-10-10 18:16:50 +03:00
Eliran Sinvani
c38ceafdcf Service Level Controller: Add an extention point to the API (#9374)
In order to ease future extensions to the information being sent
by the service level configuration change API, we pack the additional
parameters (other the the service level options) to the interface in a
structure. This will allow an easy expansion in the future if more
parameters needs to be sent to the observer.i

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2021-10-01 10:20:28 +03:00
Eliran Sinvani
7f44736939 Service Levels: do not notify stale service level removals
Before this commit, the service_level_controller will notify
the subscribers on stale deletes, meaning, deletes of localy
non exixtent service_levels.
The code flow shouldn't ever get to such a state, but as long
as this condition is checked instead of being asserted it is
worthwhile to change the code to be safe.

Closes #9253
2021-08-26 18:27:52 +03:00
Eliran Sinvani
47d3862b63 Service Level Controller: Add a listener API for service level config
changes

This change adds an api for registering a listener for service_level
configuration chanhes. It notifies about removal addition and change of
service level.
The hidden assumption is that some listeners are going to create and/or
manage service level specific resources and this it what guided the
time of the call to the subscriber.
Addition and change of a service level are called before the actual
change takes place, this guaranties that resource creation can take
place before the service level or new config starts to be used.
The deletion notification is called only after the deletion took place
and this guranties that the service level can't be active and the
resources created can be safely destroyed.
2021-08-16 11:38:59 +03:00
Pavel Emelyanov
c39f04fa6f code: Remove storage-service header from irrelevant places
Some .cc files over the code include the storage service
for no real need. Drop the header and include (in some)
what's really needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-22 18:50:19 +03:00
Eliran Sinvani
ccdef39d21 Service Level Controller: Stop configuration polling loop upon leaving
the cluster

This change subscribes service_level_controller for nodes life cycle
notifications and uses the notification of leaving the cluster for
the current node to stop the configuration polling loop. If the loop
continues to run it's queries will fail consistently since the nodes
will not answers to queries. It is worth mentioning that the queries
failing in the current state of code is harmles but noisy since after
90 seconsd, if the scylla process is not shut down the failures will
start to generate failure logs every 90 seconds which is confusing for
users.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2021-07-14 09:31:40 +03:00