before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.
that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:
```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
265 | return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
| ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
4290 | format(format_string<_Args...> __fmt, _Args&&... __args)
| ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
143 | format(fmt::format_string<A...> fmt, A&&... a) {
| ^
```
in this change, we
change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
`seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
because, `sstring::operator std::basic_string` would incur a deep
copy.
we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Add a default constructor and a constructor which explicitly
initializes all fields of the service_level structure.
This is done in order to make sure that removal of the
marked_for_deletion field can be done safely - otherwise, for example,
service_level could be aggregate-initialized with an incomplete list of
values for the fields, and removing marked_for_deletion which is in the
middle of the struct would cause the is_static field to be initialized
with the value that was designated for marked_for_deletion.
As a bonus, make sure that marked_for_deletion and is_static bool fields
are initialized in the default constructor to false in order to avoid
potential undefined behavior.
Tenant names starting with `$` are reserved for internal ones.
Forbid creating new service level which name starts with `$`
and log a warning for existing service levels with `$` prefix.
Closesscylladb/scylladb#20122
Updates to `system.role_members` and `system.role_attributes` affect
effective service levels cache, so applying mutations to those tables
should reload the effective SL cache.
Add a second layer of service_level_controller cache which contains
role name -> effective service level mapping.
To build the mapping, controller uses first cache layer (service level
name -> service level) and 2 queries to auth tables (one to `roles` and
one to `role_members`).
Write down definitions of `service level` and `effective service level`
in service/qos/service_level_controller.hh.
Until now, effective service level was only used as result of
`LIST EFFECTIVE SERVICE LEVEL OF <role>`.
Now we want to have quick access to effective service level of
each role and introduce cache of effective sl to do it.
New definitions clarify things.
The commit also renames:
- `update_service_levels_from_distributed_data` -> `update_service_levels_cache`
Later we will introduce effective_service_level_cache, so this change
standarizes the names.
- `find_service_level` -> `find_effective_service_level`
The function actualy returns effective service level.
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.
Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.
To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.
[1] 66ef711d68Closesscylladb/scylladb#20006
Most callers of the raft group0 client interface are passing a real
source instance, so we can use the abort source reference in the client
interface. This change makes the code simpler and more consistent.
We want to call `service_level_controller::do_abort()` in all cases.
The current code (introduced in
535e5f4ae7)
calls do_abort if abort was not requested, however, since
it does so by checking the subscription bool operator,
it would miss the case where abort was already requested
before the subscription took place (in service_level_controller
ctor).
With scylladb/seastar@470b539b1c and
scylladb/seastar@8ecce18c51
we can just unconditionally call the subscription `on_abort`
method, that ensures only-once semantics, even if abort
was already requested at subscription time.
Fixesscylladb/scylladb#19075
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#19929
Before this, the notification semaphore was broken() in do_abort(),
which was triggered by early abort source.
However we are going to reload sl cache on topology state reload
and it can happen after the early abort source is triggered, so
it may throw broken_semaphore exception.
We can move semaphore breaking to stop() method. Legacy update loop
is still stopped in do_abort(), so it doesn't change the order of
service level controller shutdown.
loop
In previous commit, we marked the update loop as legacy.
For compatibility reasons, we need to start legacy update loop
when the cluster is in recovery mode or it hasn't been upgraded to raft topology.
Then, in the update loop we check if all conditions are met and stop the
loop.
This commit also moves start of update loop later (after topology state is loaded) in main.cc.
There is no risk in doing it later.
Rename method which started update loop to better reflect
what it does.
Previously the method was named `update_from_distributed_data`,
however it doesn't update anything but only start the update loop,
which we are making legacy.
this change was created in the same spirit of ebff5f5d.
despite that we include Seastar as a submodule, Seastar is not a
part of scylla project. so we'd better include its headers using
brackets.
ebff5f5d addressed this cosmetic issue a while back. but probably
clangd's header-insertion helped some of contributor to insert
the missing headers with `"`. so this style of `include` returned
to the tree with these new changes.
unfortunately, clangd does not allow us to configure the style
of `include` at the time of writing.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#19406
This description is readable from raft log table.
Previously single description was provided for the whole
announce call but since it can contain mutations from
various subsystems now description was moved to
add_mutation(s)/add_generator function calls.
Raft service levels are read-only in recovery mode. This patch adds check and proper error message when a user tries to modify service levels in recovery mode.
Fixes https://github.com/scylladb/scylladb/issues/18827Closesscylladb/scylladb#18841
* github.com:scylladb/scylladb:
test/auth_cluster/test_raft_service_levels: try to create sl in recovery
service/qos/raft_sl_dda: reject changes to service levels in recovery mode
service/qos/raft_sl_dda: extract raft_sl_dda steps to common function
mode
When a cluster goes into recovery mode and service levels were migrated
to raft, service levels become temporarily read-only.
This commit adds a proper error message in case a user tries to do any
changes.
When setting/dropping a service level using raft data accessor, the same
validation steps are executed (this_shard_id = 0 and guard is present).
To not duplicate the calls in both functions, they can be extracted to a
helper function.
The draining now only consists of waiting for the data update future to
resolve. It can be safely moved to .stop() (i.e. -- later) because its
stopping had already been initiated by abort-source, and no other
services depend on sl-controller to be stopped and drained.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Draining sl controller consists of two parts -- first, kicks the wrap-up
process by aborting operations, breaking semaphores, etc. It's
no-waiting part. At last there goes co_await of the completion future.
This part moves the no-waiting part into recently introduced abort
subscription, so that wrap-up starts few bits earlier.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's stop-signal in main that fires an abort source on stop. Lots of
other services are subscribed in it, add the sl-controller too. For now
it's a no-op, but next patches will make use of it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The on_leave_cluster() callback needs to check if the leaving node is
the local one. It currently compares endpoint with the my_address()
obtained via pretty long dependency chain of
auth_service->query_processor->storage_proxy->database->token_metadata
This patch makes the whole thing _much_ shorter.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
To avoid conflicts arising from the discrepancy between different
versions of the repository, use coroutines instead of continuations
in service_level_controller::notify_service_level_removed().
Closesscylladb/scylladb#18525
This pull request introduces host ID in the Hinted Handoff module. Nodes are now identified by their host IDs instead of their IPs. The conversion occurs on the boundary between the module and `storage_proxy.hh`, but aside from that, IPs have been erased.
The changes take into considerations that there might still be old hints, still identified by IPs, on disk – at start-up, we map them to host IDs if it's possible so that they're not lost.
Refs scylladb/scylladb#6403Fixesscylladb/scylladb#12278Closesscylladb/scylladb#15567
* github.com:scylladb/scylladb:
docs: Update Hinted Handoff documentation
db/hints: Add endpoint_downtime_not_bigger_than()
db/hints: Migrate hinted handoff when cluster feature is enabled
db/hints: Handle arbitrary directories in resource manager
db/hints: Start using hint_directory_manager
db/hints: Enforce providing IP in get_ep_manager()
db/hints: Introduce hint_directory_manager
db/hints/resource_manager: Update function description
db/hints: Coroutinize space_watchdog::scan_one_ep_dir()
db/hints: Expose update lock of space watchdog
db/hints: Add function for migrating hint directories to host ID
db/hints: Take both IP and host ID when storing hints
db/hints: Prepare initializing endpoint managers for migrating from IP to host ID
db/hints: Migrate to locator::host_id
db/hints: Remove noexcept in do_send_one_mutation()
service: Add locator::host_id to on_leave_cluster
service: Fix indentation
db/hints: Fix indentation
We extend the function
endpoint_lifecycle_subscriber::on_leave_cluster
by another argument -- locator::host_id.
It's more convenient to have a consistent
pair of IP and host ID.
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.
this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:
```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
254 | return formatter<std::string_view>::format(it->second, ctx);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
2759 | FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
| ^ ~~~~~~~~~~~~
```
because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#18299
Create `raft_service_levels_distributed_data_accessor` if service levels
were migrated to v2 table.
This supports raft recovery mode, as service levels will be read from v2
table in the mode.
Save information whether service levels data was migrated to v2 table.
The information is stored in `system.scylla_local` table. It's
written with raft command and included in raft snapshot.
Migrate data from `system_distributes.service_levels` to
`system.service_levels_v2` during raft topology upgrade.
Migration process reads data from old table with CL ALL
and inserts the data to the new table via raft.
`raft_service_level_distributed_data_accessor` works this way:
- on read path it reads service levels from `SYSTEM.SERVICE_LEVELS_V2`
table with CL = LOCAL_ONE
- on write path it starts group0 operation and it makes the change
using raft command
Adjust service_level_controller and
service_level_controller::service_level_distributed_data_accessor
interfaces to take `group0_guard` while adding/altering/dropping a
service level.
To migrate service levels to be raft managed, obtain `group0_guard` to
be able to pass it to service_level_controller's methods.
Using this mechanism also automatically provides retries in case of
concurrent group0 operation.