scylladb

Author	SHA1	Message	Date
Gleb Natapov	f888f2dced	service level: remove remnants of version 1 service level can_use_effective_service_level_cache() always returns true now, so the function can be dropped entirely and all the code that assumes it may return false can be dropped as well.	2026-03-12 12:27:52 +02:00
Gleb Natapov	b59b3d4f8a	service level: remove version 1 service level code	2026-03-10 10:46:48 +02:00
Pavel Emelyanov	18b5a49b0c	Populate all sl:* groups into dedicated top-level supergroup This patch changes the layout of user-facing scheduling groups from / `- statement `- sl:default `- sl:* `- other groups (compaction, streaming, etc.) into / `- user (supergroup) `- statement `- sl:default `- sl:* `- other groups (compaction, streaming, etc.) The new supergroup has 1000 static shares and is name-less, in the sense that it only has a variable in the code to refer to it and is not exported via metrics (should be fixed in seastar if we want to). The moved groups don't change their names or shares, only move inside the scheduling hierarchy. The goal of the change is to improve resource consumption of sl:* groups. Right now activities in low-shares service levels are scheduled on-par with e.g. streaming activity, which is considered to be a low-prio one. Moving all sl:* groups into their own supergroup with 1000 shares changes the meaning of sl:* shares. From now on these share values describe priorities of service levels relative to each other, and the user activities compete with the rest of the system with 1000 shares, regardless of how many service levels there are. Unit tests keep their user groups under the root supergroup (for simplicity). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#28235	2026-01-21 14:14:48 +02:00
Piotr Dulikowski	2bb800c004	qos: don't populate effective service level cache until auth is migrated to raft Right now, service levels are migrated in one group0 command and auth is migrated in the next one. This has a bad effect on the group0 state reload logic - modifying service levels in group0 causes the effective service levels cache to be recalculated, and to do so we need to fetch information about all roles. If the reload happens after SL upgrade and before auth upgrade, the query for roles will be directed to the legacy auth tables in system_auth - and the query, being a potentially remote query, has a timeout. If the query times out, it will throw an exception which will break the group0 apply fiber and the node will need to be restarted to bring it back to work. In order to solve this issue, make sure that the service level module does not start populating and using the service level cache until both service levels and auth are migrated to raft. This is achieved by adding the check both to the cache population logic and the effective service level getter - they now look at service level's accessor new method, `can_use_effective_service_level_cache` which takes a look at the auth version. Fixes: scylladb/scylladb#24963	2025-07-29 11:37:37 +02:00
Benny Halevy	2c0bafb934	token_metadata: clear_and_destroy_impl when destroyed We have a lot of places in the code where a token_metadata_ptr is kept in an automatic variable and destroyed when it leaves the scope. Since it's a reference-counted lw_shared_ptr, the token_metadata object is rarely destroyed in those cases, but when it is, it doesn't go through clear_gently, and in particular its tablet_metadata is not cleared gently, leading to inefficient destruction of potentially many foreign_ptr:s. This patch calls clear_and_destroy_impl, which gently clears and destroys the impl object in the background using the shared_token_metadata. Fixes #13381 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-07-06 15:07:31 +03:00
Piotr Dulikowski	bbc655ff32	test/boost: update service_level_controller_test for workload prio Adjust some of the existing tests in service_level_controller_test.cc and add some more in order to test the workload prioritization features, i.e. the service level shares.	2025-01-02 07:13:34 +01:00
Piotr Dulikowski	4cfd26efaf	qos: manage and assign scheduling groups to service levels Introduce the core logic of workload prioritization, responsible for assigning scheduling groups to service levels. The service level controller maintains a pool of scheduling groups for the currently present service levels, as well as a pool of unused scheduling groups which were previously used by some service level that was deleted during node's lifetime. When a new service level is created, the SL controller either assigns a scheduling group from the unused SG pool, or creates a new one if the pool is empty. The scheduling group is renamed to "sl:<scheduling group name>". When updating shares of a service level (and also when creating a new service level), the shares of the corresponding scheduling group are synchronized with those of the service level. When a service level is deleted, its group is released to the aforementioned pool of unused scheduling groups and the prefix of its name is changed from "sl:" to "sl_deleted:". For now, these scheduling groups are not used by any user operations. This will be changed in subsequent commits.	2025-01-02 07:13:34 +01:00
Avi Kivity	eb62593f2c	treewide: use angle brackets when including seastar headers We treat Seastar as a "system" library, and those are included with angle brackets. Closes scylladb/scylladb#21959	2024-12-20 16:16:28 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Gleb Natapov	0882f2024c	locator: topology: make topology object always contain local node Currently the locator::topology object, when created, does not contain the local node, but it starts being used to access the local database. It sort of works now because there are explicit checks in the code to handle this special case, like in topology::get_location for instance. We do not want to hack around it and instead rely on an invariant that the local node is always there. To do that we add the local node during locator::topology creation. There is a catch though. Unlike the IP, the host ID is not known during startup. We actually need to read from the database to know it, so the topology starts with host ID zero and then it changes once to the real one. This is not a problem though. As long as the (one node) topology is consistent (_cfg.this_host_id is equal to the node's id) local access will work.	2024-12-02 10:31:11 +02:00
Michał Jadwiszczak	664a1913c6	service/qos/service_level_controller: notify subscribers on effective cache reloaded Add event representing reload of effective service level cache and notify subscribers when the cache is reloaded.	2024-08-08 10:42:09 +02:00
Kefu Chai	bfe918ac9e	test/boost: define fmt::formatter for service_level_controller_test.cc since we are moving away from operator<< based formatters, more and more types now only have {fmt} based formatters. the same will apply to the STL container types after ditching the generic homebrew formatter in to_string.hh, so to be prepared for the change, let's add the fmt::formatter for tests as well. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-26 12:32:43 +08:00
Kefu Chai	222dbf2ce4	test/boost: include test/lib/test_utils.hh this change was created in the same spirit as 505900f18f. because we are deprecating the operator<< for vector and unordered_map in Seastar, some tests do not compile anymore if we disable these operators. so to be prepared for the change disabling them, let's include test/lib/test_utils.hh for accessing the printer dedicated for Boost.test. and also '#include <fmt/ranges.h>' when necessary, because, in order to format the ranges using {fmt}, we need to use fmt/ranges.h. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-05-26 12:32:43 +08:00
Pavel Emelyanov	8d4c8711fa	main,sl_controller: Subscribe for early abort There's stop-signal in main that fires an abort source on stop. Lots of other services are subscribed in it, add the sl-controller too. For now it's a no-op, but next patches will make use of it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-20 21:26:31 +03:00
Pavel Emelyanov	634c066c43	service_level_controller: Add dependency on shared_token_metadata The controller needs to access topology, so it needs the token metadata at hand. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-05-14 15:43:01 +03:00
Kefu Chai	97587a2ea4	test/boost: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17139	2024-02-06 13:22:16 +02:00
Kefu Chai	f5b05cf981	treewide: use defaulted operator!=() and operator==() in C++20, the compiler generates operator!=() if the corresponding operator==() is already defined; the language now understands that the comparison is symmetric in the new standard. fortunately, our operator!=() is always equivalent to `! operator==()`, which matches the behavior of the default generated operator!=(). so, in this change, all `operator!=` are removed. in addition to the defaulted operator!=, C++20 also brings us the defaulted operator==() -- it is able to generate the operator==() as a member-wise lexicographical comparison. under some circumstances, this is exactly what we need. so, in this change, if the operator==() is also implemented as a lexicographical comparison of all member variables of the class/struct in question, it is implemented using the default generated one by removing its body and marking the function as `default`. moreover, if the class happens to have other comparison operators which are implemented using lexicographical comparison, the default generated `operator<=>` is used in place of the defaulted `operator==`. sometimes, we fail to mark the operator== with the `const` specifier; in this change, to fulfill the need of the C++ standard, and to be more correct, the `const` specifier is added. also, to generate the defaulted operator==, the operand should be `const class_name&`, but that is not always the case: in the class `version`, we use `version` as the parameter type, so to fulfill the need of the C++ standard, the parameter type is changed to `const version&` instead. this does not change the semantics of the comparison operator, and is a more idiomatic way to pass a non-trivial struct as a function parameter. please note, because in C++20 both operator== and operator<=> are symmetric, some of the operators in `multiprecision` are removed. they are the symmetric form of another variant. if they were not removed, the compiler would, for instance, find ambiguous overloaded operator '=='. this change is a cleanup to modernize the code base with C++20 features. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #13687	2023-04-27 10:24:46 +03:00
Raphael S. Carvalho	3c5afb2d5c	test: Enable Scylla test command line options for boost tests We have enabled the command line options without changing a single line of code, we only had to replace old include with scylla_test_case.hh. Next step is to add x-log-compaction-groups options, which will determine the number of compaction groups to be used by all instantiations of replica::table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-02-01 20:14:51 -03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Eliran Sinvani	c38ceafdcf	Service Level Controller: Add an extension point to the API (#9374) In order to ease future extensions to the information being sent by the service level configuration change API, we pack the additional parameters (other than the service level options) to the interface in a structure. This will allow an easy expansion in the future if more parameters need to be sent to the observer. Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2021-10-01 10:20:28 +03:00
Eliran Sinvani	403db8e943	service level controller: Subscriber API unit test Here we add a very simple unit test for the configuration change API.	2021-08-16 11:38:59 +03:00

21 Commits