scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 20:27:03 +00:00

Author	SHA1	Message	Date
Kefu Chai	d0ceb35e7e	test/boost: print runtime_error using e.what() before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. but fortunately, fmt v10 brings the builtin formatter for classes derived from `std::exception`. but before switching to {fmt} v10, and after dropping `FMT_DEPRECATED_OSTREAM` macro, we need to print out `std::runtime_error`. so far, we don't have a shared place for formatter for `std::runtime_error`. so we are addressing the needs on a case-by-case basis. in this change, we just print it using `e.what()`. its behavior is identical to what we have now. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-27 18:18:32 +08:00
Pavel Emelyanov	04370dc8a4	tablets: Introduce substract_sets() There are several places in code that calculate replica sets associated with a specific tablet transition. Having a helper to subtract two sets improves code readability. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18033	2024-03-26 23:33:06 +02:00
Tomasz Grabiec	042a4b7627	Merge 'tablets: add warning on CREATE KEYSPACE' from Nadav Har'El The CDC feature is not supported on a table that uses tablets (Refs https://github.com/scylladb/scylladb/issues/16317), so if a user creates a keyspace with tablets enabled they may be surprised later (perhaps much later) when they try to enable CDC on the table and can't. The LWT feature always had issue Refs https://github.com/scylladb/scylladb/issues/5251, but it has become potentially more common with tablets. So it was proposed that as long as we have missing features (like CDC or LWT), every time a keyspace is created with tablets it should output a warning (a bona-fide CQL warning, not a log message) that some features are missing, and if you need them you should consider re-creating the keyspace without tablets. This PR does this. The warning text which will be produced is the following (obviously, it can be improved later, as we perhaps find more missing features): > "Tables in this keyspace will be replicated using tablets, and will > not support the CDC feature (issue https://github.com/scylladb/scylladb/issues/16317) and LWT may suffer from > issue https://github.com/scylladb/scylladb/issues/5251 more often. If you want to use CDC or LWT, please drop > this keyspace and re-create it without tablets, by adding AND TABLETS > = {'enabled': false} to the CREATE KEYSPACE statement." This PR also includes a test - that checks that this warning is indeed generated when a keyspace is created with tablets (either by default or explicitly), and not generated if the keyspace is created without tablets. It also fixes existing tests which didn't like the new warning. Fixes https://github.com/scylladb/scylladb/issues/16807 Closes scylladb/scylladb#17318 * github.com:scylladb/scylladb: tablets: add warning on CREATE KEYSPACE test/cql-pytest: fix guardrail tests to not be sensitive to more warnings	2024-03-26 20:04:07 +01:00
Avi Kivity	4ddf82e58b	treewide: don't #include "gms/feature_service.hh" from other headers feature_service.hh is a high-level header that integrates much of the system functionality, so including it in lower-level headers causes unnecessary rebuilds. Specifically, when retiring features. Fix by removing feature_service.hh from headers, and supply forward declarations and includes in .cc where needed. Closes scylladb/scylladb#18005	2024-03-26 15:31:18 +02:00
Pavel Emelyanov	8bf9098663	system_keyspace: Consolidate node-state vs tokens checks When loading topology state, nodes are checked to have or not to have "tokens" field set. The check is done based on node state and it's spread across the loading method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#17957	2024-03-26 14:55:46 +02:00
Avi Kivity	22b8065a89	Merge 'tools/scylla-nodetool: implement the getsstables and sstableinfo commands' from Botond Dénes These commands manage to avoid detection because they are not documented on https://opensource.docs.scylladb.com/stable/operating-scylla/nodetool.html. They were discovered when running dtests, with ccm tuned to use the native nodetool directly. See https://github.com/scylladb/scylla-ccm/pull/565. The commands come with tests, which pass with both the native and Java nodetools. I also checked that the relevant dtests pass with the native implementation. Closes scylladb/scylladb#17979 * github.com:scylladb/scylladb: tools/scylla-nodetool: implement the sstableinfo command tools/scylla-nodetool: implement the getsstables command tools/scylla-nodetool: move get_ks_cfs() to the top of the file test/nodetool: rest_api_mock.py: add expected_requests context manager	2024-03-26 14:38:00 +02:00
Kefu Chai	101fdfc33a	test: randomized_nemesis_test: add fmt::formatter for stop_crash::result_type before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. also, it's impossible to partially specialize a nested type of a template class, so we cannot specialize the `fmt::formatter` for `stop_crash<M>::result_type`. as a workaround, a new type is added. in this change, * define a new type named `stop_crash_result` * add fmt::formatter for `stop_crash_result` * define stop_crash::result_type as an alias of `stop_crash_result` Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18018	2024-03-26 12:18:55 +02:00
Pavel Emelyanov	67c2a06493	api: Rename (un)set_server_load_sstable -> (un)set_server_column_family The method sets up column family API, not load-sstables one Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#18022	2024-03-26 12:16:08 +02:00
Botond Dénes	7edbf189e6	Merge 'treewide: use fmt::to_string() to transform a UUID to std::string and drop UUID::to_sstring()' from Kefu Chai `UUID::to_sstring()` relies on `FMT_DEPRECATED_OSTREAM` to generate `fmt::formatter` for `UUID`, and this feature is deprecated in {fmt} v9, and dropped in {fmt} v10. in this series, all callers of `UUID::to_sstring()` are switched to `fmt::to_string()`, and this function is dropped. Closes scylladb/scylladb#18020 * github.com:scylladb/scylladb: utils: UUID: drop UUID::to_sstring() treewide: use fmt::to_string() to transform a UUID to std::string	2024-03-26 12:14:56 +02:00
Kefu Chai	f3532cbaa0	db: commitlog: use fmt::streamed() to print segment before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change: * add `format_as()` for `segment` so we can use it as a fallback after upgrading to {fmt} v10 * use fmt::streamed() when formatting `segment`, this will be used as the intermediate solution before {fmt} v10 after dropping `FMT_DEPRECATED_OSTREAM` macro Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#18019	2024-03-26 12:13:29 +02:00
Botond Dénes	cd9589ec78	Merge 'test.py: Sanitize test list creation' from Pavel Emelyanov To create the list of tests to run there's a loop that first collects all tests from suites, then filters the list in two ways -- excludes opt-out-ed lists (disabled and matching the skip pattern) or leaves there only opt-in-ed (those, specified as positional arguments). This patch keeps both list-checking code close to each other so that the intent is explicitly clear. Closes scylladb/scylladb#17981 * github.com:scylladb/scylladb: test.py: Give local variable meaningful name test.py: Sanitize test list creation	2024-03-26 12:09:49 +02:00
Patryk Jędrzejczak	13fecd4e36	raft topology: decommission: allow only in NORMAL mode We move the mode check so that the raft-based decommission also uses it. Without this check, it hung after the drain operation instead of instantly failing. `test_decommission_after_drain_is_invalid` was failing because of it with the raft-based topology enabled. Fixes scylladb/scylladb#16761 Closes scylladb/scylladb#18000	2024-03-26 08:52:26 +01:00
Botond Dénes	f0ff23492f	Merge 'Sanitize topology suites' skiplists' from Pavel Emelyanov There are skip_in_<mode> lists in suite yaml that tell test.py not to run the test from it. This PR sanitizes these lists in two ways. First, to skip pytests the skip-decorators are much more convenient, e.g. because they show the reason why the test is skipped. Also, if a test wants to be opt-in-ed for some mode only, it's opt-out-ed in all other lists instead. There's run_in_<mode> list in suite for that. Closes scylladb/scylladb#17964 * github.com:scylladb/scylladb: test: Do not duplicate test name in several skip-lists test: Mark tests with skip_mode instead of suite skip-list	2024-03-26 08:24:57 +02:00
Kefu Chai	a047178fe7	utils: UUID: drop UUID::to_sstring() this function is not used anymore, and it relies on `FMT_DEPRECATED_OSTREAM` to generate `fmt::formatter` for `UUID`, and this feature is deprecated in {fmt} v9, and dropped in {fmt} v10. in this change, let's drop this member function. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-26 13:38:37 +08:00
Kefu Chai	1b859e484f	treewide: use fmt::to_string() to transform a UUID to std::string without `FMT_DEPRECATED_OSTREAM` macro, `UUID::to_sstring()` is implemented using its `fmt::formatter`, which is not available at the end of this header file where `UUID` is defined. at this moment, we still use `FMT_DEPRECATED_OSTREAM` and {fmt} v9, so we can still use `UUID::to_sstring()`, but in {fmt} v10, we cannot. so, in this change, we change all callers of `UUID::to_sstring()` to `fmt::to_string()`, so that we don't depend on `FMT_DEPRECATED_OSTREAM` and {fmt} v9 anymore. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-03-26 13:38:37 +08:00
Wojciech Mitros	9789a3dc7c	mv: keep semaphore units alive until the end of a remote view update When a view update has both a local and remote target endpoint, it extends the lifetime of its memory tracking semaphore units only until the end of the local update, while the resources are actually used until the remote update finishes. This patch changes the semaphore transferring so that in case of both local and remote endpoints, both view updates share the units, causing them to be released only after the update that takes longer finishes. Fixes #17890 Closes scylladb/scylladb#17891	2024-03-25 19:43:58 +02:00
Tzach Livyatan	6702ba3664	Docs: Add link from migration tools page to nodetool refresh load and stream Closes scylladb/scylladb#18006	2024-03-25 17:47:05 +02:00
Botond Dénes	1ea7b408db	tools/scylla-nodetool: implement the sstableinfo command	2024-03-25 11:29:30 -04:00
Botond Dénes	50da93b9c8	tools/scylla-nodetool: implement the getsstables command	2024-03-25 11:29:30 -04:00
Botond Dénes	f51061b198	tools/scylla-nodetool: move get_ks_cfs() to the top of the file So it can be used by all commands.	2024-03-25 11:29:30 -04:00
Botond Dénes	4ff88b848c	test/nodetool: rest_api_mock.py: add expected_requests context manager So tests and fixtures can use `with expected_requests():` and have cleanup be taken care for them. I just discovered that some tests do not clean up after themselves and when running all tests in a certain order, this causes unrelated tests to fail. Fix by using the context everywhere, getting guaranteed cleanup after each test.	2024-03-25 11:29:30 -04:00
Petr Gusev	7c84fc527b	test_invalid_user_type_statements: increase raft timeout The test creates ut4 with a lot of fields, this may take a while in debug builds, so to avoid a raft operation timeout we set the threshold to some big value. The error injector is disabled in release builds, so this setting won't be applied to them. This shouldn't be a problem since release builds are fast enough, even on arm. Fixes scylladb/scylladb#17987 Closes scylladb/scylladb#17997	2024-03-25 14:52:16 +01:00
Ferenc Szili	8bb7a18de2	test/cql-pytest: add --omit-scylla-output to Cassandra test runs Currently, the tests in test/cql-pytest can be run against both ScyllaDB and Cassandra. Running the test for either will first output the test results, and subsequently print the stdout output of the process under test. Using the command line option --omit-scylla-output it is possible to disable this print for Scylla, but it is not possible for tests run against Cassandra. This change adds the option to suppress output for Cassandra tests, too. By default, the stdout of the Cassandra run will still be printed after the test results, but this can now be disabled with --omit-scylla-output Closes scylladb/scylladb#17996	2024-03-25 15:14:45 +02:00
Pavel Emelyanov	16343b3edc	test: Do not duplicate test name in several skip-lists Some tests are only run in dev mode for some reason. For such tests there's run_in_dev list, no need in putting it in all the non-dev skip_in_... ones. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-25 14:56:37 +03:00
Pavel Emelyanov	90dfcec86b	test: Mark tests with skip_mode instead of suite skip-list There are many tests that are skipped in release mode because they rely on error-injection machinery which doesn't work in release mode. Most of those tests are listed in suite's skip_in_release, but it's not very handy, mainly because it's not clear why the test is there. The skip_mode decoration is much more convenient. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-25 14:56:37 +03:00
Pavel Emelyanov	2c90aeb5ee	test.py: Give local variable meaningful name Rename t to testname as it's more informative Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-25 14:53:48 +03:00
Pavel Emelyanov	b2f5b63aaa	test.py: Sanitize test list creation To create the list of tests to run there's a loop that first collects all tests from suites, then filters the list in two ways -- excludes opt-out-ed lists (disabled and matching the skip pattern) or leaves there only opt-in-ed (those, specified as positional arguments). This patch keeps both list-checking code close to each other so that the intent is explicitly clear. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-25 14:53:20 +03:00
Kamil Braun	69bf962522	Merge 'allow changing snitch with topology over raft' from Gleb Fixes scylladb/scylladb#17513 * 'gleb/raft-snitch-change-v3' of github.com:scylladb/scylla-dev: doc: amend snitch changing procedure to work with raft test: add test to check that snitch change takes effect. raft topology: update rack/dc info in topology state on reboot if changed	2024-03-25 10:41:39 +01:00
Gleb Natapov	3b272c5650	doc: amend snitch changing procedure to work with raft To change snitch with raft all nodes need to be started simultaneously since each node will try to update its state in the raft and for that quorum is required.	2024-03-25 11:31:30 +02:00
Beni Peled	eecfd164ff	Remove docs-amplify-enhanced github-workflow Since we implemented the CI-Docs on pkg, there is no need for this workflow Closes scylladb/scylladb#17908	2024-03-25 11:30:06 +02:00
Kefu Chai	e97ae6b0de	raft: server: print pointee of `server_impl::_fsm` before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, instead of printing the `unique_ptr` instance, we print the pointee of it. since `server_impl` uses pimpl paradigm, `_fsm` is always valid after `server_impl::start()`, we can always dereference it without checking for null. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17953	2024-03-25 11:20:34 +02:00
Botond Dénes	ff421168d0	Update tools/jmx submodule * tools/jmx 3257897a...53696b13 (1): > dist/debian: do not use substvar of ${shlib:Depends}	2024-03-25 11:16:25 +02:00
Gleb Natapov	d7adf26a56	test: add test to check that snitch change takes effect. The test creates two node cluster with default snitch (SimpleSnitch) and checks that dc and rack names are as expected. Then it changes the config to use GossipingPropertyFileSnitch with different names, restart nodes and check that now peers table has new names.	2024-03-25 10:41:49 +02:00
Kefu Chai	4eabf8b617	topology_coordinator: add fmt::formatter for wait_for_ip_timeout before this change, we rely on the default-generated fmt::formatter created from operator<<. but this depends on the `FMT_DEPRECATED_OSTREAM` macro which is not respected in {fmt} v10. this change addresses the formatting with fmtlib < 10, and without `FMT_DEPRECATED_OSTREAM` defined. please note, in {fmt} v10 and up, it defines formatter for classes derived from `std::exception`, so our formatter is only added when compiled with {fmt} < 10. in this change, `fmt::formatter<service::wait_for_ip_timeout>` is added for backward compatibility with {fmt} < 10. Refs scylladb#13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17955	2024-03-25 10:39:38 +02:00
Kefu Chai	5d59dd585f	configure.py: always rebuild SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE before this change, SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE is generated at the first run of `configure.py`, once these files are around, they are not updated despite that `SCYLLA_VERSION_GEN` does not generate them as long as the release string retrieved from git sha1 is identical to the one stored in `SCYLLA-RELEASE-FILE`, because we don't rerun `SCYLLA_VERSION_GEN` at all. but the pain is, when performing incremental build, like other built artifacts, these generated files stay with the build directory, so even if the sha1 of the workspace changes, the SCYLLA-RELEASE-FILE stays the same -- it still contains the original git sha1 when it was created. this could lead to confusion if developer or even our CI perform incremental build using the same workspace and build directory, as the built scylla executables always report the same version number. in this change, we always rebuild the said SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE files, and instruct ninja to re-stat the output files, see https://ninja-build.org/manual.html#ref_rule, in order to avoid unnecessary rebuild. so the downside is that `SCYLLA_VERSION_GEN` is executed every time we run `ninja` even if all targets are updated. but the upside is that the release number reported by scylla is accurate even if we perform incremental build. also, since we encode the product, version and release stored in the above files in the generated `build.ninja` file, in this change, these three files are added as dependencies of `build.ninja`, so that this file is regenerated if any of them is newer than `build.ninja`. Fixes #8255 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17974	2024-03-25 10:29:42 +02:00
Kefu Chai	5bc6d83f3b	build: cmake: always rebuild SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE before this change, SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE is generated when CMake generates `build.ninja` for the first time, once these files are around, they are not updated anymore. despite that `SCYLLA_VERSION_GEN` does not generate them as long as the release string retrieved from git sha1 is identical to the one stored in `SCYLLA-RELEASE-FILE`, because we don't rerun `SCYLLA_VERSION_GEN` at all. but the pain is, when performing incremental build, like other built artifacts, these generated files stay with the build directory, so even if the sha1 of the workspace changes, the SCYLLA-RELEASE-FILE stays the same -- it still contains the original git sha1 when it was created. this could lead to confusion if developer or even our CI perform incremental build using the same workspace and build directory, as the built scylla executables always report the same version number. in this change, we always rebuild the said SCYLLA-{PRODUCT,VERSION,RELEASE}-FILE files, and instruct CMake to regenerate `build.ninja` if any of these files is updated. Fixes #17975 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17983	2024-03-25 10:28:28 +02:00
Kefu Chai	0eb990fbf6	.github: skip "raison" when running codespell workflow codespell workflow checks for misspellings to identify common misspellings. it considers "raison" in "raison d'etre" (the accent mark over "e" is removed, so the commit message can be encoded in ASCII) to be a misspelling of "reason" or "raisin". apparently, the dictionary it uses does not include les mots francais les plus utilises. so, in this change, let's ignore "raison" for this very use case, before we start the l10n support of the document. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17985	2024-03-25 09:51:12 +02:00
Kefu Chai	0713c324d4	cql3: provide fmt::formatter for cql3_type::raw only for {fmt} < 10 since we already have `format_as()` for `cql3_type::raw`, there is no need to provide `cql3_type::raw` if the tree is compiled with {fmt} >= 10, otherwise compiler is not able to figure out which one to match, see the error at the end of this commit message. so, in this change, we only provide the specialized `fmt::formatter` for `cql3_type::raw` when {fmt} < 10. this should address the FTBFS with {fmt} >= 10. ``` /usr/lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/type_traits:1040:25: error: ambiguous partial specializations of 'formatter<cql3::cql3_type::raw>' 1040 \| = __bool_constant<__is_constructible(_Tp, _Args...)>; \| ^ /usr/lib/gcc/x86_64-redhat-linux/13/../../../../include/c++/13/type_traits:1046:16: note: in instantiation of template type alias '__is_constructible_impl' requested here 1046 \| : public __is_constructible_impl<_Tp, _Args...> \| ^ /usr/include/fmt/core.h:1420:13: note: in instantiation of template class 'std::is_constructible<fmt::formatter<cql3::cql3_type::raw>>' requested here 1420 \| !has_formatter<T, Context>::value))> \| ^ /usr/include/fmt/core.h:1421:22: note: while substituting prior template arguments into non-type template parameter [with T = cql3::cql3_type::raw] 1421 \| FMT_CONSTEXPR auto map(const T&) -> unformattable_pointer { \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1422 \| return {}; \| ~~~~~~~~~~ 1423 \| } \| ~ ``` Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17986	2024-03-25 09:49:40 +02:00
Yaron Kaikov	cb2c69a3f7	github: mergify: Add Ref to original PR When opening a backport PR, add a reference to the original PR. This will be used later for updating the original PR/issue once the backport is done (with different label) Closes scylladb/scylladb#17973	2024-03-25 08:12:47 +02:00
Raphael S. Carvalho	6bdb456fad	sstables_loader: Fix loader when write selector is previous during tablet migration The loader is writing to pending replica even when write selector is set to previous. If migration is reverted, then the writes won't be rolled back as it assumes pending replicas weren't written to yet. That can cause data resurrection if tablet is later migrated back into the same replica. NOTE: write selector is handled correctly when set to next, because get_natural_endpoints() will return the next replica set, and none of the replicas will be considered leaving. And of course, selector set to both is also handled correctly. Fixes #17892. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#17902	2024-03-24 01:20:50 +01:00
Kamil Braun	230f23004b	Revert "test.py: adjust the test for topology upgrade to write to and read from CDC tables" This reverts commit `b4144d14c6`. The test is flaky and blocks next promotions.	2024-03-22 17:25:04 +01:00
Petr Gusev	2a5f5d1948	test_fencing: fix flakiness To cause the stale topology exception the test reads the version from the last bootstrapped host and assigns its decremented value to version and fence_version fields of system.topology. The test assumes that version == fence_version here, if version is greater than fence_version we won't get the stale topology exception in this setup. Tablet balancer can break this -- it may increment the version after the last node is bootstrapped. Fix this by disabling the tablet balancer earlier. fixes scylladb/scylladb#17807 Closes scylladb/scylladb#17940	2024-03-22 12:49:13 +01:00
Piotr Dulikowski	f23f8f81bf	Merge 'Raft-based service levels' from Michał Jadwiszczak This patch introduces raft-based service levels. The difference to the current method of working is: - service levels are stored in `system.service_levels_v2` - reads are executed with `LOCAL_ONE` - writes are done via raft group0 operation Service levels are migrated to v2 in topology upgrade. After the service levels are migrated, `key: service_level_v2_status; value: data_migrated` is written to `system.scylla_local` table. If this row is present, raft data accessor is created from the beginning and it handles recovery mode procedure (service levels will be read from v2 table even if consistent topology is disabled then) Fixes #17926 Closes scylladb/scylladb#16585 * github.com:scylladb/scylladb: test: test service levels v2 works in recovery mode test: add test for service levels migration test: add test for service levels snapshot test:topology: extract `trigger_snapshot` to utils main: create raft dda if sl data was migrated service:qos: store information about sl data migration service:qos: service levels migration main: assign standard service level DDA before starting group0 service:qos: fix `is_v2()` method service:qos: add a method to upgrade data accessor test: add unit_test_raft_service_levels_accessor service:storage_service: add support for service levels raft snapshot service:qos: add abort_source for group0 operations service:qos: raft service level distributed data accessor service:qos: use group0_guard in data accessor cql3:statements: run service level statements on shard0 with raft guard test: fix overrides in unit_test_service_levels_accessor service:qos: fix indentation service:qos: coroutinize some of the methods db:system_keyspace: add `SERVICE_LEVELS_V2` table service:qos: extract common service levels' table functions	2024-03-22 11:51:53 +01:00
Kamil Braun	9979adb670	Merge 'topology_coordinator: do not clear unpublished CDC generation's data' from Patryk Jędrzejczak In this PR, we ensure unpublished CDC generation's data is never removed, which was theoretically possible. If it happened, it could cause problems. CDC generation publisher would then try to publish the generation with its data removed. In particular, the precondition of calling `_sys_ks.read_cdc_generation` wouldn't be satisfied. We also add a test that passes only after the fix. However, this test needs to block execution of the CDC generation publisher's loop twice. Currently, error injections with handlers do not allow it because handlers always share received messages. Apart from the first created handler, all handlers would be instantly unblocked by a message from the past that has already unblocked the first handler. This seems like a general limitation that could cause problems in the future, so in this PR, we extend injections with handlers to solve it once and for all. We add the `share_messages` parameter to the `inject` (with handler) function. Depending on its value, handlers will share messages (as before) or not. Fixes scylladb/scylladb#17497 Closes scylladb/scylladb#17934 * github.com:scylladb/scylladb: topology_coordinator: clean_obsolete_cdc_generations: fix log topology_coordinator: do not clear unpublished CDC generation's data topology_coordinator: cdc_generation_publisher_fiber injection: make handlers share messages error_injection: allow injection handlers to not share messages	2024-03-22 11:20:26 +01:00
Kamil Braun	4359a1b460	Merge 'raft timeouts: better handling of lost quorum' from Petr Gusev In this PR we add timeouts support to raft groups registry. We introduce the `raft_server_with_timeouts` class, which wraps the `raft::server` and exposes its interface with additional `raft_timeout` parameter. If it's set, the wrapper cancels the `abort_source` after certain amount of time. The value of the timeout can be specified either in the `raft_timeout` parameter, or the default value can be set in the `raft_server_with_timeouts` class constructor. The `raft_group_registry` interface is extended with `group0_with_timeouts()` method. It returns an instance of `raft_server_with_timeouts` for group0 raft server. The timeout value for it is configured in `create_server_for_group0`. It's one minute by default and can be overridden for tests with `group0-raft-op-timeout-in-ms` parameter. The new api allows the client to decide whether to use timeouts or not. In this PR we are reviewing all the group0 call sites and add `raft_timeout` if that makes sense. The general principle is that if the code is handling a client request and the client expects a potential error, we use timeouts. We don't use timeouts for background fibers (such as topology coordinator), since they wouldn't add much value. The only thing the background fiber can do with a timeout is to retry, and this will have the same end effect as not having a timeout at all. 
Fixes scylladb/scylladb#16604 Closes scylladb/scylladb#17590 * github.com:scylladb/scylladb: migration_manager: use raft_timeout{} storage_service::join_node_response_handler: use raft_timeout{} storage_service::start_upgrade_to_raft_topology: use raft_timeout{} storage_service::set_tablet_balancing_enabled: use raft_timeout{} storage_service::move_tablet: use raft_timeout{} raft_check_and_repair_cdc_streams: use raft_timeout{} raft_timeout: test that node operations fail properly raft_rebuild: use raft_timeout{} do_cluster_cleanup: use raft_timeout{} raft_initialize_discovery_leader: use raft_timeout{} update_topology_with_local_metadata: use with_timeout{} raft_decommission: use raft_timeout{} raft_removenode: use raft_timeout{} join_node_request_handler: add raft_timeout to make_nonvoters and add_entry raft_group0: make_raft_config_nonvoter: add raft_timeout parameter raft_group0: make_raft_config_nonvoter: add abort_source parameter manager_client: server_add with start=false shouldn't call driver_connect scylla_cluster: add seeds parameter to the add_server and servers_add raft_server_with_timeouts: report the lost quorum join_node_request_handler: add raft_timeout{} for start_operation skip_mode: add platform_key auth: use raft_timeout{} raft_group0_client: add raft_timeout parameter raft_group_registry: add group0_with_timeouts utils: add composite_abort_source.hh error_injection: move api registration to set_server_init error_injection: add inject_parameter method error_injection: move injection_name string into injection_shared_data error_injection: pass injection parameters at startup	2024-03-22 10:45:33 +01:00
Botond Dénes	f02baef871	Merge 'test/lib: sstable::test_env consolidate and reduce header footprint' from Avi Kivity Reduce the sprawl of sstables::test_env in .cc and .hh files, to ease maintenance and reduce recompilations. Closes scylladb/scylladb#17965 * github.com:scylladb/scylladb: test: sstables::test_env: complete pimplification test/lib: test_env: move test_env::reusable_sst() to test_services.cc	2024-03-22 11:26:12 +02:00
Botond Dénes	8b2856339a	Merge 'github: sync-labels: use more descriptive name for workflow' from Kefu Chai * rename `sync_labels.yaml` to `sync-labels.yaml` * use more descriptive name for workflow Closes scylladb/scylladb#17971 * github.com:scylladb/scylladb: github: sync-labels: use more descriptive name for workflow github: sync_labels: rename sync_labels to sync-labels	2024-03-22 10:01:56 +02:00
David Garcia	0375faa6aa	docs: add experimental tag Closes scylladb/scylladb#17633	2024-03-22 09:53:30 +02:00
Patryk Wrobel	28ed20d65e	scylla-nodetool: adjust effective ownership handling When a keyspace uses tablets, then effective ownership can be obtained per table. If the user passes only a keyspace, then /storage_service/ownership/{keyspace} returns an error. This change: - adds an additional positional parameter to 'status' command that allows a user to query status for table in a keyspace - makes usage of /storage_service/ownership/{keyspace} optional to avoid errors when user tries to obtain effective ownership of a keyspace that uses tablets - implements new frontend tests in 'test_status.py' that verify the new logic Refs: scylladb#17405 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#17827	2024-03-22 09:51:57 +02:00
Yaron Kaikov	407d25e47b	[mergify] delete backport branch after merge Since those branches clutter the branch search UI and we don't need them after merging Closes scylladb/scylladb#17961	2024-03-22 09:51:22 +02:00

42041 Commits