scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 10:00:35 +00:00

Author	SHA1	Message	Date
Aleksandra Martyniuk	7a7e287d8c	compaction: add reshard_sstables_compaction_task_impl Add task manager's task covering resharding compaction. A struct and some functions are moved from replica/distributed_loader.cc to compaction/task_manager_module.cc.	2023-07-19 17:15:40 +02:00
Aleksandra Martyniuk	e486f4eba6	compaction: create resharding_compaction_task_impl resharding_compaction_task_impl serves as a base class of all concrete resharding compaction task classes.	2023-07-19 10:41:35 +02:00
Kamil Braun	bfaac5192a	gossiper: call `on_remove` subscriptions in the foreground in `remove_endpoint` `gossiper::remove_endpoint` performs `on_remove` callbacks for all endpoint change subscribers. This was done in the background (with a discarded future) due to the following reason: ``` // We can not run on_remove callbacks here because on_remove in // storage_service might take the gossiper::timer_callback_lock ``` However, `gossiper::timer_callback_lock` no longer exists; it was removed in `19e8c14`. Today it is safe to perform the `storage_service::on_remove` callback in the foreground -- it's only taking the token metadata lock, which is also taken and then released earlier by the same fiber that calls `remove_endpoint` (i.e. `storage_service::handle_state_normal`). Furthermore, we want to perform it in the foreground. First, there already was a comment saying: ``` // do subscribers first so anything in the subscriber that depends on gossiper state won't get confused ``` It's not too precise, but it argues that subscriber callbacks should be serialized with the rest of `remove_endpoint`, not done concurrently with it. Second, we now have a concrete reason to do them in the foreground. In issue #14646 we observed that the subscriber callbacks are racing with the bootstrap procedure. Depending on scheduling order, if `storage_service::on_remove` is called too late, a bootstrapping node may try to wait for a node that was earlier replaced to become UP, which is incorrect. By putting the `on_remove` call into the foreground of `remove_endpoint`, we ensure that a node that was replaced earlier will not be included in the set of nodes that the bootstrapping node waits for (because `storage_service::on_remove` will clear it from `token_metadata`, which we use to calculate this set of nodes). We also get rid of an unnecessary `seastar::async` call. Fixes #14646 Closes #14741	2023-07-18 21:29:29 +02:00
Pavel Emelyanov	8bc42f54d4	Merge 'feature_service: handle deprecated features correctly in feature check' from Piotr Dulikowski The feature check in `enable_features_on_startup` loads the list of features that were enabled previously, goes over every one of them and checks whether each feature is considered supported and whether there is a corresponding `gms::feature` object for it (i.e. the feature is "registered"). The second part of the check is unnecessary and wrong. A feature can be marked as supported but its `gms::feature` object not be present anymore: after a feature is supported for long enough (i.e. we only support upgrades from versions that support the feature), we can consider such a feature to be deprecated. When a feature is deprecated, its `gms::feature` object is removed and the feature is always considered enabled which allows to remove some legacy code. We still consider this feature to be supported and advertise it in gossip, for the sake of the old nodes which, even though they always support the feature, they still check whether other nodes support it. The problem with the check as it is now is that it disallows moving features to the disabled list. If one tries to do it, they will find out that upgrading the node to the new version does not work: `enable_features_on_startup` will load the feature, notice that it is not "registered" (there is no `gms::feature` object for it) and fail to boot. This commit fixes the problem by modifying `enable_features_on_startup` not to look at the registered features list at all. In addition to this, some other small cleanups are performed: - "LARGE_COLLECTION_DETECTION" is removed from the deprecated features list. For some reason, it was put there when the feature was being introduced. It does not break anything because there is a `gms::feature` object for it, but it's slightly confusing and therefore is removed. 
- The comment in `supported_feature_set` that invites developers to add features there as they are introduced is removed. It is no longer necessary to do so because registered features are put there automatically. Deprecated features should still be put there, as indicated by another comment. Fortunately, this issue does not break any upgrades as of now - since we added enabled cluster feature persisting, no features were deprecated, and we only add registered features to the persisted feature list. An error injection and a regression test are added. Closes #14701 * github.com:scylladb/scylladb: topology_custom: add deprecated features test feature_service: add error injection for deprecated cluster feature feature_service: move error injection check to helper function feature_service: handle deprecated features correctly in feature check	2023-07-18 21:01:48 +03:00
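The corrected startup check described in this merge can be sketched as follows. This is a minimal illustration in Python, not Scylla's C++ implementation: the function name `check_persisted_features`, the set arguments, and the feature name `SOME_OLD_FEATURE` are hypothetical and chosen only for the example.

```python
def check_persisted_features(persisted, supported):
    """Return the previously enabled features that this node does not support.

    The check deliberately ignores whether a feature is still "registered":
    a deprecated feature remains supported (and is always enabled) even
    though its feature object has been removed.
    """
    return sorted(set(persisted) - set(supported))


# Hypothetical feature sets. SOME_OLD_FEATURE is deprecated: it is still
# supported and advertised, but no longer registered.
registered = {"LARGE_COLLECTION_DETECTION"}
deprecated = {"SOME_OLD_FEATURE"}
supported = registered | deprecated

unsupported = check_persisted_features(
    ["LARGE_COLLECTION_DETECTION", "SOME_OLD_FEATURE"], supported
)
if unsupported:
    raise RuntimeError(f"previously enabled features are unsupported: {unsupported}")
```

A check that also required membership in `registered` would reject `SOME_OLD_FEATURE` here, which is the boot failure the merge describes.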
Kamil Braun	6f22ed9145	Merge 'raft: move group0_state_machine::merger to its own header and add unit test for it' from Mikołaj Grzebieluch Move `merger` to its own header file. Leave the logic of applying commands to `group0_state_machine`. Remove `group0_state_machine` dependencies from `merger` to make it an independent module. Add a test that checks if `group0_state_machine_merger` preserves timeuuid monotonicity. `last_id()` should be equal to the largest timeuuid, based on its timestamps. This test combines two commands in the reverse order of their timeuuids. The timeuuids yield different results when compared in both timeuuid order and uuid order. Consequently, the resulting command should have a more recent timeuuid. Fixes #14568 Closes #14682 * github.com:scylladb/scylladb: raft: group0_state_machine_merger: add test for timeuuid ordering raft: group0_state_machine: extract merger to its own header	2023-07-18 17:43:50 +02:00
Kamil Braun	56c91473f2	Merge 'storage_proxy: silence abort_requested_exception on reads and writes' from Patryk Jędrzejczak Fixes #10447 This issue is an expected behavior. However, `abort_requested_exception` is not handled properly. -- Why this issue appeared 1. The node is drained. 2. `migration_manager::drain` is called and executes `_as.request_abort();`. 3. The coordinator sends read RPCs to the drained replica. On the replica side, `storage_proxy::handle_read` calls `migration_manager::get_schema_for_read`, which is defined like this: ```cpp future<schema_ptr> migration_manager::get_schema_for_write(/* ... */) { if (_as.abort_requested()) { co_return coroutine::exception(std::make_exception_ptr(abort_requested_exception())); } /* ... */ ``` So, `abort_requested_exception` is thrown. 4. RPC doesn't preserve information about its type, and it is converted to a string containing its error message. 5. It is rethrown as `std::runtime_error` on the coordinator side, and `abstract_read_resolver::error()` logs information about it. However, we don't want to report `abort_requested_exception` there. This exception should be caught and ignored: ```cpp void error(/* ... */) { /* ... */ else if (try_catch<abort_requested_exception>(eptr)) { // do not report aborts, they are trigerred by shutdown or timeouts } /* ... */ ``` -- Proposed solution To fix this issue, we can add `abort_requested_exception` to `replica::exception_variant` and make sure that if it is thrown by `migration_manager::get_schema_for_write`, `storage_proxy::handle_read` correctly encodes it. Thanks to this change, `abstract_read_resolver::error` can correctly handle `abort_requested_exception` thrown on the replica side by not reporting it. -- Side effect of the proposed solution If the replica supports it, the coordinator doesn't, and all nodes support `feature_service::typed_errors_in_read_rpc`, the coordinator will fail to decode `abort_requested_exception` and it will be decoded to `unknown_exception`. 
It will still be rethrown as `std::runtime_error`, however the message will change from *abort requested* to *unknown exception*. -- Another issue Moreover, `handle_write` reports abort requests for the same reason. This also floods the logs (this time on the replica side). I don't think it is intended, so I've changed it too. This change is in the last commit. Closes #14681 * github.com:scylladb/scylladb: service: storage_proxy: do not report abort requests in handle_write service: storage_proxy: encode abort_requested_exception in handle_read service: storage_proxy: refactor encode_replica_exception_for_rpc replica: add abort_requested_exception to exception_variant	2023-07-18 17:04:05 +02:00
Nadav Har'El	4ce46a998a	cql-pytest: translate Cassandra's tests for BATCH operations This is a translation of Cassandra's CQL unit test source file BatchTest.java into our cql-pytest framework. This is an old (2014) and small test file, with only minimal testing of mostly error paths in batch statements. All tests pass in both Cassandra and Scylla. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes #14733	2023-07-18 17:01:18 +03:00
Raphael S. Carvalho	da18a9badf	Fix test.py with compaction groups test.py with --x-log2-compaction-groups option rotted a little bit. Some boost tests added later didn't use the correct header which parses the option or they didn't adjust suite.yaml. Perhaps it's time to set up a weekly (or bi-weekly) job to verify there are no regressions with it. It's important as it stresses the data plane for tablets reusing the existing tests available. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #14732	2023-07-18 16:57:11 +03:00
Botond Dénes	7d5cca1958	Merge 'Regular compaction task' from Aleksandra Martyniuk Task manager's tasks covering regular compaction. Uses multiple inheritance on already existing regular_compaction_task_executor to keep track of the operation with task manager. Closes #14377 * github.com:scylladb/scylladb: test: add regular compaction task test compaction: turn regular_compaction_task_executor into regular_compaction_task_impl compaction: add compaction_manager::perform_compaction method test: modify sstable_compaction_test.cc compaction: add regular_compaction_task_impl compaction: switch state after compaction is done	2023-07-18 16:52:53 +03:00
Kefu Chai	4661671220	s3/test: do not keep the tempdir forever by default, up to 3 temporary directories are kept by pytest. but we run only a single time for each of the $TMPDIR. per our recent observation, it takes a lot more time for jenkins to scan the tempdir if we use it for scylla's rundir. so, to alleviate this symptom, we just keep up to one failed session in the tempdir. if the test passes, the tempdir created by pytest will be nuked. normally it is located at scylladb/testlog/${mode}/pytest-of-$(whoami). see also https://docs.pytest.org/en/7.3.x/reference/reference.html#confval-tmp_path_retention_policy Refs #14690 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14735 [xemul: Withdrawing from PR's comments object_store is the only test which is using tmpdir fixture starts / stops scylla by itself and put the rundir of scylla in its own tmpdir we don't register the step of cleaning up [the temp dir] using the utilities provided by cql-pytest. we rely on pytest to perform the cleanup. while cql-pytest performs the cleanup using a global registry. ]	2023-07-18 16:49:25 +03:00
Kamil Braun	69e22de54d	Merge 'minor test/pylib type fixes' from Alecco Some minor fixes reported by `mypy`. Closes #14693 * github.com:scylladb/scylladb: test/pylib: fix function attribute test/pylib: check cmd is defined before using it test/pylib: fix return type hint test/pylib: remove redundant method	2023-07-18 15:17:51 +02:00
Avi Kivity	a51fdadfed	Merge 'treewide: remove #includes not use directly' from Kefu Chai for faster build times and clear inter-module dependencies, we should not #include headers that are not directly used. instead, we should only #include the headers directly used by a certain compilation unit. in this change, the source files under "/compaction" directories are checked using clangd, which identifies the cases where we have an #include which is not directly used. all the #includes identified by clangd are removed. because some source files rely on the incorrectly included header file, those ones are updated to #include the header file they directly use. if a forward declaration suffices, the declaration is added instead. see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning Closes #14740 * github.com:scylladb/scylladb: treewide: remove #includes not use directly size_tiered_backlog_tracker: do not include remove header	2023-07-18 14:45:33 +03:00
Alejo Sanchez	8fceb7b7a0	test/pylib: fix function attribute Instead of globally hardcoding an attribute, set it in the function itself. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-07-18 13:33:46 +02:00
Alejo Sanchez	f7ee4ee7f6	test/pylib: check cmd is defined before using it Add an assert to check cmd is defined. Helps the type checker. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-07-18 13:33:46 +02:00
Alejo Sanchez	ff564583a4	test/pylib: fix return type hint Fix type hint of return when using @asynccontextmanager. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-07-18 13:33:46 +02:00
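The kind of type hint this commit refers to can be illustrated with a small, self-contained example. The function `temporary_value` is hypothetical and is not taken from test/pylib; the point is only that a function decorated with `@asynccontextmanager` is an async generator, so its return annotation should describe what it yields (for example `AsyncIterator[str]`), not the value bound by `async with`.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


@asynccontextmanager
async def temporary_value(value: str) -> AsyncIterator[str]:
    # The decorated function is an async generator, so the annotation
    # describes the yielded type rather than the resulting context manager.
    yield value


async def use_value() -> str:
    # A type checker infers `v` as str from the AsyncIterator[str] hint.
    async with temporary_value("cql") as v:
        return v.upper()


result = asyncio.run(use_value())
```

Annotating the function as returning `str` directly would make a checker such as `mypy` reject the decorator usage, which is the class of error the commit message describes fixing.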
Alejo Sanchez	2194d8864b	test/pylib: remove redundant method The ManagerClient.get_cql method is defined twice. Remove one and fix the assert. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>	2023-07-18 13:33:46 +02:00
Kamil Braun	eb6202ef9c	Merge 'db: hints: add checksum to sync_point encoding' from Patryk Jędrzejczak Fixes #9405 `sync_point` API provided with an incorrect sync point id might allocate a huge amount of memory and fail with `std::bad_alloc`. To fix this, we can check if the encoded sync point has been modified before decoding. We can achieve this by calculating a checksum before encoding, appending it to the encoded sync point, and comparing it with a checksum calculated in `db::hints::decode` before decoding. Closes #14534 * github.com:scylladb/scylladb: db: hints: add checksum to sync point encoding db: hints: add the version_size constant	2023-07-18 13:05:10 +02:00
Kefu Chai	bab16eb30e	treewide: remove #includes not use directly for faster build times and clear inter-module dependencies, we should not #include headers that are not directly used. instead, we should only #include the headers directly used by a certain compilation unit. in this change, the source files under "/compaction" directories are checked using clangd, which identifies the cases where we have an #include which is not directly used. all the #includes identified by clangd are removed. because some source files rely on the incorrectly included header file, those ones are updated to #include the header file they directly use. if a forward declaration suffices, the declaration is added instead. see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-18 17:36:31 +08:00
Kefu Chai	58302ab145	size_tiered_backlog_tracker: do not include remove header according to cppreference, > <ctgmath> is deprecated in C++17 and removed in C++20 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-18 17:36:31 +08:00
Michał Jadwiszczak	62ced66702	schema: add scylla specific options to schema description Add `paxos_grace_seconds`, `tombstone_gc`, `cdc` and `synchronous_updates` options to schema description. Fixes: #12389 Fixes: scylladb/scylla-enterprise#2979 Closes #14275	2023-07-18 11:16:19 +03:00
Botond Dénes	21ff6efd74	test/boost/view_build_test: improve test_view_update_generator_register_semaphore_unit_leak By making it independent of the number of units the view update generator's registration semaphore is created with. We want to increase this number significantly and that would destabilize this test significantly. To prevent this, detach the test from the number of units completely, while still preserving the original intent behind it, as best as it could be determined. Closes #14727	2023-07-18 09:18:28 +03:00
Alejo Sanchez	13e31eaeca	test.py: show mode and suite name when listing tests For --list, show also mode and suite name. Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com> Closes #14729	2023-07-18 09:06:47 +03:00
Botond Dénes	b3cb611be7	Merge 'treewide: enable -Wsign-compare and address the warnings from this option' from Kefu Chai in order to identify the problems caused by integer type promotion when comparing unsigned and signed integers, in this series, we - address the warnings raised by `-Wsign-compare` compiler option - add `-Wsign-compare` compiler option to the building systems Closes #14652 * github.com:scylladb/scylladb: treewide: use unsigned variable to compare with unsigned treewide: compare signed and unsigned using std::cmp_*()	2023-07-18 09:05:30 +03:00
Botond Dénes	6961fbcec7	Merge 'Add the metrics config api' from Amnon Heiman This series is based on top of the seastar relabel config API. The series adds a REST API for the configuration, it allows to get and set it. The API is registered under the V2 prefix and uses the swagger 2.0 definition. After this series to get the current relabel-config configuration: ``` curl -X GET --header 'Accept: application/json' 'http://localhost:10000/v2/metrics-config/' ``` A set config example: ``` curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[ \ { \ "source_labels": [ \ "__name__" \ ], \ "action": "replace", \ "target_label": "level", \ "replacement": "1", \ "regex": "io_que.*" \ } \ ]' 'http://localhost:10000/v2/metrics-config/' ``` This is what it looks like in the UI ![image](https://user-images.githubusercontent.com/2118079/230763730-bafcaf8b-ea6d-4a6c-a778-6271fa3b6f82.png) Closes #12670 * github.com:scylladb/scylladb: api: Add the metrics API api/config: make it optional if the config API is the first to register api: Add the metrics.json Swagger file Preparing for V2 API from files	2023-07-18 07:10:31 +03:00
Botond Dénes	f03efd7ea9	Merge 'build: cmake: fix the build of some tests' from Kefu Chai this series addresses the FTBFS of tests with CMake, and also checks for the unknown parameters in `add_scylla_test()` Closes #14650 * github.com:scylladb/scylladb: build: cmake: build SEASTAR tests as SEASTAR tests build: cmake: error out if found unknown keywords build: cmake: link tests against necessary libraries	2023-07-18 06:51:40 +03:00
Kefu Chai	4c1a26c99f	compaction_manager: sort sstables when compaction is enabled before this change, we sort sstables with compaction disabled, when we are about to perform the compaction. but the idea of guarding the getting and registering as a transaction is to prevent other compactions from mutating the sstables' state and causing inconsistency. since the state is tracked on a per-sstable basis, and is not related to the order in which the sstables are processed by a certain compaction task, we don't need to guard the "sort()" with this mutually exclusive lock. for better readability, and probably better performance, let's move the sort out of the lock. and take this opportunity to use `std::ranges::sort()` for more concise code. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes #14699	2023-07-18 06:40:43 +03:00
Kefu Chai	fa3129fa29	treewide: use unsigned variable to compare with unsigned sometimes we initialize a loop variable like auto i = 0; or int i = 0; but since the type of `0` is `int`, what we get is a variable of `int` type, and later we compare it with an unsigned number. if we compile the source code with the `-Werror=sign-compare` option, the compiler complains about this. in general, this is a false alarm, as we are not likely to have a wrong comparison result here. but in order to prevent issues due to the integer promotion for comparison in other places, and to prepare for enabling `-Werror=sign-compare`, let's use unsigned to silence this warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-18 10:27:18 +08:00
Kefu Chai	3129ae3c8c	treewide: compare signed and unsigned using std::cmp_*() when comparing signed and unsigned numbers, the compiler promotes the signed number to the common type -- in this case, the unsigned type, so they can be compared. but sometimes, it matters. and after the promotion, the comparison yields the wrong result. this can be manifested using a short sample like: ``` int main(int argc, char** argv) { int x = -1; unsigned y = 2; fmt::print("{}\n", x < y); return 0; } ``` this error can be identified by `-Werror=sign-compare`, but before enabling this compiling option, let's use `std::cmp_*()` to compare them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2023-07-18 10:27:18 +08:00
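The arithmetic behind the wrong result can be worked through outside C++. The following Python sketch only emulates the conversion and assumes a 32-bit `unsigned`; the helper `as_unsigned32` is hypothetical. It contrasts what `x < y` computes after the signed operand is converted with what a value-based comparison such as `std::cmp_less(x, y)` computes.

```python
def as_unsigned32(value: int) -> int:
    """Emulate converting a C++ int to a 32-bit unsigned (reduction modulo 2**32)."""
    return value & 0xFFFFFFFF


x, y = -1, 2

# Mixed comparison in C++ (int vs unsigned): x is converted to unsigned
# first, so -1 becomes 4294967295 and the comparison is false.
promoted_result = as_unsigned32(x) < y

# Value-based comparison, as std::cmp_less(x, y) performs: compares the
# mathematical values, so -1 < 2 is true.
value_result = x < y

print(as_unsigned32(x), promoted_result, value_result)
```

The two results disagree only when the signed operand is negative, which is why most loop-index comparisons flagged by the warning are harmless while a few are real bugs.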
Amnon Heiman	123dd44c21	api: Add the metrics API This patch adds a metrics API implementation. The API supports getting and setting the metric relabel config. Seastar supports metrics relabeling in runtime, following Prometheus relabel_config. Based on metrics and label name, a user can add or remove labels, disable a metric and set the skip_when_empty flag. The metrics-config API supports such configuration to be done using the RESTful API. As it's a new API it is placed under the V2 path. After this patch the following API will be available 'http://localhost:10000/v2/metrics-config/' GET/POST. For example: To get the current config: ``` curl -X GET --header 'Accept: application/json' 'http://localhost:10000/v2/metrics-config/' ``` To set a config: ``` curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '[ \ { \ "source_labels": [ \ "__name__" \ ], \ "action": "replace", \ "target_label": "level", \ "replacement": "1", \ "regex": "io_que.*" \ } \ ]' 'http://localhost:10000/v2/metrics-config/' ```	2023-07-17 17:09:36 +03:00
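For scripted use, the same "set config" call can be prepared from Python's standard library. This is a sketch only: the URL and payload are copied from the curl example in the commit message, while the helper name `build_set_request` is hypothetical. The request is built but not sent, so the snippet runs without a node listening on port 10000.

```python
import json
import urllib.request

URL = "http://localhost:10000/v2/metrics-config/"

# Same relabel rule as in the curl example from the commit message.
relabel_config = [
    {
        "source_labels": ["__name__"],
        "action": "replace",
        "target_label": "level",
        "replacement": "1",
        "regex": "io_que.*",
    }
]


def build_set_request(config):
    """Build (but do not send) the POST request that sets the relabel config."""
    return urllib.request.Request(
        URL,
        data=json.dumps(config).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


request = build_set_request(relabel_config)
# To send it to a running node: urllib.request.urlopen(request)
```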
Amnon Heiman	eeac846ea7	api/config: make it optional if the config API is the first to register Until now, only the configuration API was part of the V2 API. Now, when other APIs are added, it is possible that another API would be the first to register. The first to register API is different in the sense that it does not have a leading ',' to it. This patch adds an option to mark the config API if it's the first.	2023-07-17 17:09:35 +03:00
Amnon Heiman	d694a42745	api: Add the metrics.json Swagger file This patch adds the swagger definition for the metrics API. Currently, the API defines a get and set of the metric_relabel_config.	2023-07-17 17:09:35 +03:00
Amnon Heiman	9e0ec3afba	Preparing for V2 API from files This patch changes the base path of the V2 of the API to be '/'. That means that the v2 prefix will be part of the path definition. Currently, it only affects the config API that is created from code. The motivation for the change is for Swagger definitions that are read from a file. Currently, when using the swagger-ui with a doc path set to http://localhost:10000/v2 and reading the Swagger from a file, swagger-ui will concatenate the path and look for http://localhost:10000/v2/v2/{path}. Instead, the base path is now '/' and the /v2 prefix will be added by each endpoint definition. From the user perspective, there is no change in current functionality. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2023-07-17 17:09:35 +03:00
Patryk Jędrzejczak	02618831ef	db: hints: add checksum to sync point encoding sync point API provided with an incorrect sync point id might allocate a huge amount of memory and fail with std::bad_alloc. To fix this, we can check if the encoded sync point has been modified before decoding. We can achieve this by calculating a checksum before encoding, appending it to the encoded sync point, and comparing it with a checksum calculated in db::hints::decode before decoding.	2023-07-17 16:05:07 +02:00
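The append-then-verify scheme can be sketched in a few lines. This is an illustration under stated assumptions, not the actual `db::hints` format: CRC32 from `zlib` stands in for whatever checksum the commit uses, `CHECKSUM_SIZE` only mirrors the `checksum_size` constant mentioned in the series, and the payload is an arbitrary byte string.

```python
import zlib

CHECKSUM_SIZE = 4  # bytes needed to store a CRC32 in this sketch


def encode(payload: bytes) -> bytes:
    """Append a checksum of the payload to the payload itself."""
    checksum = zlib.crc32(payload).to_bytes(CHECKSUM_SIZE, "big")
    return payload + checksum


def decode(encoded: bytes) -> bytes:
    """Verify the trailing checksum before interpreting the payload."""
    if len(encoded) < CHECKSUM_SIZE:
        raise ValueError("sync point too short")
    payload, checksum = encoded[:-CHECKSUM_SIZE], encoded[-CHECKSUM_SIZE:]
    if zlib.crc32(payload).to_bytes(CHECKSUM_SIZE, "big") != checksum:
        # Reject early, before any length fields in the payload are trusted.
        raise ValueError("sync point checksum mismatch")
    return payload


encoded = encode(b"example sync point")
assert decode(encoded) == b"example sync point"
```

Checking first matters because a corrupted length field inside the payload is what could otherwise drive a huge allocation during decoding.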
Patryk Jędrzejczak	0a424e1760	db: hints: add the version_size constant The next commit changes the format of encoding sync points to V2. The new format appends the checksum to the encoded sync points and its implementation uses the checksum_size constant - the number of bytes required to store the checksum. To increase consistency and readability, we can additionally add and use the version_size constant. Definitions of sync_point::decode and sync_point::encode are slightly changed so that they don't depend on the version_size value and make implementation of the V2 format easier.	2023-07-17 16:02:18 +02:00
Aleksandra Martyniuk	7dbe624dee	test: add regular compaction task test	2023-07-17 15:54:33 +02:00
Aleksandra Martyniuk	2e87ba1879	compaction: turn regular_compaction_task_executor into regular_compaction_task_impl regular_compaction_task_executor inherits both from compaction_task_executor and regular_compaction_task_impl.	2023-07-17 15:54:33 +02:00
Aleksandra Martyniuk	e3b068be4d	compaction: add compaction_manager::perform_compaction method	2023-07-17 15:54:33 +02:00
Aleksandra Martyniuk	ab4ae6b84a	test: modify sstable_compaction_test.cc Modify sstable_compaction_test.cc so that it does not depend on how quick compaction manager stats are updated after compaction is triggered. It is required since in the following changes the context may switch before the stats are updated.	2023-07-17 15:54:33 +02:00
Aleksandra Martyniuk	9fdd130943	compaction: add regular_compaction_task_impl regular_compaction_task_impl serves as a base class of all concrete regular compaction task classes.	2023-07-17 15:54:33 +02:00
Aleksandra Martyniuk	33cb156ee3	compaction: switch state after compaction is done Compaction task executors which inherit from compaction_task_impl may stay in memory after the compaction is finished. Thus, state switch cannot happen in destructor. Switch state to none in perform_task defer.	2023-07-17 15:54:33 +02:00
Mikołaj Grzebieluch	bdf3959ae6	raft: group0_state_machine_merger: add test for timeuuid ordering This test checks if `group0_state_machine_merger` preserves timeuuid monotonicity. `last_id()` should be equal to the largest timeuuid, based on its timestamps. This test combines two commands in the reverse order of their timeuuids. The timeuuids yield different results when compared in both timeuuid order and uuid order. Consequently, the resulting command should have a more recent timeuuid. Closes #14568	2023-07-17 15:51:20 +02:00
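Why timestamp order and raw UUID order can disagree follows from the version 1 UUID layout, where the low 32 bits of the timestamp are stored first. The Python sketch below shows this with the standard `uuid` module; `make_timeuuid` is a hypothetical helper, and Python's integer comparison of UUIDs is used only as a stand-in for a non-timestamp ordering (Scylla's own uuid comparison is not reproduced here).

```python
import uuid


def make_timeuuid(timestamp: int) -> uuid.UUID:
    """Build a version 1 UUID from a 60-bit timestamp, with other fields zeroed."""
    time_low = timestamp & 0xFFFFFFFF
    time_mid = (timestamp >> 32) & 0xFFFF
    time_hi = (timestamp >> 48) & 0x0FFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi, 0, 0, 0), version=1)


older = make_timeuuid(0xFFFFFFFF)     # time_low is all ones, time_mid is 0
newer = make_timeuuid(0x1_0000_0000)  # time_low is 0, time_mid is 1

# Ordering by the embedded timestamp picks the genuinely newer UUID.
latest_by_time = max(older, newer, key=lambda u: u.time)

# Ordering by the raw 128-bit value picks the older one, because
# time_low occupies the most significant bits.
latest_by_raw_value = max(older, newer)

print(latest_by_time == newer, latest_by_raw_value == older)
```

A merger that tracked its last id by raw value would therefore report the older id for this pair, which is the situation the test in this commit is described as covering.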
Mikołaj Grzebieluch	96c6e0d0f7	raft: group0_state_machine: extract merger to its own header Move `merger` to its own header file. Leave the logic of applying commands to `group0_state_machine`. Remove `group0_state_machine` dependencies from `merger` to make it an independent module. Add `static` and `const` keywords to its methods signature. Change it to `class`. Add documentation. With this patch, it is easier to write unit tests for the merger.	2023-07-17 15:45:49 +02:00
Anna Stuchlik	2aa3672e5f	doc: fix the 5.2-to-5.3 upgrade guide Fixes https://github.com/scylladb/scylladb/issues/13993 This commit applies feedback from @mykaul added in https://github.com/scylladb/scylladb/pull/13960 after it was merged. In addition, I've removed the information about the Ubuntu version the images are based on - the info doesn't belong here and, in addition, it causes maintenance issues. Closes #14703	2023-07-17 15:26:33 +02:00
Patryk Jędrzejczak	7ae7be0911	locator: remove this_host_id from topology::config The `locator::topology::config::this_host_id` field is redundant in all places that use `locator::topology::config`, so we can safely remove it. Closes #14638 Closes #14723	2023-07-17 14:57:36 +02:00
Patryk Jędrzejczak	56bd9b5db3	service: storage_proxy: do not report abort requests in handle_write We don't want to report aborts in storage_proxy::handle_write, because it can be only triggered by shutdowns and timeouts. Before this change, such reports flooded logs when a drained node still received the write RPCs.	2023-07-17 12:27:36 +02:00
Patryk Jędrzejczak	f9db9f5943	service: storage_proxy: encode abort_requested_exception in handle_read storage_proxy::handle_read now makes sure that abort_requested_exception is encoded in a way that preserves its type information. This allows the coordinator to properly deserialize and handle it. Before this change, if a drained replica was still receiving the read RPCs, it would flood the coordinator's logs with std::runtime_error reports.	2023-07-17 12:27:36 +02:00
Patryk Jędrzejczak	68bd0424c2	service: storage_proxy: refactor encode_replica_exception_for_rpc To properly handle abort_requested_exception thrown from migration_manager::get_schema_for_read in storage_proxy::handle_read (we do in the next commit) we have to somehow encode and return it. The encode_replica_exception_for_rpc function is not suitable for that because it requires the SourceTuple type (of a value returned by do_query()) which we don't know when calling get_schema_for_read. We move the part of encode_replica_exception_for_rpc responsible for handling exceptions to a new function and rewrite it in a way that doesn't require the SourceTuple type. As this function fits the name encode_replica_exception_for_rpc better, we name it this way and rename the previous encode_replica_exception_for_rpc.	2023-07-17 12:27:33 +02:00
Patryk Jędrzejczak	7f83dbd9e7	test: disable raft-topology in test_remove_garbage_group0_members With Raft-topology enabled, test_remove_garbage_group0_members has been flaky when it should always fail. This has been discussed in #14614. Disabling Raft-topology in the topology suite is problematic because the initial cluster size is non-zero, so we have nodes that already use Raft-topology at the beginning of the test. Therefore, we move test_topology_remove_garbage_group0.py to the topology_custom suite. Apart from disabling Raft-topology, we have to start 4 servers instead of 1 because of the different initial cluster sizes. Closes #14692	2023-07-17 11:42:57 +02:00
Anna Stuchlik	c53bbbf1b9	doc: document nodetool checkAndRepairCdcStreams Fixes https://github.com/scylladb/scylladb/issues/13783 This commit documents the nodetool checkAndRepairCdcStreams operation, which was missing from the docs. The description is added in a new file and referenced from the nodetool operations index. Closes #14700	2023-07-17 11:41:54 +02:00
Avi Kivity	bfaac3a239	Merge 'Make replace sstables implementations exception safe' from Benny Halevy This is the first phase of providing strong exception safety guarantees by the generic `compaction_backlog_tracker::replace_sstables`. The goal is for the replace_sstables of every compaction strategy's backlog tracker to provide strong exception safety guarantees (i.e. it may throw an exception but must revert on error any intermediate changes it made, restoring the tracker to the pre-update state). Once this series is merged and ICS replace_sstables is also made strongly exception safe (using infrastructure from size_tiered_backlog_tracker introduced here), `compaction_backlog_tracker::replace_sstables` may allow exceptions to propagate back to the caller rather than disabling the backlog tracker on errors. Closes #14104 * github.com:scylladb/scylladb: leveled_compaction_backlog_tracker: replace_sstables: provide strong exception safety guarantees time_window_backlog_tracker: replace_sstables: provide strong exception safety guarantees size_tiered_backlog_tracker: replace_sstables: provide strong exception safety guarantees size_tiered_backlog_tracker: provide static calculate_sstables_backlog_contribution size_tiered_backlog_tracker: make log4 helper static size_tiered_backlog_tracker: define struct sstables_backlog_contribution size_tiered_backlog_tracker: update_sstables: update total_bytes only if set changed compaction_backlog_tracker: replace_sstables: pass old and new sstables vectors by ref compaction_backlog_tracker: replace_sstables: add FIXME comments about strong exception safety	2023-07-17 12:32:27 +03:00
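The strong exception safety guarantee described here (either the update fully applies, or the tracker is left in its pre-update state) is commonly achieved by computing the new state on the side and committing it only at the end. The following toy Python class illustrates that pattern; it is not the Scylla implementation, and the class `BacklogTracker` and its fields are hypothetical.

```python
class BacklogTracker:
    """Toy tracker: maps sstable name to size and caches the total size."""

    def __init__(self):
        self._sizes = {}
        self._total = 0

    @property
    def total(self):
        return self._total

    def replace_sstables(self, old, new):
        # Work on a copy so that a failure leaves the tracker untouched.
        sizes = dict(self._sizes)
        for name in old:
            del sizes[name]  # raises KeyError for an unknown sstable
        for name, size in new:
            if size < 0:
                raise ValueError(f"negative size for {name}")
            sizes[name] = size
        total = sum(sizes.values())
        # Commit step: plain assignments that cannot fail part-way.
        self._sizes, self._total = sizes, total


tracker = BacklogTracker()
tracker.replace_sstables([], [("a", 10), ("b", 5)])
try:
    # Fails after "a" was already removed from the working copy.
    tracker.replace_sstables(["a"], [("c", -1)])
except ValueError:
    pass
print(tracker.total)
```

Because the failed call only ever modified the working copy, a caller that sees the exception can keep using the tracker instead of having to disable it.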


37954 Commits