scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 04:06:59 +00:00

Author	SHA1	Message	Date
Andrei Chekun	7bbb8d9260	[test.py] Fix URL for failed logs directory in CI The artifacts_dir_url parameter was passed incorrectly from test.py to pytest: None was passed as a string, so pytest generated an incorrect URL.	2024-06-17 11:16:48 +02:00
Andrei Chekun	8d1d206aff	[test.py] Add uniqueness to the test name In CI, tests are always executed with the option --repeat=3, which generates 3 test results with the same name. The JUnit plugin in CI cannot correctly distinguish between these results. When there are two passes and one failure, the link to the test result is sometimes redirected to the wrong one because the test name is the same. To fix this, a ReportPlugin is added that modifies the test case name during JUnit report generation, adding the mode and run id to the test name. Fixes: https://github.com/scylladb/scylladb/issues/17851 Fixes: https://github.com/scylladb/scylladb/issues/15973	2024-06-14 11:23:04 +02:00
Tomasz Grabiec	2c3f7c996f	test: pylib: Fetch all pages by default in run_async Fetching only the first page is not the intuitive behavior expected by users. This causes flakiness in some tests which generate variable amount of keys depending on execution speed and verify later that all keys were written using a single SELECT statement. When the amount of keys becomes larger than page size, the test fails. Fixes #18774 Closes scylladb/scylladb#19004	2024-06-05 18:07:24 +03:00
Patryk Jędrzejczak	388db33dec	test: util: start_writes_to_cdc_table: add FIXME to increase CL	2024-05-27 10:39:04 +02:00
Patryk Jędrzejczak	68b6e8e13e	test: util: start_writes_to_cdc_table: allow restarting with new cql This patch allows us to restart writing (to the same table with CDC enabled) with a new CQL session. It is useful when we want to continue writing after closing the first CQL session, which happens during the `reconnect_driver` call. We must stop writing before calling `reconnect_driver`. If a write started just before the first CQL session was closed, it would time out on the client. We rename `finish_and_verify` - `stop_and_verify` is a better name after introducing `restart`.	2024-05-27 10:39:04 +02:00
Michał Chojnowski	de798775fd	test: test_coordinator_queue_management: wait for logs properly The modified lines of code intend to await the first appearance of a log on one of the nodes. But due to misplaced parentheses, instead of creating a list of log-awaiting tasks with a list comprehension, they pass a generator expression to asyncio.create_task(). This is nonsense, and it fails immediately with a type error. But since they don't actually check the result of the await, the test just assumes that the search completed successfully. This was uncovered by an upgrade to Python 3.12, because its typing is stronger and asyncio.create_task() screams when it's passed a regular generator. This patch fixes the bad list comprehension, and also adds an error check on the completed awaitables (by calling `await` on them). Fixes #18740 Closes scylladb/scylladb#18754	2024-05-25 10:54:44 +03:00
Andrei Chekun	bce53efd36	Enrich test results produced by test.py This PR resolves issue with double count of the test result for topology tests. It will not appear in the consolidated report anymore. Another fix is to provide a better view which test failed by modifying the test case name in the report enriching it with mode and run id, so making them unique across the run. The scope of this change is: 1. Modify the test name to have run id in name 2. Add handlers to get logs of test.py and pytest in one file that are related to test, rather than to the full suite 3. Remove topology tests from aggregating them on a suite level in Junit results 4. Add a link to the logs related to the failed tests in Junit results, so it will be easier to navigate to all logs related to test 5. Gather logs related to the failed test to one directory for better logs investigation Ref: scylladb/scylladb#17851 Closes scylladb/scylladb#18277	2024-05-20 15:33:57 +02:00
Artsiom Mishuta	fb6b572b9e	test/topology: make ManagerClient object function scope Move ManagerClient object creation/clearing to function scope instead of session scope, so that tests no longer share connections to the cluster and test cases cannot affect each other.	2024-05-14 14:31:10 +02:00
Gleb Natapov	3b40d450e5	gossiper: try to locate an endpoint by the host id when applying state if search by IP fails Even if there is no endpoint for the given IP, the state can still belong to an existing endpoint that was restarted with a different IP, so let's try to locate the endpoint by host id as well. Do it in raft topology mode only, to not have an impact on gossiper mode. Also make the test more robust in detecting a wrong number of entries in the peers table. Today it may miss that there is a wrong entry there because the map will squash two entries for the same host id into one. Fixes: scylladb/scylladb#18419 Fixes: scylladb/scylladb#18457	2024-05-09 13:14:54 +02:00
Kamil Braun	4dcae66380	Merge 'test: {auth,topology}: use manager.rolling_restart' from Piotr Dulikowski Instead of performing a rolling restart by calling `restart` in a loop over every node in the cluster, use the dedicated `manager.rolling_restart` function. This method waits until all other nodes see the currently processed node as up or down before proceeding to the next step. Not doing so may lead to surprising behavior. In particular, in scylladb/scylladb#18369, a test failed shortly after restarting three nodes. Because nodes were restarted one after another too fast, when the third node was restarted it didn't send a notification to the second node because it still didn't know that the second node was alive. This led the second node to notice that the third node restarted by observing that it incremented its generation in gossip (it restarted too fast to be marked as down by the failure detector). In turn, this caused the second node to send "third node down" and "third node up" notifications to the driver in a quick succession, causing it to drop and reestablish all connections to that node. However, this happened _after_ rolling upgrade finished and _after_ the test logic confirmed that all nodes were alive. When the notifications were sent to the driver, the test was executing some statements necessary for the test to pass - as they broke, the test failed. Fixes: scylladb/scylladb#18369 Closes scylladb/scylladb#18379 * github.com:scylladb/scylladb: test: get rid of server-side server_restart test: util: get rid of the `restart` helper test: {auth,topology}: use manager.rolling_restart	2024-05-08 09:45:08 +02:00
Piotr Dulikowski	8de2bda7ae	test: util: get rid of the `restart` helper We already have `ManagerClient.server_restart`, which can be used in its place.	2024-05-06 12:24:40 +02:00
Gleb Natapov	f2b0a5e9e1	storage_service: do not take API lock for removenode operation if topology coordinator is enabled The topology coordinator serializes operations internally, so there is no need to have an external lock. Fixes: scylladb/scylladb#17681	2024-04-30 15:13:50 +03:00
Aleksandra Martyniuk	e0d498716a	test: topology: move some functions to util.py Move functions marked with asynccontextmanager from test/topology/test_mv.py to test/topology/util.py so that they can be used in other tests.	2024-04-24 10:57:51 +02:00
Aleksandr Bykov	e8833c6f2a	test: Kill coordinator during topology operation If the coordinator node is killed, restarted, or becomes inoperable during a topology operation, a new coordinator should be elected, the operation should be aborted, and the cluster should be rolled back. Error injection is used to kill the coordinator before streaming starts. Closes scylladb/scylladb#16197	2024-04-17 17:24:20 +02:00
Mikołaj Grzebieluch	1e2607563f	test.py: test_topology_upgrade_basic: make ring_delay_ms nonzero Test.py uses `ring_delay_ms = 0` by default. CDC creates a generation's timestamp by adding `ring_delay_ms` to it. In this test, nodes are learning about new generations (introduced by the upgrade procedure and then by node bootstrap) concurrently with doing writes that should go to these generations. Because of `ring_delay_ms = 0`, the generation could have been committed when it should have already been in use. This can be seen in the following logs from a node: ``` ERROR 2024-03-22 12:29:55,431 [shard 0:strm] cdc - just learned about a CDC generation newer than the one used the last time streams were retrieved. This generation, or some newer one, should have been used instead (new generation's timestamp: 2024/03/22 12:29:54, last time streams were retrieved: 2024/03/22 12:29:55). The new generation probably arrived too late due to a network partition and we've made a write using the wrong set streams. ``` Creating writes during such a generation can result in assigning them a wrong generation or a failure. Failure may occur if a write hits the short time window when `generation_service::handle_cdc_generation(cdc::generation_id_v2)` has executed `svc._cdc_metadata.prepare(...)` but `_cdc_metadata.insert(...)` has not yet been executed. With a nonzero ring_delay_ms it's not a problem, because during this time window, the generation should not be in use. A write can fail with the following response from a node: ``` cdc: attempted to get a stream from a generation that we know about, but weren't able to retrieve (generation timestamp: 2024/03/22 12:29:54, write timestamp: 2024/03/22 12:29:55). Make sure that the replicas which contain this generation's data are alive and reachable from this node. ``` Set ring_delay_ms to 15000 for the debug mode and 5000 in other modes.
Wait for the last generation to be in use and sleep one second to make sure there are writes to the CDC table in this generation. Fixes #17977	2024-03-28 17:13:43 +01:00
Mikołaj Grzebieluch	fa4193e09f	Reapply "test.py: adjust the test for topology upgrade to write to and read from CDC tables" This reverts commit `230f23004b`.	2024-03-27 10:39:01 +01:00
Pavel Emelyanov	16343b3edc	test: Do not duplicate test name in several skip-lists Some tests are only run in dev mode for some reason. For such tests there is the run_in_dev list, so there is no need to put them in all the non-dev skip_in_... lists. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-25 14:56:37 +03:00
Pavel Emelyanov	90dfcec86b	test: Mark tests with skip_mode instead of suite skip-list There are many tests that are skipped in release mode because they rely on error-injection machinery which doesn't work in release mode. Most of those tests are listed in the suite's skip_in_release, but that's not very handy, mainly because it's not clear why the test is there. The skip_mode decoration is much more convenient. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-25 14:56:37 +03:00
Kamil Braun	230f23004b	Revert "test.py: adjust the test for topology upgrade to write to and read from CDC tables" This reverts commit `b4144d14c6`. The test is flaky and blocks next promotions.	2024-03-22 17:25:04 +01:00
Piotr Dulikowski	f23f8f81bf	Merge 'Raft-based service levels' from Michał Jadwiszczak This patch introduces raft-based service levels. The difference to the current method of working is: - service levels are stored in `system.service_levels_v2` - reads are executed with `LOCAL_ONE` - writes are done via raft group0 operation Service levels are migrated to v2 in topology upgrade. After the service levels are migrated, `key: service_level_v2_status; value: data_migrated` is written to `system.scylla_local` table. If this row is present, raft data accessor is created from the beginning and it handles recovery mode procedure (service levels will be read from v2 table even if consistent topology is disabled then) Fixes #17926 Closes scylladb/scylladb#16585 * github.com:scylladb/scylladb: test: test service levels v2 works in recovery mode test: add test for service levels migration test: add test for service levels snapshot test:topology: extract `trigger_snapshot` to utils main: create raft dda if sl data was migrated service:qos: store information about sl data migration service:qos: service levels migration main: assign standard service level DDA before starting group0 service:qos: fix `is_v2()` method service:qos: add a method to upgrade data accessor test: add unit_test_raft_service_levels_accessor service:storage_service: add support for service levels raft snapshot service:qos: add abort_source for group0 operations service:qos: raft service level distributed data accessor service:qos: use group0_guard in data accessor cql3:statements: run service level statements on shard0 with raft guard test: fix overrides in unit_test_service_levels_accessor service:qos: fix indentation service:qos: coroutinize some of the methods db:system_keyspace: add `SERVICE_LEVELS_V2` table service:qos: extract common service levels' table functions	2024-03-22 11:51:53 +01:00
Kamil Braun	4359a1b460	Merge 'raft timeouts: better handling of lost quorum' from Petr Gusev In this PR we add timeouts support to raft groups registry. We introduce the `raft_server_with_timeouts` class, which wraps the `raft::server` and exposes its interface with an additional `raft_timeout` parameter. If it's set, the wrapper cancels the `abort_source` after a certain amount of time. The value of the timeout can be specified either in the `raft_timeout` parameter, or the default value can be set in the `raft_server_with_timeouts` class constructor. The `raft_group_registry` interface is extended with the `group0_with_timeouts()` method. It returns an instance of `raft_server_with_timeouts` for the group0 raft server. The timeout value for it is configured in `create_server_for_group0`. It's one minute by default and can be overridden for tests with the `group0-raft-op-timeout-in-ms` parameter. The new API allows the client to decide whether to use timeouts or not. In this PR we review all the group0 call sites and add `raft_timeout` where that makes sense. The general principle is that if the code is handling a client request and the client expects a potential error, we use timeouts. We don't use timeouts for background fibers (such as the topology coordinator), since they wouldn't add much value. The only thing the background fiber can do with a timeout is to retry, and this will have the same end effect as not having a timeout at all.
Fixes scylladb/scylladb#16604 Closes scylladb/scylladb#17590 * github.com:scylladb/scylladb: migration_manager: use raft_timeout{} storage_service::join_node_response_handler: use raft_timeout{} storage_service::start_upgrade_to_raft_topology: use raft_timeout{} storage_service::set_tablet_balancing_enabled: use raft_timeout{} storage_service::move_tablet: use raft_timeout{} raft_check_and_repair_cdc_streams: use raft_timeout{} raft_timeout: test that node operations fail properly raft_rebuild: use raft_timeout{} do_cluster_cleanup: use raft_timeout{} raft_initialize_discovery_leader: use raft_timeout{} update_topology_with_local_metadata: use with_timeout{} raft_decommission: use raft_timeout{} raft_removenode: use raft_timeout{} join_node_request_handler: add raft_timeout to make_nonvoters and add_entry raft_group0: make_raft_config_nonvoter: add raft_timeout parameter raft_group0: make_raft_config_nonvoter: add abort_source parameter manager_client: server_add with start=false shouldn't call driver_connect scylla_cluster: add seeds parameter to the add_server and servers_add raft_server_with_timeouts: report the lost quorum join_node_request_handler: add raft_timeout{} for start_operation skip_mode: add platform_key auth: use raft_timeout{} raft_group0_client: add raft_timeout parameter raft_group_registry: add group0_with_timeouts utils: add composite_abort_source.hh error_injection: move api registration to set_server_init error_injection: add inject_parameter method error_injection: move injection_name string into injection_shared_data error_injection: pass injection parameters at startup	2024-03-22 10:45:33 +01:00
Michał Jadwiszczak	bf3aed1ecb	test:topology: extract `trigger_snapshot` to utils The function was defined separately in a few tests.	2024-03-21 23:14:57 +01:00
Petr Gusev	854531ae8e	skip_mode: add platform_key In subsequent commits we are going to add test.py tests for raft_timeout{} feature. The problem is that aarch/debug configuration is infamously slow. Timeout settings used in tests work for all platforms but aarch/debug. In this commit we extend the skip_mode attribute with the platform_key property. We'll use @skip_mode('debug', platform_key='aarch64') to skip the tests for this specific configuration. The tests will still be run for aarch64/release.	2024-03-21 16:35:43 +04:00
Mikołaj Grzebieluch	b4144d14c6	test.py: adjust the test for topology upgrade to write to and read from CDC tables In topology on raft, management of CDC generations is moved to the topology coordinator. We need to verify that the CDC keeps working correctly during the upgrade for topology on the raft. A similar change will be made in the topology recovery test. It will reuse the `start_writes_to_cdc_table` function. Ref #17409 Closes scylladb/scylladb#17828	2024-03-20 11:15:02 +01:00
Tomasz Grabiec	a233a699cc	test: py: Add test for view replica pairing after replace	2024-03-15 13:20:08 +01:00
Patryk Jędrzejczak	f1d9248df9	test: wait for CDC generations publishing before checking CDC-topology consistency Tests that verify upgrading to the raft-based topology (`test_topology_upgrade`, `test_topology_recovery_basic`, `test_topology_recovery_majority_loss`) have flaky `check_system_topology_and_cdc_generations_v3_consistency` calls. `assert topo_results[0] == topo_res` can fail because of different `unpublished_cdc_generations` on different nodes. The upgrade procedure creates a new CDC generation, which is later published by the CDC generation publisher. However, this can happen after the upgrade procedure finishes. In tests, if publishing happens just before querying `system.topology` in `check_system_topology_and_cdc_generations_v3_consistency`, we can observe different `unpublished_cdc_generations` on different nodes. It is an expected and temporary inconsistency. For the same reasons, `check_system_topology_and_cdc_generations_v3_consistency` can fail after adding a new node. To make the tests not flaky, we wait until the CDC generation publisher finishes its job. Then, all nodes should always have equal (and empty) `unpublished_cdc_generations`. Fixes scylladb/scylladb#17587 Fixes scylladb/scylladb#17600 Fixes scylladb/scylladb#17621 Closes scylladb/scylladb#17622	2024-03-04 19:28:51 +02:00
Patryk Jędrzejczak	2b724735d1	test: test clean-up of committed_cdc_generations We extend `test_cdc_generation_clearing`. Now, it also tests the clean-up of `TOPOLOGY.committed_cdc_generations` added in the previous patch. In the implementation, we harden the already existing `check_system_topology_and_cdc_generations_v3_consistency`. After the previous patch, data of every generation present in `committed_cdc_generations` should be present in CDC_GENERATIONS_V3. In other words, `committed_cdc_generations` should always be a subset of a set containing generations in CDC_GENERATIONS_V3. Before the previous patch, this wasn't true after the clearing, so the new version of `test_cdc_generation_clearing` wouldn't pass back then.	2024-02-20 12:35:18 +01:00
Patryk Jędrzejczak	e145e758eb	raft topology: store committed CDC generations' IDs in the topology When we create a CDC generation and ring-delay is non-zero, the timestamp of the new generation is in the future. Hence, we can have multiple generations that can be written to. However, if we add a new node to the cluster with the Raft-based topology, it receives only the last committed generation. So, this node will be rejecting writes considered correct by the other nodes until the last committed generation starts operating. In scylladb/scylladb#17134, we have allowed sending writes to the previous CDC generations. So, the situation became even more complicated. We need to adjust the Raft-based topology to ensure all required generations are loaded into memory and their data isn't cleared too early. This patch is the first step of the adjustment. We replace `current_cdc_generation_{uuid, timestamp}` with the set containing IDs of all committed generations - `committed_cdc_generations`. This set is sorted by timestamps, just like `unpublished_cdc_generations`. This patch is mostly refactoring. The last generation in `committed_cdc_generations` is the equivalent of the previous `current_cdc_generation_{uuid, timestamp}`. The other generations are irrelevant for now. They will be used in the following patches. After introducing `committed_cdc_generations`, a newly committed generation is also unpublished (it was current and unpublished before the patch). We introduce `add_new_committed_cdc_generation`, which updates both sets of generations so that we don't have to call `add_committed_cdc_generation` and `add_unpublished_cdc_generation` together. It's easy to forget that both of them are necessary. Before this patch, there was no call to `add_unpublished_cdc_generation` in `topology_coordinator::build_coordinator_state`. It was a bug reported in scylladb/scylladb#17288. This patch fixes it. 
This patch also removes "the current generation" notion from the Raft-based topology. For the Raft-based topology, the current generation was the last committed generation. However, for the `cdc::metadata`, it was the generation operating now. These two generations could be different, which was confusing. For the `cdc::metadata`, the current generation is relevant as it is handled differently, but for the Raft-based topology, it isn't. Therefore, we change only the Raft-based topology. The generation called "current" is called "the last committed" from now on.	2024-02-20 12:35:16 +01:00
Kamil Braun	50ebce8acc	Merge 'Purge old ip on change' from Petr Gusev When a node changes its IP address we need to remove its old IP from `system.peers` and gossiper. We do this in `sync_raft_topology_nodes` when the new IP is saved into `system.peers` to avoid losing the mapping if the node crashes between deleting and saving the new IP. We also handle the possible duplicates in this case by dropping them on the read path when the node is restarted. The PR also fixes the problem with old IPs getting resurrected when a node changes its IP address. The following scenario is possible: a node `A` changes its IP from `ip1` to `ip2` with a restart; other nodes are not yet aware of `ip2`, so they keep gossiping `ip1`. After the restart, `A` receives `ip1` in a gossip message and calls `handle_major_state_change` since it considers it a new node. Then the `on_join` event is called on the gossiper notification handlers; we receive this event in `raft_ip_address_updater` and revert the IP of node `A` back to `ip1`. To fix this we ensure that the new gossiper generation number is used when a node registers its IP address in `raft_address_map` at startup. The `test_change_ip` is adjusted to ensure that the old IPs are properly removed in all cases, even if the node crashes.
Fixes #16886 Fixes #16691 Fixes #17199 Closes scylladb/scylladb#17162 * github.com:scylladb/scylladb: test_change_ip: improve the test raft_ip_address_updater: remove stale IPs from gossiper raft_address_map: add my ip with the new generation system_keyspace::update_peer_info: check ep and host_id are not empty system_keyspace::update_peer_info: make host_id an explicit parameter system_keyspace::update_peer_info: remove any_set flag optimisation system_keyspace: remove duplicate ips for host_id system_keyspace: peers table: use coroutines storage_service::raft_ip_address_updater: log gossiper event name raft topology: ip change: purge old IP on_endpoint_change: coroutinize the lambda around sync_raft_topology_nodes	2024-02-15 17:40:29 +01:00
Petr Gusev	c4140678ba	test_change_ip: improve the test In this commit we refactor test_change_ip to improve it in several ways: * We inject a failure before the old IP is removed and verify that after restart the node sees the proper peers - the new IP for node2 and the old IP for node3, which is not restarted yet. * We introduce the lambda wait_proper_ips, which checks not only the system.peers table, but also gossiper and token_metadata. * We call this lambda for all nodes, not only the first node; this allows us to validate that the node that has changed its IP has the proper IP of itself in the data structures above. Note that we need to inject an additional delay ip-change-raft-sync-delay before the old IP is removed. Otherwise the problem stops reproducing - other nodes remove the old IP before it's sent back to the just restarted node.	2024-02-15 13:26:02 +04:00
Nadav Har'El	b97ded5c4a	test/topology: tests for setting tombstone_gc on materialized view A user asked on the ScyllaDB forum several questions on whether tombstone_gc works on materialized views. This patch includes two tests that confirm the following: 1. The tombstone_gc may be set on a view - either during its creation with CREATE MATERIALIZED VIEW or later with ALTER MATERIALIZED VIEW. 2. The tombstone_gc setting is correctly shown - for both base tables and views - by the "DESC" statement. 3. The tombstone_gc setting is NOT inherited from a base table to a new view - if you want this option on a view, you need to set it separately. Unfortunately, this test could not be a single-node cql-pytest because we forbid tombstone_gc=repair when RF=1, and since recently, we forbid setting RF>1 on a single-node setup. So the new tests are written in the test/topology framework - which may run multiple tests against a single three-node cluster. To write tests over a shared cluster, we need functions which create temporary keyspaces, tables and views, which are deleted automatically as soon as a test ends. The test/topology framework was lacking such functions, so this test includes them - currently inside the test file, but if other people find them useful they can be moved to a more central location. The new functions, new_test_keyspace(), new_test_table() and new_materialized_view() are inspired by the identically-named functions in test/cql-pytest/util.py, but the implementation is different: Importantly, the new functions here are async context managers, used via "async with", to fit with the rest of the asynchronous code used in the topology test framework. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#17345	2024-02-15 09:43:30 +02:00
Gleb Natapov	7802c206c7	test: add test for the new ignore nodes behaviour The test checks that once a node is specified in ignored node list by one topology operation the information is carried over to the next operation as well.	2024-02-14 10:35:11 +02:00
Gleb Natapov	363af9e664	topology coordinator: ban ignored nodes just like we ban nodes that are left Since a node that was at one point marked as dead, either via the --ignore-dead-nodes parameter or by being a target of removenode or replace, can no longer be made "undead", we need to make sure that it cannot rejoin the cluster. Do that by banning such nodes on the messaging layer, just like we do for nodes that have left. Note that the removenode failure test had to be altered since it restarted a node after a removenode failure (which will no longer work). Also, since the check for liveness was removed from the topology coordinator (because the node is already banned by then), the test case that triggers the removed code is removed as well.	2024-02-14 10:35:06 +02:00
Piotr Dulikowski	4d4976feb0	test/topology_custom: upgrade/recovery tests for topology on raft Adds three tests for the new upgrade procedure: - test_topology_upgrade - upgrades a cluster operating in legacy mode to use raft topology operations, - test_topology_recovery_basic - performs recovery on a three-node cluster, no node removal is done, - test_topology_majority_loss - simulates a majority loss scenario, i.e. removes two nodes out of three, performs recovery to rebuild the raft topology state and re-adds the two nodes.	2024-02-08 19:12:28 +01:00
Patryk Jędrzejczak	25b90f5554	raft topology: make rollback_to_normal a transition state After changing `left_token_ring` from a node state to a transition state in scylladb/scylladb#17009, we do the same for `rollback_to_normal`. `rollback_to_normal` was created as a node state because `left_token_ring` was a node state. This change will allow us to distinguish a failed removenode from a failed decommission in the `rollback_to_normal` handler. Currently, we use the same logic for both of them, so it's not required. However, this might change, as it has happened with the decommission and the failed bootstrap/replace in the `left_token_ring` state (scylladb/scylladb#16797). We are making this change now because it would be much harder after branching. The change also simplifies the code in `topology_coordinator:rollback_current_topology_op`. Moving the `rollback_to_normal` handler from `handle_node_transition` to `handle_topology_transition` created a large diff. There is only one change - adding `auto node = get_node_to_work_on(std::move(guard));`.	2024-02-02 16:55:20 +01:00
Kamil Braun	bb22e06a9e	Merge 'abort failed rebuild instead of retrying it forever' from Gleb Add error handling to rebuild instead of retrying it until succeeds. * 'gleb/rebuild-fail-v2' of github.com:scylladb/scylla-dev: test: add test for rebuild failure test: add expected_error to rebuild_node operation topology_coordinator: Propagate rebuild failure to the initiator	2024-01-31 10:07:28 +01:00
Kamil Braun	39339b9f70	test: topology/util: update comment for `reconnect_driver` The issues mentioned in the comment before are already fixed. Unfortunately, there is another, opposite issue which this function can be used for. The previous issue was about the existing driver session not reconnecting. The current issue is about the existing driver session reconnecting too much... (and in the middle of queries.)	2024-01-30 15:36:48 +01:00
Gleb Natapov	8b50613465	test: add test for rebuild failure	2024-01-30 11:04:19 +02:00
Patryk Jędrzejczak	b0eef50b2e	raft topology: make left_token_ring a transition state A node can be in the `left_token_ring` state after: - a finished decommission, - a failed bootstrap, - a failed replace. When a node is in the `left_token_ring` state, we don't know how it has ended up in this state. We cannot distinguish a node that has finished decommissioning from a node that has failed bootstrap. The main problem it causes is that we incorrectly send the `barrier_and_drain` command to a node that has failed bootstrapping or replacing. We must do it for a node that has finished decommissioning because it could still coordinate requests. However, since we cannot distinguish nodes in the `left_token_ring` state, we must send the command to all of them. This issue appeared in scylladb/scylladb#16797 and this patch is a follow-up that fixes it. The solution is changing `left_token_ring` from a node state to a transition state. Regarding implementation, most of the changes are simple refactoring. The less obvious are: - Before this patch, in `system_keyspace::left_topology_state`, we had to keep the ignored nodes' IDs for replace to ensure that the replacing node will have access to it after moving to the `left_token_ring` state, which happens when replace fails. We don't need this workaround anymore. When we enter the new `left_token_ring` transition state, the new node will still be in the `decommissioning` state, so it won't lose its request param. - Before this patch, a decommissioning node lost its tokens while moving to the `left_token_ring` state. After the patch, it loses tokens while still being in the `decommissioning` state. We ensure that all `decommissioning` handlers correctly handle a node that lost its tokens. Moving the `left_token_ring` handler from `handle_node_transition` to `handle_topology_transition` created a large diff. There are only three changes: - adding `auto node = get_node_to_work_on(std::move(guard));`, - adding `builder.del_transition_state()`, - changing error logged when `global_token_metadata_barrier` fails.	2024-01-29 10:39:07 +01:00
Mikołaj Grzebieluch	ca35e352f5	test.py: generalize usage of cluster_con Add option to pass load_balancing policy. Change hosts type to list of IPs or cassandra.Endpoint.	2024-01-25 15:27:53 +01:00
Kamil Braun	71957b4320	storage_service: separate logger for raft topology Allows selectively enabling higher logging levels for just raft-topology related things, without doing it for the entire storage_service (which includes things like gossiper callbacks). Also gets rid of the redundant "raft topology:" prefix which was also not included everywhere.	2024-01-18 11:24:14 +01:00
Pavel Emelyanov	ae7da54f88	test: Move test_tablet_drain_failure_during_decommission to another suite In its current location it will be started with 3 pre-created scylla nodes with default features ON. Next patch will exclude `tablets` from the default list, so the test needs to create servers on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	941f6d8fca	cql: Move initial_tablets from REPLICATION to TABLETS in DDL This patch changes the syntax of enabling tablets from CREATE KEYSPACE ... WITH REPLICATION = { ..., 'initial_tablets': <int> } to be CREATE KEYSPACE ... WITH TABLETS = { 'initial': <int> } and updates all tests accordingly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:04:48 +03:00
Gleb Natapov	f8b90aeb14	test: add test for automatic cleanup procedure The test runs two bootstraps and checks that there is no cleanup in between. Then it runs a decommission and checks that cleanup runs automatically and then it runs one more decommission and checks that no cleanup runs again. Second part checks manual cleanup triggering. It adds a node, triggers cleanup through the REST API, checks that it runs, decommissions a node and checks that the cleanup did not run again.	2024-01-14 15:45:53 +02:00
Gleb Natapov	5882855669	test: add test for topology requests queue management This test creates a 5 node cluster with 2 down nodes (A and B). After that it creates a queue of 3 topology operations: bootstrap, removenode A and removenode B with ignore_nodes=A. Check that all operations manage to complete. Then it downs one node and creates a queue with two requests: bootstrap and decommission. Since none can proceed both should be canceled.	2024-01-14 15:45:53 +02:00
Gleb Natapov	0e68073b22	test: use servers_see_each_other when needed In the next patch we want to abort topology operations if there are not enough live nodes to perform them. This will break tests that do a topology operation right after restarting a node since a topology coordinator may still not see the restarted node as alive. Fix all those tests to wait between restart and a topology operation until UP state propagates.	2024-01-14 14:44:07 +02:00
Petr Gusev	6e7bbc94f4	gossiper_state_change_subscriber_proxy: call sync_raft_topology_nodes When a node changes its IP we need to store the mapping in system.peers and update token_metadata.topology and erm in-memory data structures. The test_change_ip was improved to verify this new behaviour. Before this patch the test didn't check that IPs used for data requests are updated on IP change. In this commit we add the read/write check. It fails on insert with 'node unavailable' error without the fix.	2024-01-12 18:28:57 +04:00
Kamil Braun	d4f4b58f3a	Merge 'topology_coordinator: reject removenode if the removed node is alive' from Patryk Jędrzejczak The removenode operation is defined to succeed only if the node being removed is dead. Currently, we reject this operation on the initiator side (in `storage_service::raft_removenode`) when the failure detector considers the node being removed alive. However, it is possible that even if the initiator considers the node dead, the topology coordinator will consider it alive when handling the topology request. For example, the topology coordinator can use a bigger failure detector timeout, or the node being removed can suddenly resurrect. This PR makes the topology coordinator reject removenode if the node being removed is considered alive. It also adds `test_remove_alive_node` that verifies this change. Fixes scylladb/scylladb#16109 Closes scylladb/scylladb#16584 * github.com:scylladb/scylladb: test: add test_remove_alive_node topology_coordinator: reject removenode if the removed node is alive test: ManagerClient: remove unused wait_for_host_down test: remove_node: wait until the node being removed is dead	2024-01-08 12:39:23 +01:00
Patryk Jędrzejczak	7038a033f2	test: remove_node: wait until the node being removed is dead In the following commits, we make the topology coordinator reject removenode requests if the node being removed is considered alive by the gossiper. Before making this change, we need to adapt the testing framework so that we don't have flaky removenode operations that fail because the node being removed hasn't been marked as dead yet. We achieve this by waiting until all other running nodes see the node being removed as dead in all removenode operations. Some tests are simplified after this change because they don't have to call server_not_sees_other_server anymore.	2023-12-29 17:12:45 +01:00
Gleb Natapov	74d17719db	test: add test to check failure handling in cdc generation commit	2023-12-26 16:01:34 +02:00


195 Commits