scylladb

Author	SHA1	Message	Date
Aleksandra Martyniuk	76174d1f7a	cql3: reject ALTER KEYSPACE if rf of datacenter with tablets is omitted In ALTER KEYSPACE, when a datacenter name is omitted, its replication factor is implicitly set to zero with vnodes, while with tablets, it remains unchanged. ALTER KEYSPACE should behave the same way for tablets as it does for vnodes. However, this can be dangerous as we may mistakenly drop the whole datacenter. Reject ALTER KEYSPACE if it changes replication factor, but omits a datacenter that currently contains tablet replicas. Fixes: https://github.com/scylladb/scylladb/issues/25549. Closes scylladb/scylladb#25731	2025-11-24 06:36:51 +02:00
Tomasz Grabiec	87492d3073	test: py: Test scenario involving excludenode API	2025-10-31 09:03:20 +01:00
Avi Kivity	04a289cae6	Merge 'Auto expand to rack list' from Tomasz Grabiec We want to move towards rack-list based replication factor for tablets being the default mode, and in the future the only supported mode. This PR is a step towards that. We auto-expand numeric RF to rack list on keyspace creation and ALTER when rf_rack_valid_keyspaces option is enabled. The PR is mostly about adjusting tests. The main logic change is in the last patch, which modifies option post-processing in ks_prop_defs. Fixes #26397 Closes scylladb/scylladb#26692 * github.com:scylladb/scylladb: cql3: ks_prop_defs: Expand numeric RF to rack list locator: Move rack_list to topology.hh alternator: Do not set RF for zero-token DCs alternator: Switch keyspace creation to use ks_prop_defs test: alternator: Adjust for rack lists cql3: Move validation of invalid ALTER KEYSPACE earlier, to ks_prop_defs test: cqlpy: Mark tests using rack lists as scylla-only test: Switch to rack-list based RF test: Generalize tests to work with both numeric RF and rack lists test: cluster: test_zero_token_nodes_multidc: Adjust to rack list RF test: Prepare for handling errors specific to rack list path test: cluster: dtest: alternator: Force RF=1 in test_putitem_contention test: Create cluster with multiple racks in multi-dc setups test: boost: network_topology_strategy_test: Adjust to rack-list RF test: tablets: Adjust to rack list test: cluster: test_group0_schema_versioning: Use smaller RF to respect rf-rack-validness test: tablets_test: Convert test_per_shard_goal_mixed_dc_rf to be rack-valid test: object_store: test_backup: Adjust for rack lists test: cluster: tablets: Do not move tablet across racks in test_tablet_transition_sanity test: cluster: mv: Do not move tablets across racks test: cluster: util: Fix docstring for parse_replication_options() tablets, topology_coordinator: Skip tablet draining on replace	2025-10-30 21:54:08 +02:00
Tomasz Grabiec	ba53f41f59	test: Switch to rack-list based RF Have to do that before we enable auto-expansion of numeric RF to rack-lists, because those tests alter the replication factor, and altering from rack-list to numeric will not be allowed.	2025-10-29 23:32:58 +01:00
Dawid Mędrek	48cbf6b37a	test/cluster/test_tablets: Migrate dtest We migrate `tablets_test.py::TestTablets::test_moving_tablets_replica_on_node` from dtests to the repository of Scylla. We divide the test into two steps to make testing easier and even possible with RF-rack-valid keyspaces being enforced. Closes scylladb/scylladb#26285	2025-10-29 11:09:48 +02:00
Pavel Emelyanov	948cefa5f9	test: Extend API consistency test with tokens_endpoint endpoint Recently (#26231) a test was added to check that several API endpoints that return tokens and their corresponding replica nodes are consistent with the tablet map. This patch adds one more API endpoint to the validation -- the /storage_service/tokens_endpoint one. The extension is pretty straightforward, but the new endpoint returns a single (primary) replica for a token, so the test check is slightly modified to account for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#26580	2025-10-28 15:18:09 +02:00
Petr Gusev	e1667afa50	topology_coordinator: fix log message	2025-10-22 11:32:37 +02:00
Andrei Chekun	24d17c3ce5	test.py: rewrite the wait_for_first_completed Rewrite wait_for_first_completed to return only the first completed task, with a guarantee that all cancelled and finished tasks are awaited. Use wait_for_first_completed to avoid falsely passing tests in the future and issues like #26148. Use gather_safely to await tasks and remove the warning that a coroutine was not awaited. Closes scylladb/scylladb#26435	2025-10-22 01:13:43 +03:00
Aleksandra Martyniuk	0e73ce202e	test: wait for cql in test_two_tablets_concurrent_repair_and_migration_repair_writer_level In test_two_tablets_concurrent_repair_and_migration_repair_writer_level safe_rolling_restart returns ready cql. However, get_all_tablet_replicas uses the cql reference from manager that isn't ready. Wait for cql. Fixes: #26328 Closes scylladb/scylladb#26349	2025-10-02 06:41:36 +03:00
Botond Dénes	efd99bb0af	Merge 'Return tablet ranges from range_to_endpoint_map API' from Pavel Emelyanov The handler in question when called for tablets-enabled keyspace, returns ranges that are inconsistent with those from system.tablets. Like this: system.tablets: ``` TabletReplicas(last_token=-4611686018427387905, replicas=[('e43ce450-2834-4137-92b7-379bb37684d1', 0), ('67c82fc2-8ef9-4dd9-8cf6-c7f9372ce207', 0)]) TabletReplicas(last_token=-1, replicas=[('22c84cba-d8d0-4d20-8d46-eb90865bb612', 0), ('67c82fc2-8ef9-4dd9-8cf6-c7f9372ce207', 1)]) TabletReplicas(last_token=4611686018427387903, replicas=[('22c84cba-d8d0-4d20-8d46-eb90865bb612', 1), ('67c82fc2-8ef9-4dd9-8cf6-c7f9372ce207', 1)]) TabletReplicas(last_token=9223372036854775807, replicas=[('e43ce450-2834-4137-92b7-379bb37684d1', 1), ('22c84cba-d8d0-4d20-8d46-eb90865bb612', 0)]) ``` range_to_endpoint_map: ``` {'key': ['-9069053676502949657', '-8925522303269734226'], 'value': ['127.110.40.2', '127.110.40.3']} {'key': ['-8925522303269734226', '-8868737574445419305'], 'value': ['127.110.40.2', '127.110.40.3']} ... {'key': ['-337928553869203886', '-288500562444694340'], 'value': ['127.110.40.1', '127.110.40.3']} {'key': ['-288500562444694340', '105026475358661740'], 'value': ['127.110.40.1', '127.110.40.3']} {'key': ['105026475358661740', '611365860935890281'], 'value': ['127.110.40.1', '127.110.40.3']} ... {'key': ['8307064440200319556', '9117218379311179096'], 'value': ['127.110.40.2', '127.110.40.1']} {'key': ['9117218379311179096', '9125431458286674075'], 'value': ['127.110.40.2', '127.110.40.1']} ``` Not only the number of ranges differs, but also separating tokens do not match (e.g. tokens -2 and 0 belong to different tablets according to system.tablets, but fall into the same "range" in the API result). The source of confusion is that despite storage_service::get_range_to_address_map() is given correct e.r.m. pointer from the table, it still uses token_metadata::sorted_token() to work with. 
The fix is -- when the e.r.m. is per-table, the tokens should be taken from token_metadata's tablet_map (e.g. compare this to storage_service::effective_ownership() -- it grabs tokens differently for vnodes/tables cases). This PR fixes the mentioned problem and adds a validation test. The test also checks the /storage_service/describe_ring endpoint, which happens to return the correct set of values. The API is very ancient, so the bug is present in all versions with tablets. Fixes #26331 Closes scylladb/scylladb#26231 * github.com:scylladb/scylladb: test: Add validation of data returned by /storage_service endpoints test,lib: Add range_to_endpoint_map() method to rest client api: Indentation fix after previous patches storage_service: Get tablet tokens if e.r.m. is per-table storage_service,api: Get e.r.m. inside get_range_to_address_map() storage_service: Calculate tokens on stack	2025-09-30 11:20:35 +03:00
Michał Chojnowski	2ed2033224	test: in Python tests, prepare some sstable filename regexes for `ms`	2025-09-29 22:15:25 +02:00
Pavel Emelyanov	b30c8a1f25	test: Add validation of data returned by /storage_service endpoints The test compares the ranges that are returned from /describe_ring and /range_to_endpoint_map with the information obtained from system.tablets and makes sure that - the number of ranges - the boundary tokens - the target replicas (nodes only) match. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-09-25 14:53:22 +03:00
Aleksandra Martyniuk	48bbe09c8b	test: fix test_two_tablets_concurrent_repair_and_migration_repair_writer_level test_two_tablets_concurrent_repair_and_migration_repair_writer_level waits for the first node that logs info about repair_writer using asyncio.wait. The done group is never awaited, so we never learn about the error. The test itself is incorrect and the log about repair_writer is never printed. We never learn about that, and the test finishes successfully after a 10-minute timeout. Fix the test: - disable hinted handoff; - repair tablets of the whole table; - new table is added so that concurrent migration is possible; - use wait_for_first_completed that awaits done group; - do some cleanups. Remove nightly mark. Fixes: #26148. Closes scylladb/scylladb#26209	2025-09-24 06:40:45 +03:00
Tomasz Grabiec	c9f0a9d0eb	tablets: scheduler: Balance racks separately when rf_rack_valid_keyspaces is true Greatly improves performance of plan making, because we don't consider candidates in other racks, most of which will fail to be selected due to replication constraints (no rack overload). Also (but minor) reduces the overhead of candidate evaluation, as we don't have to evaluate rack load. Enabled only for rf_rack_valid_keyspaces because such setups guarantee that we will not need (because we must not) move tablets across racks, and we don't need to execute the general algorithm for the whole DC. Tested with perf-load-balancing, which performs a single scale-out operation on a cluster which initially has 10 nodes 88 shards each, 2 racks, RF=2, 70 tables, 256 tablets per table. Scale out adds 6 new nodes (same shard count). Time to rebalance the cluster (plan making only, sum of all iterations, no streaming): Before: 16 min 25 s After: 0 min 25 s Before, plan making cost (single incremental iteration) alternated between fast (0.1 [s]) and slow (14.1 [s]): Rebalance iteration 7 took 14.156 [s]: mig=88, bad=88, first_bad=17741, eval=93874484, skiplist=0, skip: (load=0, rack=17653, node=0) Rebalance iteration 8 took 0.143 [s]: mig=88, bad=88, first_bad=88, eval=865407, skiplist=0, skip: (load=0, rack=0, node=0) The slow run chose min and max nodes in different racks, hence the fast path failed to find any candidates and we switched to exhaustive search of candidates in other nodes. After, all iterations are fast (0.1 [s] per rack, 0.2 [s] per plan-making). The plan is twice as large because it combines the output of two subsequent (pre-patch) plan-making calls. Fixes #26016	2025-09-23 00:30:37 +02:00
Botond Dénes	bde7d8ddbd	Merge 'service: pass current session_id to repair rpc' from Aleksandra Martyniuk Currently, in repair_tablet we retrieve session_id from tablet map (and throw if it isn't specified). In case of topology coordinator failover, we may end up in a situation where a node runs outdated repair, treating session of a different operation as the repair's session: - topology coordinator starts repair transition (A); - topology coordinator sends tablet repair rpc to node1; - topology coordinator is separated from the cluster; - new topology coordinator is elected; - new topology coordinator sees waiting repair request (A_2) and executes it; - new repair of the same tablet is requested (B); - new topology coordinator starts repair transition (B); - new topology coordinator sends tablet repair rpc to node2; - node2 starts repair (B) as repair master; - node1 starts repair (A), checks the current session (B), proceeds with repair (B) as repair master. Send current session_id in repair_tablet rpc. If this session_id and session id got from tablet map don't match, an exception is thrown. Fixes: https://github.com/scylladb/scylladb/issues/23318. No backport; changes in rpc signatures Closes scylladb/scylladb#25369 * github.com:scylladb/scylladb: test: check that repair with outdated session_id fails service: pass current session_id to repair rpc	2025-09-17 17:28:35 +03:00
Tomasz Grabiec	a7f10b585e	Merge 'drop table: fix crash on drop table with concurrent cleanup' from Ferenc Szili Consider the following scenario: - A tablet is migrated away from a shard - The tablet cleanup stage closes the storage group's async_gate - A drop table runs truncate which attempts to disable compaction on the tablet with its gate closed. This fails, because table::parallel_foreach_compaction_group() ultimately calls storage_group_manager::parallel_foreach_storage_group() which will not disable compaction if it can't hold the storage group's gate - Truncate calls table::discard_sstables() which checks if the compaction has been disabled, and because it hasn't, it then runs on_internal_error() with "compaction not disabled on table ks.cf during TRUNCATE" which causes a crash Fixes: #25706 This needs to be backported to all supported versions with tablets Closes scylladb/scylladb#25708 * github.com:scylladb/scylladb: test: reproducer and test for drop with concurrent cleanup truncate: check for closed storage group's gate in discard_sstables	2025-09-02 00:02:14 +02:00
Aleksandra Martyniuk	33a547e740	test: check that repair with outdated session_id fails	2025-08-29 17:00:48 +02:00
Ferenc Szili	1b8a44af75	test: reproducer and test for drop with concurrent cleanup This change adds a reproducer and test for issue #25706	2025-08-28 16:51:36 +02:00
Michał Jadwiszczak	cf138da853	test: adjust existing tests - Disable tablets in `test_migration_on_existing_raft_topology`. Because views on tablets are experimental now, we can safely assume that view building coordinator will start with view build status on raft. - Add error injection to pause view building on worker. Used to pause the view building process; there is an analogous error injection in view_builder. - Do a read barrier in `test_view_in_system_tables`. Increases test stability by making sure that the node sees up-to-date group0 state and `system.built_views` is synced. - Wait for the view to be built in some tests. Increases test stability by making sure that the view is built. - Remove xfail marker from `test_tablet_streaming_with_unbuilt_view`. This series fixes https://github.com/scylladb/scylladb/issues/21564 and this test should work now.	2025-08-27 10:23:04 +02:00
Artsiom Mishuta	4b975668f6	tiering (test.py): introduce tiering labels Introduce tiering marks: 1 “unstable” - for unstable tests that will continue running every night and generate up-to-date failure statistics without failing the “Main” verification path (scylla-ci, Next); 2 “nightly” - for tests that are quite old, stable, and test functionality that is unlikely to be changed or affected by other features, are partially covered in other tests, verify non-critical functionality, have not found any issues or regressions, are too long to run on every PR, and can be popped out from the CI run. Set 7 long tests (according to statistics in elastic) as nightly (these 8 tests took 20% of the CI run, about 4 hours without parallelization) and 1 test as unstable (as an example of marker usage). Closes scylladb/scylladb#24974	2025-08-04 15:38:16 +03:00
Taras Veretilnyk	1d6808aec4	topology_coordinator: Make tablet_load_stats_refresh_interval configurable This commit introduces a config option 'tablet_load_stats_refresh_interval_in_seconds' that allows overriding the default value without using error injection. Fixes scylladb/scylladb#24641 Closes scylladb/scylladb#24746	2025-07-31 14:31:55 +03:00
Avi Kivity	f7324a44a2	compaction: demote normal compaction start/end log messages to debug level Compaction is routine and the log messages pollute the log files, hiding important information. All the data is available via `nodetool compactionhistory`. Reduce noise by demoting those log messages to debug level. One test is adjusted to use debug level for compaction, since it listens for those messages. Closes scylladb/scylladb#24949	2025-07-29 08:02:22 +03:00
Aleksandra Martyniuk	a0031ad05e	api: repair_async: forbid repairing tablet keyspaces Return 403 Forbidden if a user tries to repair tablet keyspace with /storage_service/repair_async/ API.	2025-07-24 11:11:09 +02:00
Aleksandra Martyniuk	83c9af9670	test: add test for repair and resize finalization Add test that checks whether repair does not start if there is an ongoing resize finalization.	2025-06-11 16:17:39 +02:00
Evgeniy Naydanov	f6e3fdd778	test.py: rework log_browsing for dtest migration Rework `ScyllaLogFile.wait_for()` method to make it easier to add required methods to ScyllaNode class of ccm-like shim. Also, added `ScyllaLogFile.grep_for_errors()` method and reworked `ScyllaLogFile.grep()`	2025-05-19 11:50:55 +00:00
Dawid Mędrek	c4b32c38a3	test/cluster: Disable rf_rack_valid_keyspaces in problematic tests Some of the tests in the test suite have proven to be more problematic in adjusting to RF-rack-validity. Since we'd like to run as many tests as possible with the `rf_rack_valid_keyspaces` configuration option enabled, let's disable it in those. In the following commit, we'll enable it by default.	2025-05-10 16:30:49 +02:00
Dawid Mędrek	c8c28dae92	test/cluster/test_tablets: Divide rack into two to adjust tests to RF-rack-validity Three tests in the file use a multi-DC cluster. Unfortunately, they put all of the nodes in a DC in the same rack and because of that, they fail when run with the `rf_rack_valid_keyspaces` configuration option enabled. Since the tests revolve mostly around zero-token nodes and how they affect replication in a keyspace, this change should have zero impact on them.	2025-05-10 16:30:46 +02:00
Dawid Mędrek	04567c28a3	test/cluster/test_tablets: Adjust test_tablet_rf_change to RF-rack-validity We reduce the number of nodes and the RF values used in the test to make sure that the test can be run with the `rf_rack_valid_keyspaces` configuration option. The test doesn't seem to be reliant on the exact number of nodes, so the reduction should not make any difference.	2025-05-10 16:30:43 +02:00
Dawid Mędrek	dbb8835fdf	test/cluster: Adjust simple tests to RF-rack-validity We adjust all of the simple cases of cluster tests so they work with `rf_rack_valid_keyspaces: true`. It boils down to assigning nodes to multiple racks. For most of the changes, we do that by: * Using `pytest.mark.prepare_3_racks_cluster` instead of `pytest.mark.prepare_3_nodes_cluster`. * Using an additional argument -- `auto_rack_dc` -- when calling `ManagerClient::servers_add()`. In some cases, we need to assign the racks manually, which may be less obvious, but in every such situation, the tests didn't rely on that assignment, so that doesn't affect them or what they verify.	2025-05-10 16:30:18 +02:00
Aleksandra Martyniuk	76cd707b18	test: test_tablets: wait for cql Wait for cql after rolling restart in test_two_tablets_concurrent_repair_and_migration_repair_writer_level to prevent failing queries. Fixes: #23620. Closes scylladb/scylladb#23796	2025-04-24 21:25:29 +03:00
Tomasz Grabiec	001d3b2415	Merge 'storage_service: preserve state of busy topology when transiting tablet' from Łukasz Paszkowski Commit `876478b84f` ("storage_service: allow concurrent tablet migration in tablets/move API", 2024-02-08) introduced a code path on which the topology state machine would be busy -- in "tablet_draining" or "tablet_migration" state -- at the time of starting tablet migration. The pre-commit code would unconditionally transition the topology to "tablet_migration" state, assuming the topology had been idle previously. On the new code path, this state change would be idempotent if the topology state machine had been busy in "tablet_migration", but the state change would incorrectly overwrite the "tablet_draining" state otherwise. Restrict the state change to when the topology state machine is idle. In addition, add the topology update to the "updates" vector with plain push_back(). emplace_back() is not helpful here, as topology_mutation_builder::build() cannot construct in-place, and so we invoke the "canonical_mutation" move constructor once, either way. Unit test: Start a two node cluster. Create a single tablet on one of the nodes. Start decommissioning that node, but block decommissioning at once. In that state (i.e., in "tablet_draining"), move the tablet manually to the other node. Check that transit_tablet() leaves the topology transition state alone. Fixes https://github.com/scylladb/scylladb/issues/20073. Commit `876478b84f` was first released in scylla-6.0.0, so we might want to backport this patch accordingly. Closes scylladb/scylladb#23751 * github.com:scylladb/scylladb: storage_service: add unit test for mid-decommission transit_tablet() storage_service: preserve state of busy topology when transiting tablet	2025-04-16 00:19:24 +02:00
Laszlo Ersek	841ca652a0	storage_service: add unit test for mid-decommission transit_tablet() Start a two node cluster. Create a single tablet on one of the nodes. Start decommissioning that node, but block decommissioning at once. In that state (i.e., in "tablet_draining"), move the tablet manually to the other node. Check that transit_tablet() leaves the topology transition state alone. Signed-off-by: Laszlo Ersek <laszlo.ersek@scylladb.com>	2025-04-15 15:15:25 +02:00
Dawid Mędrek	a59842257a	test: Move test_alter_tablet_keyspace_rf to cluster suite We move the test `test_alter_tablet_keyspace_rf` from the cqlpy to the cluster test suite. The reason behind the change is that the test cannot be run with `rf_rack_valid_keyspaces` turned on in the configuration. During the test, we make the keyspace RF-rack-invalid multiple times. Since RF-rack-validity is a very strong constraint, adjusting the test otherwise is impossible. By moving it to the cluster test suite, we're able to change the configuration of the node used in the test, and so the test can work again.	2025-04-11 14:55:11 +02:00
Dawid Mędrek	0ed21d9cc1	test/cluster/test_tablets.py: Fix erroneous test indentation Some of the statements in the test are not indented properly and, as a result, are never run. It's most likely a small mistake, so let's fix it. Closes scylladb/scylladb#23659	2025-04-10 11:06:01 +03:00
Aleksandra Martyniuk	372b562f5e	test: add test for rebuild with repair	2025-04-08 10:42:02 +02:00
Botond Dénes	fcdae20fd1	Merge 'Add tablet enforcing option' from Benny Halevy This series adds a new config option: `tablets_mode_for_new_keyspaces` that replaces the existing `enable_tablets` option. It can be set to the following values: disabled: New keyspaces use vnodes by default, unless enabled by the tablets={'enabled':true} option enabled: New keyspaces use tablets by default, unless disabled by the tablets={'disabled':true} option enforced: New keyspaces must use tablets. Tablets cannot be disabled using the CREATE KEYSPACE option `tablets_mode_for_new_keyspaces=disabled` or `tablets_mode_for_new_keyspaces=enabled` control whether tablets are disabled or enabled by default for new keyspaces, respectively. In either case, tablets can be opted in or out using the `tablets={'enabled':...}` keyspace option, when the keyspace is created. `tablets_mode_for_new_keyspaces=enforced` enables tablets by default for new keyspaces, like `tablets_mode_for_new_keyspaces=enabled`. However, it does not allow opting out when creating new keyspaces by setting `tablets = {'enabled': false}` Refs scylladb/scylla-enterprise#4355 * Requires backport to 2025.1 Closes scylladb/scylladb#22273 * github.com:scylladb/scylladb: boost/tablets_test: verify failure to create keyspace with tablets and non network replication strategy tablets: enforce tablets using tablets_mode_for_new_keyspaces=enforced config option db/config: add tablets_mode_for_new_keyspaces option	2025-04-03 16:32:19 +03:00
Aleksandra Martyniuk	bae6711809	test: add test to check concurrent migration and repair of two different tablets	2025-04-02 15:30:17 +02:00
Lakshmi Narayanan Sreethar	5b47d84399	topology_coordinator: do not schedule migrations when there are pending resize finalizations Resize finalization is executed in a separate topology transition state, `tablet_resize_finalization`, to ensure it does not overlap with tablet transitions. The topology transitions into the `tablet_resize_finalization` state only when no tablet migrations are scheduled or being executed. If there is a large load-balancing backlog, split finalization might be delayed indefinitely, leaving the tables with large tablets. To fix this, do not schedule tablet migrations on any tables when there are pending resize finalizations. This ensures that migrations from the same table and other unrelated tables do not block resize finalization. Also added a testcase to verify the fix. Fixes #21762 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-03-27 10:16:34 +05:30
Benny Halevy	c62865df90	db/config: add tablets_mode_for_new_keyspaces option The new option deprecates the existing `enable_tablets` option. It will be extended in the next patch with a 3rd value: "enforced", which will enable tablets by default for new keyspaces but without the possibility to opt out using the `tablets = {'enabled': false}` keyspace schema option. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-03-24 14:54:45 +02:00
Aleksandra Martyniuk	20f9d7b6eb	test: add test to check concurrent tablets migration and repair Add a test to check whether a tablet can be migrated while another tablet is repaired.	2025-03-17 10:37:03 +01:00
Artsiom Mishuta	d1198f8318	test.py: rename topology_custom folder to cluster rename topology_custom folder to cluster as it contains not only topology test cases	2025-03-04 10:32:44 +01:00

41 Commits