scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-23 00:02:37 +00:00

Author	SHA1	Message	Date
Michał Jadwiszczak	5c84cff78a	test/cluster/mv/test_mv_building: add similar test for vnode-based view In the dtest repo, the test runs for both vnode and tablet based views. Since in test.py infra we're using error injection to pause the view building process, we need separate tests for those two cases.	2026-05-14 10:52:44 +02:00
Wojciech Mitros	f3cf20803b	test: run test_mv_admission_control_exception on one shard In the test we perform 2 consecutive writes where the first write is supposed to increase the view update backlog above the mv admission control threshold and the second one is expected to be rejected because of that. On each node/shard we have 2 types of view update backlogs: 1. for deciding whether we should admit writes 2. for propagating the backlog information to other nodes/shards. For the second write to be rejected, it must be performed on a node and shard which updated its backlog of type 1. The view update backlog of type 2. is immediately increased on the base table replica. For this backlog to be registered as a backlog of type 1., it needs to be either carried by gossip (happening once every second) or by attaching it to a replica write response. We don't want to increase the runtime of tests unnecessarily, so we don't wait and we rely on the second mechanism. The response to the first base table write (the one causing increase in the backlog) carries the increased backlog to the coordinator of this write. So for the second write to observe the increased backlog, it needs to be coordinated on the same node+shard as the first write. We make sure that both writes are coordinated on the same node+shard by using prepared statements combined with setting the host in `run_async`. Both writes target the same partition and with prepared statements we route them directly to the correct shard. That was the idea, at least. In practice, for the driver to learn the correct shard, it first needs to learn the token->shard mapping from the server. For vnodes it can expect a shard by calculating the token of the affected partition, but for tablets, it had no opportunity to learn the tablet->shard mapping so the first write may route to any shard. Additionally, we aren't guaranteed that the driver established connections to all shards on all nodes at the point of any write. 
So if a connection finishes establishing between the two writes, this may also cause us to coordinate these 2 writes on different shards, leading to a missed view backlog growth and not-rejected second write. We fix this in this patch by running the test using one shard on each node. This way, as long as we perform both writes on the same node, they'll also be coordinated on the same shard. This also makes the prepared statement and BoundStatement unnecessary — we can use SimpleStatement with FallthroughRetryPolicy directly. Fixes: SCYLLADB-1901 Closes scylladb/scylladb#29862	2026-05-12 17:34:19 +02:00
Wojciech Mitros	ab12083525	test: propagate view update backlog before partition delete In the test_delete_partition_rows_from_table_with_mv case we perform a deletion of a large partition to verify that the deletion will self-throttle when generating many view updates. Before the deletion, we first build the materialized view, which causes the view update backlog to grow. The backlog should be back to empty when the view building finishes, and we do wait for that to happen, but the information about the backlog drop may not be propagated to the delete coordinator in time - the gossip interval is 1s and we perform no other writes between the nodes in the meantime, so we don't make use of the "piggyback" mechanism of propagating view backlog either. If the coordinator thinks that the backlog is high on the replica, it may reject the delete, failing this test. We change this in this patch - after the view is built, we perform an extra write from the coordinator. When the write finishes, the coordinator will have the up-to-date view backlog and can proceed with the DELETE. Additionally, we enable the "update_backlog_immediately" injection, which makes the node backlog (the highest backlog across shards) update immediately after each change. Fixes: SCYLLADB-1795 Closes scylladb/scylladb#29775	2026-05-07 11:33:13 +03:00
Michael Litvak	3468e8de8b	test/mv/test_mv_staging: wait for cql after restart Wait for cql on all hosts after restarting a server in the test. The problem that was observed is that the test restarts servers[1] and doesn't wait for the cql to be ready on it. On test teardown it drops the keyspace, trying to execute it on the host that is not ready, and fails. Fixes SCYLLADB-1632 Closes scylladb/scylladb#29562	2026-04-23 12:40:19 +02:00
Botond Dénes	eb3326b417	Merge 'test.py: migrate all bare skips to typed skip markers' from Artsiom Mishuta should be merged after #29235 Complete the typed skip markers migration started in the plugin PR. Every bare `@pytest.mark.skip` decorator and `pytest.skip()` runtime call across the test suite is replaced with a typed equivalent, making skip reasons machine-readable in JUnit XML and Allure reports. 62 files changed across 8 commits, covering ~127 skip sites in total. Bare `pytest.skip` provides only a free-text reason string. CI dashboards (JUnit, Allure) cannot distinguish between a test skipped due to a known bug, a missing feature, a slow test, or an environment limitation. This makes it hard to track skip debt, prioritize fixes, or filter dashboards by skip category. The typed markers (`skip_bug`, `skip_not_implemented`, `skip_slow`, `skip_env`) introduced by the `skip_reason_plugin` solve this by embedding a `skip_type` field into every skip report entry. \| Type \| Count \| Files \| Description \| \|------\|-------\|-------\|-------------\| \| `skip_bug` \| 24 \| 16 \| Skip reason references a known bug/issue \| \| `skip_not_implemented` \| 10 \| 5 \| Feature not yet implemented in Scylla \| \| `skip_slow` \| 4 \| 3 \| Test too slow for regular CI runs \| \| `skip_not_implemented` (bare) \| 2 \| 1 \| Bare `@pytest.mark.skip` with no reason (COMPACT STORAGE, #3882) \| \| Type \| Count \| Files \| Description \| \|------\|-------\|-------\|-------------\| \| `skip_env` \| ~85 \| 34 \| Feature/config/topology not available at runtime \| \| `skip_bug` \| 2 \| 2 \| Known bugs: Streams on tablets (#23838), coroutine task not found (#22501) \| - Comments: 7 comments/docstrings across 5 files updated from `pytest.skip()` to `skip()` - Plugin hardened: `warnings.warn()` → `pytest.UsageError` for bare `@pytest.mark.skip` at collection time — bare skips are now a hard error, not a warning - Guard tests: New `test/pylib_test/test_no_bare_skips.py` with 3 tests that 
prevent regression: - AST scan for bare `@pytest.mark.skip` decorators - AST scan for bare `pytest.skip()` runtime calls - Real `pytest --collect-only` against all Python test directories Runtime skip sites use the convenience wrappers from `test.pylib.skip_types`: ```python from test.pylib.skip_types import skip_env ``` Usage: ```python skip_env("Tablets not enabled") ``` 1. test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs — 24 decorator sites, 16 files 2. test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented — 10 decorator sites, 5 files 3. test: migrate @pytest.mark.skip to @pytest.mark.skip_slow — 4 decorator sites, 3 files 4. test: migrate bare @pytest.mark.skip to skip_not_implemented — 2 bare decorators, 1 file 5. test: migrate runtime pytest.skip() to typed skip_env() — ~85 sites, 34 files 6. test: migrate runtime pytest.skip() to typed skip_bug() — 2 sites, 2 files 7. test: update comments referencing pytest.skip() to skip() — 7 comments, 5 files 8. 
test/pylib: reject bare pytest.mark.skip and add codebase guards — plugin hardening + 3 guard tests - All 60 plugin + guard tests pass (`test/pylib_test/`) - No bare `@pytest.mark.skip` or `pytest.skip()` calls remain in the codebase - `pytest --collect-only` succeeds across all test directories with the hardened plugin SCYLLADB-1349 Closes scylladb/scylladb#29305 * github.com:scylladb/scylladb: test/alternator: replace bare pytest.skip() with typed skip helpers test: migrate new bare skips introduced by upstream after rebase test/pylib: reject bare pytest.mark.skip and add codebase guards test: update comments referencing pytest.skip() to skip_env() test: migrate runtime pytest.skip() to typed skip_bug() test: migrate runtime pytest.skip() to typed skip_env() test: migrate bare @pytest.mark.skip to skip_not_implemented test: migrate @pytest.mark.skip to @pytest.mark.skip_slow test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs	2026-04-22 15:48:27 +03:00
Tomasz Grabiec	cddde464ca	Merge 'service: Support adding/removing a datacenter with tablets by changing RF' from Aleksandra Martyniuk With this change, you can add or remove one or more DCs in a single ALTER KEYSPACE statement. It requires the keyspace to use rack list replication factor. In the existing approach, during an RF change all tablet replicas are rebuilt at once. This isn't the case now. In global_topology_request::keyspace_rf_change the request is added to ongoing_rf_changes - a new column in the system.topology table. In a new column in system_schema.keyspaces - next_replication - we keep the target RF. In make_rf_change_plan, the load balancer schedules necessary migrations, considering the load of nodes and other pending tablet transitions. Requests from ongoing_rf_changes are processed concurrently, independently from one another. In each request racks are processed concurrently. No tablet replica will be removed until all required replicas are added. While adding replicas to each rack we always start with base tables and won't proceed with views until they are done (while removing - the other way around). The intermediary steps aren't reflected in schema. When the RF change is finished: - in system_schema.keyspaces: - next_replication is cleared; - new keyspace properties are saved; - request is removed from ongoing_rf_changes; - the request is marked as done in system.topology_requests. Until the request is done, DESCRIBE KEYSPACE shows the replication_v2. If a request hasn't started to remove replicas, it can be aborted using task manager. system.topology_requests::error is set (but the request isn't marked as done) and next_replication = replication_v2. This will be interpreted by the load balancer, which will start the rollback of the request. After the rollback is done, we set the relevant system.topology_requests entry as done (failed), clear the request id from system.topology::ongoing_rf_changes, and remove next_replication. Fixes: SCYLLADB-567. 
No backport needed; new feature. Closes scylladb/scylladb#24421 * github.com:scylladb/scylladb: service: fix indentation docs: update documentation test: test multi RF changes service: tasks: allow aborting ongoing RF changes cql3: allow changing RF by more than one when adding or removing a DC service: handle multi_rf_change service: implement make_rf_change_plan service: add keyspace_rf_change_plan to migration_plan service: extend tablet_migration_info to handle rebuilds service: split update_node_load_on_migration service: rearrange keyspace_rf_change handler db: add columns to system_schema.keyspaces db: service: add ongoing_rf_changes to system.topology gms: add keyspace_multi_rf_change feature	2026-04-22 01:46:11 +02:00
Łukasz Paszkowski	d18eb9479f	cql/statement: Create keyspace_metadata with correct initial_tablets count In `ks_prop_defs::as_ks_metadata(...)` a default initial tablets count is set to 0, when tablets are enabled and the replication strategy is NetworkReplicationStrategy. This effectively sets _uses_tablets = false in abstract_replication_strategy for the remaining strategies when no `tablets = {...}` options are specified. As a consequence, it is possible to create vnode-based keyspaces even when tablets are enforced with `tablets_mode_for_new_keyspaces`. The patch sets a default initial tablets count to zero regardless of the chosen replication strategy. Then each of the replication strategy validates the options and raises a configuration exception when tablets are not supported. All tests are altered in the following way: + whenever it was correct, SimpleStrategy was replaced with NetworkTopologyStrategy + otherwise, tablets were explicitly disabled with ` AND tablets = {'enabled': false}` Fixes https://github.com/scylladb/scylladb/issues/25340 Closes scylladb/scylladb#25342	2026-04-20 17:57:38 +03:00
Artsiom Mishuta	465636bc53	test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs Migrate 24 @pytest.mark.skip decorator sites to @pytest.mark.skip_bug across 16 test files where the reason references a known bug or issue.	2026-04-19 11:06:30 +02:00
Aleksandra Martyniuk	38bad5f316	cql3: allow changing RF by more than one when adding or removing a DC rf_rack_valid_keyspaces relies on the fact that replicas of base table and mv are streamed concurrently. This is no longer true for newly introduced method of adding a DC. Disable rf_rack_valid_keyspaces in test_mv_first_replica_in_dc to force the old method.	2026-04-17 09:58:08 +02:00
Avi Kivity	0ae22a09d4	LICENSE: Update to version 1.1 Updated terms of non-commercial use (must be a never-customer).	2026-04-12 19:46:33 +03:00
Michael Litvak	399260a6c0	test: mv: fix flaky wait for commitlog sync Previously the test test_interrupt_view_build_shard_registration stopped the node ungracefully and used commitlog periodic mode to persist the view build progress in a not very reliable way. It can happen that due to timing issues, the view build progress is not persisted, or some of it is persisted in a different ordering than expected. To make the test more reliable we change it to stop the node gracefully, so the commitlog is persisted in a graceful and consistent way, without using the periodic mode delay. We need to also change the injection for the shutdown to not get stuck. Fixes SCYLLADB-1005 Closes scylladb/scylladb#29008	2026-03-19 10:41:21 +01:00
Piotr Dulikowski	a2669e9983	test: test_mv_merge_allowed: add mistakenly omitted awaits The test test_mv_merge_allowed asserts in two places that the tablet count is 2. It does so by calling an async function but, mistakenly, the returned coroutine was not awaited. The coroutine is, apparently, truthy so the assertions always passed. Fix the test to properly await the coroutines in the assertions. Fixes: SCYLLADB-905 Closes scylladb/scylladb#28875	2026-03-05 11:29:23 +01:00
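The failure mode described in a2669e9983 can be reproduced in isolation. The sketch below is not taken from the test: `tablet_count_is` is a hypothetical stand-in for an async check, and the hard-coded count of 4 is an assumption chosen so that the awaited check fails. It shows that a coroutine object is truthy when used without `await`, so an assertion on it passes regardless of the real result.

```python
import asyncio


async def tablet_count_is(expected: int) -> bool:
    """Hypothetical async check; a real helper would query the cluster."""
    await asyncio.sleep(0)
    actual = 4  # assumption for the demo: the table currently has 4 tablets
    return actual == expected


async def demo() -> tuple[bool, bool]:
    # Missing await: this is a coroutine object, not the check's result.
    pending = tablet_count_is(2)
    passes_without_await = bool(pending)  # objects are truthy by default
    pending.close()  # avoid the "coroutine was never awaited" warning

    # With await, the assertion would see the real (failing) result.
    passes_with_await = await tablet_count_is(2)
    return passes_without_await, passes_with_await


result = asyncio.run(demo())
print(result)  # (True, False)
```

Python emits a `RuntimeWarning` for coroutines that are never awaited, which is one way such omissions can be noticed in test output.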
Michael Litvak	8c4bc33e51	test: remove test_view_building_with_tablet_move Remove the test since it's not relevant anymore, it's not testing what it's supposed to test and it's unstable. The purpose of the test was to reproduce an issue in the legacy view builder where a view starts to build at token T2 and then all tokens [T1, end) with T1<T2 migrate to another node while it's still building, exposing an issue when the view builder wraps around the token ring. This is not relevant anymore because now view building with tablets is done via the view building coordinator for tablets, and all views start to build from the first token with no wraparound. Besides, the test is unstable due to relying too much on specific timing, which was useful for investigating and fixing the original issue but not anymore. Fixes SCYLLADB-842 Closes scylladb/scylladb#28842	2026-03-02 07:42:08 +01:00
Alex	5557770b59	test_mv_build_during_shutdown started two async CREATE MATERIALIZED VIEW operations and never awaited them (asyncio.gather(...) without await). This PR adds an await for each of the tasks, to wait for the MV schema to be added successfully before starting the server shutdown. With this change we will not get the shutdown races. Closes scylladb/scylladb#28774	2026-02-24 17:25:05 +01:00
Botond Dénes	b637e17b19	db/config: don't use RBNO for scaling Remove bootstrap and decommission from allowed_repair_based_node_ops. Using RBNO over streaming for these operations has no benefits, as they are not exposed to the out-of-date replica problem that replace, removenode and rebuild are. On top of that, RBNO is known to have problems with empty user tables. Using streaming for bootstrap and decommission is safe and faster than RBNO in all conditions, especially when the table is small. One test needs adjustment as it relies on RBNO being used for all node ops. Fixes: SCYLLADB-105 Closes scylladb/scylladb#28080	2026-02-19 09:51:09 +01:00
Gleb Natapov	08268eee3f	topology: disable force-gossip-topology-changes option The patch marks force-gossip-topology-changes as deprecated and removes tests that use it. There is one test (test_different_group0_ids) which is marked as xfail instead since it looks like gossiper mode was used there as a way to easily achieve a certain state, so more investigation is needed if the tests can be fixed to use raft mode instead. Closes scylladb/scylladb#28383	2026-02-02 09:56:32 +01:00
Andrei Chekun	cc5ac75d73	test.py: remove deprecated skip_mode decorator Finishing the deprecation of the skip_mode function in favor of pytest.mark.skip_mode. This PR only cleans up and migrates leftover tests that still use the old way of skip_mode. Closes scylladb/scylladb#28299	2026-01-25 18:17:27 +02:00
Tomasz Grabiec	d3ee82ea51	topology_coordinator, storage_service: Validate node removal/decommission at request submission time After parallel tablet draining, the validation at the time the request starts executing is too late, tablets will be already drained. This trips tests which expect validation failure, but get tablet draining failure instead. Also, in case of decommission, it's a waste to go through draining only to discover that the operation has to be rolled back due to validation. So avoid submitting a request altogether if it's invalid. The validation at request execution start remains, for extra safety. validate_removing_node() was extracted out of topology_coordinator, so that it can be called by storage_service on non-coordinator. Some tests need adjusting for the fact that after failed removenode the node may still not be marked as excluded, so we need to explicitly exclude it or add to the list of ignored nodes in the next removenode operation.	2026-01-18 15:36:04 +01:00
Tomasz Grabiec	5e6935f276	test: Use ManagerClient.{disable,enable}_tablet_balancing()	2026-01-13 00:38:00 +01:00
Botond Dénes	60570d7114	Merge 'topology coordinator: restrict node join/remove to preserve RF-rack validity' from Michael Litvak Allow creating materialized views and secondary indexes in a tablets keyspace only if it's RF-rack-valid, and enforce RF-rack-validity while the keyspace has views by restricting some operations: * Altering a keyspace's RF if it would make the keyspace RF-rack-invalid * Adding a node in a new rack * Removing / Decommissioning the last node in a rack Previously the config option `rf_rack_valid_keyspaces` was required for creating views. We now remove this restriction - it's not needed because we always maintain RF-rack-validity for keyspaces with views. The restrictions are relevant only for keyspaces with numerical RF. Keyspace with rack-list-based RF are always RF-rack-valid. Fixes scylladb/scylladb#23345 Fixes https://github.com/scylladb/scylladb/issues/26820 backport to relevant versions for materialized views with tablets since it depends on rf-rack validity Closes scylladb/scylladb#26354 * github.com:scylladb/scylladb: docs: update RF-rack restrictions cql3: don't apply RF-rack restrictions on vector indexes cql3: add warning when creating mv/index with tablets about rf-rack service/tablet_allocator: always allow tablet merge of tables with views locator: extend rf-rack validation for rack lists test: test rf-rack validity when creating keyspace during node ops locator: fix rf-rack validation during node join/remove test: test topology restrictions for views with tablets test: add test_topology_ops_with_rf_rack_valid topology coordinator: restrict node join/remove to preserve RF-rack validity topology coordinator: add validation to node remove locator: extend rf-rack validation functions view: change validate_view_keyspace to allow MVs if RF=Racks db: enforce rf-rack-validity for keyspaces with views replica/db: add enforce_rf_rack_validity_for_keyspace helper db: remove enforce parameter from check_rf_rack_validity test: adjust test 
to not break rf-rack validity	2026-01-09 10:01:23 +02:00
Michael Litvak	8f15c7a874	db/view/view_update_generator: move discover_staging_sstables to start Call discover_staging_sstables in view_update_generator::start() instead of in the constructor, because the constructor is called during initialization before sstables are loaded. The initialization order was changed in `5d1f74b86a` and caused this regression. It means the view update generator won't discover staging sstables on startup and view updates won't be generated for them. It also causes issues in sstable cleanup. view_update_generator::start() is called in a later stage of the initialization, after sstable loading, so do the discovery of staging sstables there. Fixes scylladb/scylladb#27956 Closes scylladb/scylladb#27970	2026-01-08 21:55:19 +02:00
Andrei Chekun	c950c2e582	test.py: convert skip_mode function to pytest.mark The skip_mode function works only on functions and only in cluster tests. This is OK when we need to skip one test, but it's not possible to use it with pytestmark to automatically mark all tests in the file. The goal of this PR is to migrate skip_mode to be a dynamic pytest.mark that can be used as an ordinary mark. Closes scylladb/scylladb#27853 [avi: apply to test/cluster/test_tablets.py::test_table_creation_wakes_up_balancer]	2026-01-08 21:55:16 +02:00
Michael Litvak	75b5285cdf	cql3: don't apply RF-rack restrictions on vector indexes When creating an index we validate that the keyspace is RF-rack-valid and print a warning that the keyspace must remain RF-rack-valid. This should apply only to indexes that are based on materialized views for which there are consistency concerns when the keyspace is not RF-rack-valid. vector indexes are not based on materialized views, hence these restrictions should not apply to them.	2025-12-22 09:21:07 +01:00
Michael Litvak	06343b58a2	cql3: add warning when creating mv/index with tablets about rf-rack Creating a MV or index in a tablets-based keyspace now forces additional restrictions on the keyspace. The keyspace must be RF-rack-valid and it must remain RF-rack-valid while the view exists. Add a CQL warning about these restrictions.	2025-12-22 09:21:06 +01:00
Michael Litvak	9940dcefa7	test: test topology restrictions for views with tablets Add tests that verify the restrictions on topology operations when there are keyspaces with tablets and materialized views. For such keyspaces, RF=Racks must be enforced while they have materialized views, therefore adding a node in a new rack or removing a node that would eliminate a rack should be rejected.	2025-12-22 09:14:30 +01:00
Michael Litvak	8df61f6d99	view: change validate_view_keyspace to allow MVs if RF=Racks The function validate_view_keyspace checks if a keyspace is eligible for having materialized views, and it is used for validation when creating a MV or a MV-based index. Previously, it was required that the rf_rack_valid_keyspaces option is set in order for tablets-based keyspaces to be considered eligible, and the RF-rack condition was enforced when the option is set. Instead of this, we change the validation to allow MVs in a keyspace if the RF-rack condition is satisfied for the keyspace - regardless of the config option. We remove the config validation for views on startup that validates the option `rf_rack_valid_keyspaces` is set if there are any views with tablets, since this is not required anymore. We can do this without worrying about upgrades because this change will be effective from 2025.4 where MVs with tablets are first out of experimental phase. We update the test for MV and index restrictions in tablets keyspaces according to the new requirements. * Create MV/index: previously the test checked that it's allowed only if the config option `rf_rack_valid_keyspaces` is set. This is changed now so it's always allowed to create MV/index if the keyspace is RF-rack-valid. Update the test to verify that we can create MV/index when the keyspace is RF-rack-valid, even if the rf_rack option is not set, and verify that it fails when the keyspace is RF-rack-invalid. * Alter: Add a new test to verify that while a keyspace has views, it can't be altered to become RF-rack-invalid.	2025-12-22 09:14:29 +01:00
Dawid Mędrek	58dc414912	test/cluster/mv: Rewrite test_view_building_scheduling_group We rewrite the test to avoid flakiness. Instead of looking at the metrics, we make a trade-off and start depending on a less reliable mechanism -- logs. We grep all relevant messages printed by Scylla in TRACE mode and make sure that they were all printed from a context using the streaming scheduling group. Although it's a "less proper" way of testing, it should be much more dependable and avoid flakiness. Fixes scylladb/scylladb#25957 Closes scylladb/scylladb#26656	2025-12-08 14:24:25 +02:00
Aleksandra Martyniuk	76174d1f7a	cql3: reject ALTER KEYSPACE if rf of datacenter with tablets is omitted In ALTER KEYSPACE, when a datacenter name is omitted, its replication factor is implicitly set to zero with vnodes, while with tablets, it remains unchanged. ALTER KEYSPACE should behave the same way for tablets as it does for vnodes. However, this can be dangerous as we may mistakenly drop the whole datacenter. Reject ALTER KEYSPACE if it changes replication factor, but omits a datacenter that currently contains tablet replicas. Fixes: https://github.com/scylladb/scylladb/issues/25549. Closes scylladb/scylladb#25731	2025-11-24 06:36:51 +02:00
Botond Dénes	2ca66133a4	Revert "db/config: don't use RBNO for scaling" This reverts commit `43738298be`. This commit causes instability in dtests. Several non-gating dtests started failing, as well as some gating ones, see #27047. Closes scylladb/scylladb#27067 Fixes #27047	2025-11-18 08:17:17 +02:00
Piotr Dulikowski	2ccc94c496	Merge 'topology_coordinator: include joining node in barrier' from Michael Litvak Previously, only nodes in the 'normal' state and decommissioning nodes were included in the set of nodes participating in barrier and barrier_and_drain commands. Joining nodes are not included because they don't coordinate requests, given their cql port is closed. However, joining nodes may receive mutations from other nodes, for which they may generate and coordinate materialized view updates. If their group0 state is not synchronized it could cause lost view updates. For example: 1. On the topology coordinator, the join completes and the joining node becomes normal, but the joining node's state lags behind. Since it's not synchronized by the barrier, it could be in an old state such as `write_both_read_old`. 2. A normal node coordinates a write and sends it to the new node as the new replica. 3. The new node applies the base mutation but doesn't generate a view update for it, because it calculates the base-view pairing according to its own state and replication map, and determines that it doesn't participate in the base-view pairing. Therefore, since the joining node participates as a coordinator for view updates, it should be included in these barriers as well. This ensures that before the join completes, the joining node's state is `write_both_read_new`, where it does generate view updates. Fixes https://github.com/scylladb/scylladb/issues/26976 backport to previous versions since it fixes a bug in MV with vnodes Closes scylladb/scylladb#27008 * github.com:scylladb/scylladb: test: add mv write during node join test topology_coordinator: include joining node in barrier	2025-11-14 12:41:16 +01:00
Botond Dénes	43738298be	db/config: don't use RBNO for scaling Remove bootstrap and decommission from allowed_repair_based_node_ops. Using RBNO over streaming for these operations has no benefits, as they are not exposed to the out-of-date replica problem that replace, removenode and rebuild are. On top of that, RBNO is known to have problems with empty user tables. Using streaming for bootstrap and decommission is safe and faster than RBNO in all conditions, especially when the table is small. One test needs adjustment as it relies on RBNO being used for all node ops. Fixes: #24664 Closes scylladb/scylladb#26330	2025-11-14 13:03:50 +03:00
Dawid Mędrek	393f1ca6e6	test/cluster/mv: Clean up test_backoff_when_node_fails_task_rpc After the changes in the test, we clean up its syntax. It boils down to very simple modifications.	2025-11-13 17:57:33 +01:00
Dawid Mędrek	acd9120181	db/view/view_building_coordinator: Rate limit logging failed RPC The view building coordinator sends tasks in form of RPC messages to other nodes in the cluster. If processing that RPC fails, the coordinator logs the error. However, since tasks are per replica (so per shard), it may happen that we end up with a large number of similar messages, e.g. if the target node has died, because every shard will fail to process its RPC message. It might become even worse in the case of a network partition. To mitigate that, we rate limit the logging to at most once per second. We extend the test `test_backoff_when_node_fails_task_rpc` so that it allows the view building coordinator to have multiple tablet replica targets. If not for rate limiting the warning messages, we should start getting more of them, potentially leading to a test failure.	2025-11-13 17:57:23 +01:00
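The rate limiting described in acd9120181 can be illustrated with a small, generic sketch. It is not the coordinator's implementation (which is C++ inside Scylla); the `RateLimitedLogger` class, its `warn` method, and the injectable clock are all hypothetical names introduced here. It only shows the idea: when many shards fail at once, one warning is emitted per interval and the rest are counted as suppressed.

```python
import time
from typing import Callable, Optional


class RateLimitedLogger:
    """Hypothetical helper: emit at most one message per interval."""

    def __init__(self, interval_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval_s
        self._clock = clock
        self._last_emit: Optional[float] = None
        self.suppressed = 0

    def warn(self, message: str,
             sink: Callable[[str], None] = print) -> bool:
        """Emit the message unless one was emitted within the interval."""
        now = self._clock()
        if self._last_emit is not None and now - self._last_emit < self._interval:
            self.suppressed += 1
            return False
        self._last_emit = now
        sink(message)
        return True


# Demo with a fake clock so the example is deterministic.
fake_time = [0.0]
logger = RateLimitedLogger(interval_s=1.0, clock=lambda: fake_time[0])
emitted: list[str] = []

# Eight shards fail at (almost) the same moment: only one warning gets through.
for shard in range(8):
    logger.warn(f"work on shard {shard} failed", sink=emitted.append)

# More than one second later, the next warning is emitted again.
fake_time[0] = 1.5
logger.warn("work still failing", sink=emitted.append)
```

Injecting the clock keeps the demo independent of wall-clock time; a production variant would normally rely on the default monotonic clock.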
Dawid Mędrek	4a5b1ab40a	db/view: Add backoff when RPC fails The view building coordinator manages the process of view building by sending RPC requests to all nodes in the cluster, instructing them what to do. If processing that message fails, the coordinator decides if it wants to retry it or (temporarily) abandon the work. An example of the latter scenario could be if one of the target nodes dies and any attempts to communicate with it would fail. Unfortunately, the current approach to it is not perfect and may result in a storm of warnings, effectively clogging the logs. As an example, take a look at scylladb/scylladb#26686: the gossiper failed to mark one of the dead nodes as DOWN fast enough, and it resulted in a warning storm. To prevent situations like that, we implement a form of backoff. If processing an RPC message fails, we postpone finishing the task for a second. That should reduce the number of messages in the logs and avoid retries that are likely to fail as well. We provide a reproducer test: it fails before this commit and succeeds with it. Fixes scylladb/scylladb#26686	2025-11-13 17:55:41 +01:00
Michael Litvak	b925e047be	test: add mv write during node join test Add a test that reproduces the issue scylladb/scylladb#26976. The test adds a new node with delayed group0 apply, and does writes with MV updates right after the join completes on the coordinator and while the joining node's state is behind. The test fails before fixing the issue and passes after.	2025-11-13 12:24:32 +01:00
Piotr Szymaniak	63897370cb	alternator: Fix tag name to request vnodes The tag was recently renamed from `experimental:initial_tablets` to `system::initial_tablets`. This commit fixes both the tests and the exceptions sent to the user instructing how to create a table with vnodes.	2025-11-09 12:52:29 +02:00
Wojciech Mitros	0a22ac3c9e	mv: don't mark the view as built if the reader produced no partitions When we build a materialized view we read the entire base table from start to end to generate all required view updates. If a view is created while another view is being built on the same base table, this is optimized - we start generating view updates for the new view from the base table rows that we're currently reading, and we read the missed initial range again after the previous view finishes building. The view building progress is only updated after generating view updates for some read partitions. However, there are scenarios where we'll generate no view updates for the entire read range. If this was not handled we could end up in an infinite view building loop like we did in https://github.com/scylladb/scylladb/issues/17293 To handle this, we mark the view as built if the reader generated no partitions. However, this is not always the correct conclusion. Another scenario where the reader won't encounter any partitions is when view building is interrupted, and then we perform a reshard. In this scenario, we set the reader for all shards to the last unbuilt token for an existing partition before the reshard. However, this partition may not exist on a shard after reshard, and if there are also no partitions with higher tokens, the reader will generate no partitions even though it hasn't finished view building. Additionally, we already have a check that prevents infinite view building loops without taking the partitions generated by the reader into account. At the end of stream, before looping back to the start, we advance current_key to the end of the built range and check for built views in that range. This handles the case where the entire range is empty - the conditions for a built view are: 1. the "next_token" is no greater than "first_token" (the view building process looped back, so we've built all tokens above "first_token") 2. 
the "current_token" is no less than "first_token" (after looping back, we've built all tokens below "first_token") If the range is empty, we'll pass these conditions on an empty range after advancing "current_key" to the end because: 1. after looping back, "next_token" will be set to `dht::minimum_token` 2. "current_key" will be set to `dht::ring_position::max()` In this patch we remove the check for partitions generated by the reader. This fixes the issue with resharding and it does not resurrect the issue with infinite view building that the check was introduced for. Fixes https://github.com/scylladb/scylladb/issues/26523 Closes scylladb/scylladb#26635	2025-11-05 17:02:32 +02:00
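The two conditions for a built view described in the commit above can be sketched as a small predicate. This is only an illustration, not ScyllaDB's code: tokens are modeled as plain integers, `MIN_TOKEN` and `MAX_POSITION` are hypothetical stand-ins for `dht::minimum_token` and `dht::ring_position::max()`, and the function name `view_is_built` is an assumption.

```python
# Hypothetical stand-ins: real tokens and ring positions are richer types.
MIN_TOKEN = -(2**63)        # stand-in for dht::minimum_token
MAX_POSITION = 2**63 - 1    # stand-in for dht::ring_position::max()


def view_is_built(first_token: int, next_token: int, current_token: int) -> bool:
    """Return True when both conditions from the commit message hold.

    1. next_token <= first_token: the build looped back, so all tokens
       above first_token have been covered.
    2. current_token >= first_token: after looping back, all tokens
       below first_token have been covered.
    """
    return next_token <= first_token and current_token >= first_token


# Empty range: after looping back, next_token is the minimum token and
# current_key has been advanced to the maximum position.
empty_range_built = view_is_built(100, MIN_TOKEN, MAX_POSITION)

# Build still in its first pass: next_token is above first_token.
first_pass_built = view_is_built(100, 150, 200)
```

Under these assumptions an empty range satisfies both conditions without consulting how many partitions the reader produced, which is the property the commit relies on.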
Tomasz Grabiec	5bf7112fe6	test: cluster: mv: Do not move tablets across racks It's illegal with rf-rack-valid keyspaces.	2025-10-29 23:32:57 +01:00
Piotr Dulikowski	a8d92f2abd	test: mv: add a test for tablet merge The test test_mv_tablets_replace verifies that merging tablets of both a view and its base table is allowed if rf-rack-valid-keyspaces option is enabled (and it is enabled by default in the test suite).	2025-10-16 14:07:37 +02:00
Dawid Mędrek	3aa07d7dfe	test/cluster/mv: Provide reason why test is skipped We point to the issue explaining why the test was disabled and what can be done about it. Closes scylladb/scylladb#26541	2025-10-15 09:22:39 +02:00
Piotr Dulikowski	e7907b173a	Merge 'db/view: Require rf_rack_valid_keyspaces when creating materialized view' from Dawid Mędrek Materialized views are currently in the experimental phase and using them in tablet-based keyspaces requires starting Scylla with an experimental feature, `views-with-tablets`. Any attempts to create a materialized view or secondary index when it's not enabled will fail with an appropriate error. After considerable effort, we're drawing close to bringing views out of the experimental phase, and the experimental feature will no longer be needed. However, materialized views in tablet-based keyspaces will still be restricted, and creating them will only be possible after enabling the configuration option `rf_rack_valid_keyspaces`. That's what we do in this PR. In this patch, we adjust existing tests in the tree to work with the new restriction. That shouldn't have been necessary because we've already seemingly adjusted all of them to work with the configuration option, but some tests hid well. We fix that mistake now. After that, we introduce the new restriction. What's more, when starting Scylla, we verify that there is no materialized view that would violate the contract. If there are some that do, we list them, notify the user, and refuse to start. High-level implementation strategy: 1. Name the restrictions in form of a function. 2. Adjust existing tests. 3. Restrict materialized views by both the experimental feature and the configuration option. Add validation test. 4. Drop the requirement for the experimental feature. Adjust the added test and add a new one. 5. Update the user documentation. Fixes scylladb/scylladb#23030 Backport: 2025.4, as we are aiming to support materialized views for tablets from that version. 
Closes scylladb/scylladb#25802 * github.com:scylladb/scylladb: view: Stop requiring experimental feature db/view: Verify valid configuration for tablet-based views db/view: Require rf_rack_valid_keyspaces when creating view test/cluster/random_failures: Skip creating secondary indexes test/cluster/mv: Mark test_mv_rf_change as skipped test/cluster: Adjust MV tests to RF-rack-validity test/boost/schema_loader_test.cc: Explicitly enable rf_rack_valid_keyspaces db/view: Name requirement for views with tablets	2025-10-06 12:46:46 +02:00
Dawid Mędrek	b409e85c20	view: Stop requiring experimental feature We modify the requirements for using materialized views in tablet-based keyspaces. Before, it was necessary to enable the configuration option `rf_rack_valid_keyspaces`, to have the cluster feature `VIEWS_WITH_TABLETS` enabled, and to use the experimental feature `views-with-tablets`. We drop the last requirement. We adjust code to that change and provide a new validation test. We also update the user documentation to reflect the changes. Fixes scylladb/scylladb#23030	2025-10-01 09:01:53 +02:00
Dawid Mędrek	00222070cd	db/view: Require rf_rack_valid_keyspaces when creating view We extend the requirements for being able to create materialized views and secondary indexes in tablet-based keyspaces. It's now necessary to enable the configuration option `rf_rack_valid_keyspaces`. This is a stepping stone towards bringing materialized views and secondary indexes with tablets out of the experimental phase. We add a validation test to verify the changes. Refs scylladb/scylladb#23030	2025-10-01 09:01:50 +02:00
Dawid Mędrek	6322b5996d	test/cluster/mv: Mark test_mv_rf_change as skipped The test will not work with `rf_rack_valid_keyspaces`. Since the option is going to become a requirement for using views with tablets, the test will need to be rewritten to take that into consideration. Since that adjustment doesn't seem trivial, we mark the test as skipped for the time being.	2025-10-01 09:01:29 +02:00
Dawid Mędrek	994f09530f	test/cluster: Adjust MV tests to RF-rack-validity Some of the new tests covering materialized views explicitly disabled the configuration option `rf_rack_valid_keyspaces`. It's going to become a new requirement for views with tablets, so we adjust those tests and enable the option. There is one exception, the test: `cluster/mv/test_mv_topology_change.py::test_mv_rf_change` We handle it separately in the following commit.	2025-09-30 20:01:25 +02:00
Michael Litvak	d94c1f6674	test: mv: test view update during topology operations add new test cases checking view consistency when writing to a table with MV and generating view updates while data is migrated. one case has tablet migrations while writing to the table. The other case does the equivalent for vnode keyspaces - it adds a new node. The tests reproduce issue scylladb/scylladb#24292	2025-09-29 13:44:04 +02:00
Wojciech Mitros	d9b8278178	mv: handle mismatched base/view replica count caused by RF change During an ALTER KEYSPACE statement execution where a table with a view is present, we need to perform tablet migrations for both tables. These migrations are not synchronized, so at some point the base may have a different number of non-pending replicas than the view. Because of that, we can't pair them correctly. If there are more non-pending base replicas than view replicas, we don't need to do anything because the view replica that didn't finish migrating is a pending replica and will get view updates from all base replicas. But if there are more non-pending view replicas than base replicas, we may currently lose view updates to the new view replica. This patch adds a workaround for this scenario. If after one migration we have more non-pending view replicas than base replicas, we add the extra view replica to the pending replica list so that it gets an update anyway. This patch will also take effect if the base and view replica counts differ due to some other bug. To track that, a new metric is added to count such occurrences. This patch also includes a test for this exact scenario, which is enforced by an injection. Fixes https://github.com/scylladb/scylladb/issues/21492	2025-09-22 12:50:16 +02:00
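The workaround in the commit above can be illustrated with a minimal sketch. It assumes index-based pairing and treats the trailing view replicas as the unpaired ones; the real pairing logic is not described in the message, so `split_view_targets` and its arguments are hypothetical names used only for illustration.

```python
from typing import List, Tuple


def split_view_targets(
    base_replicas: List[str],
    view_replicas: List[str],
    pending_view_replicas: List[str],
) -> Tuple[List[Tuple[str, str]], List[str], int]:
    """Pair base and view replicas; route unpaired view replicas as pending.

    Assumption: pairing is by position. Returns the pairs, the effective
    pending list, and the number of mismatches (what a metric would count).
    """
    # zip() stops at the shorter list, so surplus base replicas are ignored,
    # matching the "nothing to do" case from the commit message.
    paired = list(zip(base_replicas, view_replicas))
    # View replicas beyond the base replica count have no partner.
    extra = view_replicas[len(base_replicas):]
    # Treat them as pending so they still receive view updates.
    pending = list(pending_view_replicas) + extra
    return paired, pending, len(extra)


paired, pending, mismatches = split_view_targets(
    ["b1", "b2"], ["v1", "v2", "v3"], []
)
```

In this example `v3` has no base partner, so it ends up in the pending list and the mismatch count is 1.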
Michael Litvak	3dffb8e0dc	test: mv: add a test for view build interrupt during registration Add a new test that reproduces issue #22989. The test starts view building and interrupts it by restarting the node while some shards registered their status and some didn't.	2025-09-21 10:39:30 +02:00
Wojciech Mitros	f17beba834	load_balancer: include dead nodes when calculating rack load Load balancer aims to preserve a balance in rack loads when generating tablet migrations. However, this balance might get broken when dead nodes are present. Currently, these nodes aren't included in rack load calculations, even if they own tablet replicas. As a result, load balancer treats racks with dead nodes as racks with a lower load, so it generates migrations to these racks. This is incorrect, because a dead node might come back alive, which would result in having multiple tablet replicas on the same rack. It's also inefficient even if we know that the node won't come back - when it's being replaced or removed. In that case we know we are going to rebuild the lost tablet replicas so migrating tablets to this rack just doubles the work. Allowing such migrations to happen would also require adjustments in the materialized view pairing code because we'd temporarily allow having multiple tablet replicas on the same rack. So in this patch we include dead nodes when calculating rack loads in the load balancer. The dead nodes still aren't treated as potential migration sources or destinations. We also add a test which verifies that no migrations are performed by doing a node replace with a mv workload in parallel. Before the patch, we'd get pairing errors and after the patch, no pairing errors are detected. Fixes https://github.com/scylladb/scylladb/issues/24485 Closes scylladb/scylladb#26028	2025-09-17 20:49:18 +02:00
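A rough sketch of the rack load change described in the commit above, under simplified assumptions: load is just a count of tablet replicas per node, and the `Node` class, `rack_loads`, and `migration_candidates` are hypothetical names, not the load balancer's actual API.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    name: str
    rack: str
    alive: bool
    tablet_replicas: int


def rack_loads(nodes: List[Node], include_dead: bool = True) -> Dict[str, int]:
    """Sum tablet replicas per rack, optionally counting dead nodes."""
    loads: Dict[str, int] = defaultdict(int)
    for node in nodes:
        if node.alive or include_dead:
            loads[node.rack] += node.tablet_replicas
    return dict(loads)


def migration_candidates(nodes: List[Node]) -> List[str]:
    """Dead nodes are never migration sources or destinations."""
    return [node.name for node in nodes if node.alive]


nodes = [
    Node("a", "r1", True, 4),
    Node("b", "r1", False, 4),  # dead, but still owns tablet replicas
    Node("c", "r2", True, 8),
]
loads_after_patch = rack_loads(nodes)                       # dead node counted
loads_before_patch = rack_loads(nodes, include_dead=False)  # old behavior
```

With the dead node ignored, rack `r1` looks half as loaded as `r2` and would attract migrations; with it counted, the racks appear balanced and no migration is motivated.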
Michał Jadwiszczak	cf138da853	test: adjust existing tests - Disable tablets in `test_migration_on_existing_raft_topology`. Because views on tablets are experimental now, we can safely assume that the view building coordinator will start with view build status on raft. - Add error injection to pause view building on worker. It is used to pause the view building process; there is an analogous error injection in view_builder. - Do a read barrier in `test_view_in_system_tables`. Increases test stability by making sure that the node sees up-to-date group0 state and `system.built_views` is synced. - Wait for the view to be built in some tests. Increases test stability by making sure that the view is built. - Remove xfail marker from `test_tablet_streaming_with_unbuilt_view`. This series fixes https://github.com/scylladb/scylladb/issues/21564 and this test should work now.	2025-08-27 10:23:04 +02:00


69 Commits