scylladb


Nadav Har'El 8ba595e472 Merge 'alternator: fix batch writes during intranode tablet migrations' from Petr Gusev

Scylla implements `LWT` in the `storage_proxy::cas` method. This method expects to be called on a specific shard, represented by the `cas_shard` parameter. Clients must create this object before calling `storage_proxy::cas`, check its `this_shard()` method, and jump to `cas_shard.shard()` if it returns false.

The nuance is that by the time the request reaches the destination shard, the tablet may have already advanced in its migration state machine. For example, a client may acquire a `cas_shard` at the `streaming` tablet state, then submit a request to another shard via `smp::submit_to(cas_shard.shard())`. However, the new `cas_shard` created on that other shard might already be in the `write_both_read_new` state, and its `cas_shard.shard()` would not be equal to `this_shard_id()`. Such a broken invariant results in an `on_internal_error` in `storage_proxy::cas`.

Clients of `storage_proxy::cas` are expected to check `cas_shard.this_shard()` and recursively jump to another shard if it returns false. Most calls to `storage_proxy::cas` already implement this logic. The only exception is `executor::do_batch_write`, which currently checks `cas_shard.this_shard()` only once. This can break the invariant if the tablet state changes more than once during the operation.

This PR fixes the issue by implementing recursive `cas_shard.this_shard()` checks in `executor::do_batch_write`. It also adds a test that reproduces the problem.
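The difference between a single check and a recursive check can be illustrated with a toy model. This is only a sketch, not the code from `alternator/executor.cc`: the `Tablet` and `CasShard` classes, the function names, and the shard numbers are all hypothetical, and a plain recursive call stands in for `smp::submit_to(cas_shard.shard())`.

```python
from dataclasses import dataclass


@dataclass
class Tablet:
    """Toy tablet (hypothetical): each cas_shard acquisition may observe a later
    migration stage, and therefore a different owning shard."""
    owners: list  # owning shard seen by the 1st, 2nd, ... acquisition
    reads: int = 0

    def current_owner(self) -> int:
        idx = min(self.reads, len(self.owners) - 1)
        self.reads += 1
        return self.owners[idx]


@dataclass
class CasShard:
    """Toy stand-in for cas_shard: remembers the owner seen at creation time."""
    owner: int
    here: int

    def this_shard(self) -> bool:
        return self.owner == self.here

    def shard(self) -> int:
        return self.owner


def cas_write_single_check(tablet: Tablet, shard: int) -> int:
    """Checks this_shard() once, jumps once, then assumes the shard is right."""
    cas_shard = CasShard(tablet.current_owner(), shard)
    if not cas_shard.this_shard():
        shard = cas_shard.shard()
        cas_shard = CasShard(tablet.current_owner(), shard)
    if not cas_shard.this_shard():
        # Stands in for the on_internal_error raised in storage_proxy::cas.
        raise RuntimeError("cas_shard.shard() != this_shard_id()")
    return shard


def cas_write_recursive(tablet: Tablet, shard: int, hops=None) -> list:
    """Re-checks this_shard() after every jump until it holds."""
    hops = [] if hops is None else hops
    hops.append(shard)
    cas_shard = CasShard(tablet.current_owner(), shard)
    if not cas_shard.this_shard():
        # Stands in for smp::submit_to(cas_shard.shard(), ...).
        return cas_write_recursive(tablet, cas_shard.shard(), hops)
    return hops


if __name__ == "__main__":
    # The request starts on shard 2; the tablet is first seen on shard 0 and,
    # by the time the request arrives there, has moved on to shard 1.
    print(cas_write_recursive(Tablet(owners=[0, 1]), 2))
    try:
        cas_write_single_check(Tablet(owners=[0, 1]), 2)
    except RuntimeError as e:
        print("single check failed:", e)
```

In this model the single-check variant fails as soon as the owner seen on arrival differs from the one seen before the jump, while the recursive variant keeps jumping until the freshly created `CasShard` agrees with the shard it is running on.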

Fixes: scylladb/scylladb#27353

backport: needs to be backported to 2025.4

Closes scylladb/scylladb#27396

* github.com:scylladb/scylladb:
  alternator/executor.cc: eliminate redundant dk copy
  alternator/executor.cc: release cas_shard on the original shard
  alternator/executor.cc: move shard check into cas_write
  alternator/executor.cc: make cas_write a private method
  alternator/executor.cc: make do_batch_write a private method
  alternator/executor.cc: fix indent
  test_alternator: add test_alternator_invalid_shard_for_lwt

2025-12-09 11:25:15 +02:00

auth_cluster

test: use ManagerClient in wait_until_driver_service_level_created

2025-11-17 14:55:14 +01:00

dtest

Revert "Merge 'db/config: enable ms sstable format by default' from Michał Chojnowski"

2025-12-02 14:38:56 +02:00

lwt

fix(test): minor typo fix, removing redundant param from logging

2025-11-10 08:42:11 +03:00

test/cluster/mv: Rewrite test_view_building_scheduling_group

2025-12-08 14:24:25 +02:00

object_store

test: Reuse S3 fixtures facilities in cqlpy/test_tools.py

2025-12-03 16:32:54 +02:00

random_failures

test/cluster/random_failures: Re-enable index events

2025-10-28 14:17:14 +01:00

tasks

test: Fix drain api in task_manager_client.py

2025-08-11 10:10:07 +08:00

__init__.py

…

conftest.py

test.py: switch to ThreadPoolExecutor

2025-12-07 17:37:25 +02:00

suite.yaml

test: dtest: limits_test.py: make the tests work

2025-09-29 12:39:53 +02:00

test_aggregation.py

…

test_alternator.py

alternator/executor.cc: move shard check into cas_write

2025-12-09 10:21:01 +01:00

test_automatic_cleanup.py

test_automatic_cleanup: fix comment

2025-10-28 17:55:20 +01:00

test_bad_initial_token.py

…

test_batchlog_manager.py

test: extend test_batchlog_replay_failure_during_repair

2025-11-14 14:18:07 +01:00

test_blocked_bootstrap.py

…

test_boot_after_ip_change.py

…

test_boot_nodes.py

test: Add test_boot_nodes.py

2025-07-10 10:56:53 +08:00

test_bti_index.py

test/cluster/test_bti_index.py: avoid a race with CQL tracing

2025-10-09 13:22:06 +03:00

test_cdc_generation_clearing.py

test_cdc_generation_clearing: wait for generations to propagate

2025-06-09 12:59:04 +02:00

test_cdc_generation_data.py

raft_group0: split shutdown into abort_and_drain and destroy

2025-07-25 17:16:14 +02:00

test_cdc_generation_publishing.py

test_cdc_generation_publishing: fix to read monotonically

2025-05-30 08:35:56 +02:00

test_cdc_with_alter.py

test: test concurrent writes with column drop with cdc preimage

2025-11-13 17:00:08 +01:00

test_cdc_with_tablets.py

test: cdc: extend cdc with tablets tests

2025-10-28 15:06:21 +01:00

test_change_ip.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_change_replication_factor_1_to_0.py

test: cluster: deflake consistency checks after decommission

2025-09-09 19:01:12 +02:00

test_change_rpc_address.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_cluster_features.py

…

test_commitlog_segment_data_resurrection.py

…

test_commitlog.py

…

test_concurrent_schema.py

…

test_config_live_updates.py

test: add test for live updates of generic server config

2025-06-23 17:56:26 +02:00

test_config.py

…

test_conflicting_keys_read_repair.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_coordinator_queue_management.py

test: fix test_coordinator_queue_management flakiness

2025-12-04 11:06:20 +02:00

test_counters_with_tablets.py

test: add counters with tablets test

2025-11-03 16:04:37 +01:00

test_crash_coordinator_before_streaming.py

…

test_data_resurrection_after_cleanup.py

test: cluster: deflake consistency checks after decommission

2025-09-09 19:01:12 +02:00

test_data_resurrection_in_memtable.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_decommission.py

test: cluster: deflake consistency checks after decommission

2025-09-09 19:01:12 +02:00

test_deprecating_cluster_features.py

…

test_describe.py

cql3: Represent create_statement using managed_string

2025-07-01 12:58:02 +02:00

test_different_group0_ids.py

Revert "db/config: don't use RBNO for scaling"

2025-11-18 08:17:17 +02:00

test_encryption.py

test::cluster::test_encryption: Port dtest EAR tests

2025-10-22 14:06:30 +00:00

test_error_becoming_voter.py

…

test_fencing.py

test: enable counters tests with tablets

2025-11-03 16:04:37 +01:00

test_global_ignore_nodes.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_gossip_boot.py

…

test_gossiper_empty_self_id_on_shadow_round.py

gossiper: fix empty initial local node state

2025-09-08 11:38:31 +02:00

test_gossiper_orphan_remover.py

…

test_gossiper_race.py

gossiper: check for a race condition in do_apply_state_locally

2025-09-08 11:38:30 +02:00

test_gossiper.py

…

test_group0_schema_versioning.py

test: Create cluster with multiple racks in multi-dc setups

2025-10-29 23:32:57 +01:00

test_hints.py

test.py: rewrite the wait_for_first_completed

2025-10-22 01:13:43 +03:00

test_incremental_repair.py

repair: Fix deadlock when topology coordinator steps down in the middle

2025-11-28 15:14:39 +01:00

test_initial_token.py

test/cluster/conftest: cluster_con: provide default values for port and use_ssl

2025-08-22 09:51:24 +03:00

test_ip_mappings.py

test: make test_broken_bootstrap faster

2025-12-09 09:25:42 +02:00

test_keyspace_rf.py

test: Generalize tests to work with both numeric RF and rack lists

2025-10-29 23:32:58 +01:00

test_long_join.py

test: improve async execution in test_long_join

2025-09-08 17:14:37 +02:00

test_long_query_timeout_erm.py

test.py: rewrite the wait_for_first_completed

2025-10-22 01:13:43 +03:00

test_lwt_semaphore.py

…

test_maintenance_mode.py

test/cluster/test_maintenance_mode.py: Wait for initialization

2025-11-13 11:07:45 +01:00

test_major_compaction.py

replica/table: do not stop major compaction when disabling auto compaction

2025-10-29 19:22:07 +05:30

test_metadata_id.py

test/cluster/conftest: cluster_con: provide default values for port and use_ssl

2025-08-22 09:51:24 +03:00

test_multidc.py

cql3: ks_prop_defs: Expand numeric RF to rack list

2025-10-29 23:32:59 +01:00

test_mutation_schema_change.py

…

test_mv.py

tombstone_gc: don't use 'repair' mode for colocated tables

2025-11-25 09:15:46 +01:00

test_no_dc_rack_change.py

…

test_no_removed_node_event_on_ip_change.py

test/cluster/conftest: cluster_con: provide default values for port and use_ssl

2025-08-22 09:51:24 +03:00

test_node_isolation.py

topology: let banned node know that it is banned

2025-11-24 17:12:13 +01:00

test_node_ops_metrics.py

test/pylib/rest_client: fix ScyllaMetrics filtering

2025-08-10 10:16:00 +02:00

test_node_shutdown_waits_for_pending_requests.py

…

test_nodetool.py

nodetool: status: Show excluded nodes as having status 'X'

2025-10-31 09:03:20 +01:00

test_not_enough_token_owners.py

tablets: scheduler: Balance racks separately when rf_rack_valid_keyspaces is true

2025-09-23 00:30:37 +02:00

test_query_rebounce.py

…

test_raft_cluster_features.py

test/cluster: Add test_simulate_upgrade_legacy_to_raft_listener_registration

2025-10-28 17:32:15 +01:00

test_raft_fix_broken_snapshot.py

…

test_raft_ignore_nodes.py

…

test_raft_no_quorum.py

test/pylib: allow expected_error in server_start to contain regular expression

2025-12-04 11:06:20 +02:00

test_raft_recovery_basic.py

…

test_raft_recovery_during_join.py

test: assert that majority is lost in some tests of the recovery procedure

2025-10-07 17:48:55 +02:00

test_raft_recovery_entry_loss.py

test: unskip test_raft_recovery_entry_loss

2025-10-24 21:23:41 +03:00

test_raft_recovery_majority_loss.py

…

test_raft_recovery_stuck.py

test: test_raft_recovery_stuck: ensure mutual visibility before using driver

2025-11-19 05:54:12 +01:00

test_raft_recovery_user_data.py

test: assert that majority is lost in some tests of the recovery procedure

2025-10-07 17:48:55 +02:00

test_raft_snapshot_request.py

…

test_raft_snapshot_truncation.py

…

test_raft_voters.py

test/cluster/conftest: cluster_con: provide default values for port and use_ssl

2025-08-22 09:51:24 +03:00

test_random_tables.py

…

test_read_repair.py

test/cluster/test_read_repair: write 100 rows in trace test

2025-06-27 16:23:08 +03:00

test_refresh.py

Add nodetool refresh --scope option

2025-05-29 16:12:09 +03:00

test_remove_alive_node.py

…

test_remove_rpc_client_with_pending_requests.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_repair.py

repair: Allow min max range to be updated for repair history

2025-12-05 10:41:25 +02:00

test_replace_alive_node.py

…

test_replace_ignore_nodes.py

…

test_replace_with_encryption.py

…

test_replace_with_same_ip_twice.py

…

test_replace.py

test.py: rework log_browsing for dtest migration

2025-05-19 11:50:55 +00:00

test_replica_exceptions.py

test: enable counters tests with tablets

2025-11-03 16:04:37 +01:00

test_rest_api_on_startup.py

test: add test_rest_api_on_startup

2025-12-03 15:35:59 +01:00

test_restart_cluster.py

…

test_resurrection.py

…

test_reversed_queries_during_simulated_upgrade_process.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_rpc_compression.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_select_from_mutation_fragments.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_shutdown_hang.py

…

test_snapshot.py

test: add type creation to test_snapshot

2025-07-10 10:46:55 +02:00

test_sstable_cleanup_stop.py

compaction: Fix stop of sstable cleanup

2025-09-11 08:55:10 +03:00

test_sstable_compression_config.py

test/cluster: Add test for default SSTable compressor

2025-10-30 15:53:54 +02:00

test_sstable_compression_dictionaries_autotrain.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_sstable_compression_dictionaries_basic.py

db/config: Deprecate sstable_compression_dictionaries_allow_in_ddl

2025-10-29 20:13:08 +02:00

test_sstable_compression_dictionaries_upgrade.py

pylib: extract upgrade helpers from test_sstable_compression_dictionaries_upgrade.py

2025-09-15 12:34:45 +02:00

test_sstable_set.py

…

test_start_bootstrapped_with_invalid_seed.py

…

test_streaming_deadlock.py

test: limit test_streaming_deadlock_removenode concurrency

2025-09-19 12:50:20 +03:00

test_table_desc_read_barrier.py

…

test_table_drop.py

…

test_tablet_repair_scheduler.py

repair: Add tablet repair progress report support

2025-12-08 13:35:19 +02:00

test_tablet_stats.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tablets2.py

test: fix flakyness caused by TRUNCATE retries

2025-12-08 14:13:26 +02:00

test_tablets_colocation.py

test: fix test flakiness in test_colocated_tables_gc_mode

2025-12-03 12:12:24 +01:00

test_tablets_cql.py

test: Switch to rack-list based RF

2025-10-29 23:32:58 +01:00

test_tablets_intranode.py

…

test_tablets_lwt.py

test_tablets_lwt: add test_tablets_merge_waits_for_lwt

2025-10-22 11:33:20 +02:00

test_tablets_merge.py

test_tablets_merge: test_tablet_split_merge_with_many_tables: reduce number of tables in debug mode

2025-09-29 15:30:13 +03:00

test_tablets_migration.py

topology_coordinator: Add barrier to cleanup_target

2025-12-03 16:19:17 +01:00

test_tablets_removenode.py

test/cluster: Disable rf_rack_valid_keyspaces in problematic tests

2025-05-10 16:30:49 +02:00

test_tablets.py

cql3: reject ALTER KEYSPACE if rf of datacenter with tablets is omitted

2025-11-24 06:36:51 +02:00

test_tls.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_tombstone_gc.py

Merge 'raft topology: fix group0 tombstone GC in the Raft-based recovery procedure' from Patryk Jędrzejczak

2025-10-22 16:40:11 +03:00

test_topology_failure_recovery.py

…

test_topology_ops_encrypted.py

test: cluster: deflake consistency checks after decommission

2025-09-09 19:01:12 +02:00

test_topology_ops.py

test: cluster: deflake consistency checks after decommission

2025-09-09 19:01:12 +02:00

test_topology_recovery_basic.py

test.py: apply the nightly label on test_topology_recovery_basic

2025-09-01 14:16:29 +02:00

test_topology_recovery_majority_loss.py

test/cluster/conftest: cluster_con: provide default values for port and use_ssl

2025-08-22 09:51:24 +03:00

test_topology_rejoin.py

…

test_topology_remove_decom.py

raft topology: skip non-idempotent steps in decommission path to avoid problems during races

2025-11-07 10:07:49 +01:00

test_topology_remove_garbage_group0.py

…

test_topology_schema.py

…

test_topology_smp.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_topology_upgrade_not_stuck_after_recent_removal.py

…

test_topology_upgrade_stuck.py

test.py: rewrite the wait_for_first_completed

2025-10-22 01:13:43 +03:00

test_topology_upgrade.py

…

test_truncate_concurrent_writes.py

truncate: add test for truncate with concurrent writes

2025-08-05 13:54:14 +02:00

test_truncate_with_drop.py

system_keyspace: Prune dropped tables from truncation on start/drop

2025-09-03 07:25:34 +03:00

test_truncate_with_tablets.py

topology coordinator: allow running multiple global commands in parallel

2025-06-11 11:29:33 +03:00

test_unfinished_writes_during_shutdown.py

storage_service: Cancel all write requests on storage_proxy shutdown

2025-07-22 15:03:30 +02:00

test_vector_store.py

index: allow vector indexes without rf_rack_valid_keyspces

2025-12-05 09:26:26 +02:00

test_view_build_status.py

test/cluster: add view build status tests

2025-08-27 10:23:04 +02:00

test_view_building_coordinator.py

db/view/view_building_coordinator: skip work if no view is built

2025-12-03 09:44:28 +02:00

test_write_query_during_cql_server_shutdown.py

generic_server: Two-step connection shutdown.

2025-07-28 10:08:06 +02:00

test_writes_to_previous_cdc_generations.py

…

test_zero_token_nodes_multidc.py

test: cluster: test_zero_token_nodes_multidc: Adjust to rack list RF

2025-10-29 23:32:58 +01:00

test_zero_token_nodes_no_replication.py

test/cluster/conftest: cluster_con: provide default values for port and use_ssl

2025-08-22 09:51:24 +03:00

test_zero_token_nodes_topology_ops.py

test/cluster/test_zero_token_nodes_topology_ops: Adjust to RF-rack-validity

2025-05-10 16:30:34 +02:00

util.py

test: wait for read_barrier in wait_until_driver_service_level_created

2025-11-17 15:21:28 +01:00