scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-31 03:56:42 +00:00

Files

Nadav Har'El 2431f92967 alternator, test: add reproducer for issue about immediate LWT timeout

This patch adds a reproducer for issue #16261. The issue reports that
when Alternator read-modify-write (RMW) operations, which use LWT, are
sent to different nodes but target the same partition, an operation
sometimes fails with an InternalServerError claiming to be a "timeout".
The failure arrives within a few milliseconds, far too soon to be the
result of any real timeout.

The test uses 3 nodes, and 3 threads which send RMW operations to different
items in the same partition, and usually (though not with 100% certainty)
it reaches the InternalServerError in around 100 writes by each thread.
This InternalServerError looks like:

    Internal server error: exceptions::mutation_write_timeout_exception
    (Operation timed out for alternator_alternator_Test_1719157066704.alternator_Test_1719157066704 - received only 1 responses from 2 CL=LOCAL_SERIAL.)
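When triaging such failures it can help to pull the numbers out of the error text. The following is only a sketch: the regular expression is written against the one sample message quoted above, and the function and field names (parse_write_timeout, received, out_of, cl) are hypothetical, not part of Scylla or of the test.

```python
import re

# Pattern derived from the single sample message quoted above; other
# Scylla versions or error paths may word the message differently.
_RESPONSES_RE = re.compile(r"received only (\d+) responses from (\d+) CL=(\w+)")


def parse_write_timeout(message):
    """Return the response counts and consistency level found in `message`,
    or None if the text does not match the sample format."""
    match = _RESPONSES_RE.search(message)
    if match is None:
        return None
    return {
        "received": int(match.group(1)),
        "out_of": int(match.group(2)),
        "cl": match.group(3),
    }


SAMPLE = (
    "Internal server error: exceptions::mutation_write_timeout_exception "
    "(Operation timed out for alternator_alternator_Test_1719157066704."
    "alternator_Test_1719157066704 - received only 1 responses from 2 "
    "CL=LOCAL_SERIAL.)"
)
```

Calling parse_write_timeout(SAMPLE) yields the counts 1 and 2 and the consistency level LOCAL_SERIAL; a message in any other format yields None.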

The test also prints how long the request took to fail, for example:

    In incrementing 1,0 on node 1: error after 0.017074108123779297

That is about 0.017 seconds, far shorter than the cas_contention_timeout_in_ms
timeout (1 second) or any other timeout.
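The overall shape of such a reproducer (several threads, each writing through a different node and timing any failure) can be sketched roughly as follows. This is not the code of the actual test: the operation is a stand-in callable, and FakeInternalServerError, fake_operation, timed_call, worker and run_workers are hypothetical names used only for illustration. The 1-second threshold is the cas_contention_timeout_in_ms value mentioned above.

```python
import threading
import time

# The 1-second CAS contention timeout mentioned in the text above.
CAS_CONTENTION_TIMEOUT_S = 1.0


class FakeInternalServerError(Exception):
    """Hypothetical stand-in for the InternalServerError seen by the client."""


def timed_call(operation, *args):
    """Run operation(*args) and return (elapsed_seconds, error_or_None)."""
    start = time.time()
    try:
        operation(*args)
        return time.time() - start, None
    except FakeInternalServerError as err:
        return time.time() - start, err


def looks_like_real_timeout(elapsed, timeout=CAS_CONTENTION_TIMEOUT_S):
    """A failure reported well before the timeout cannot be a real timeout."""
    return elapsed >= timeout


def worker(node, operation, writes, failures):
    """Send up to `writes` operations via `node`; record the first failure."""
    for i in range(writes):
        elapsed, err = timed_call(operation, node, i)
        if err is not None:
            print(f"In incrementing {node},{i} on node {node}: error after {elapsed}")
            failures.append((node, i, elapsed))
            return


def run_workers(operation, nodes=3, writes=100):
    """Start one thread per node and collect (node, write_index, elapsed)."""
    failures = []
    threads = [
        threading.Thread(target=worker, args=(node, operation, writes, failures))
        for node in range(nodes)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return failures


def fake_operation(node, i):
    """Illustrative operation that fails at once on node 1, write 5."""
    if node == 1 and i == 5:
        raise FakeInternalServerError("claims to be a timeout")
```

With fake_operation, run_workers records a single failure whose elapsed time is far below one second, so looks_like_real_timeout rejects it, mirroring the 0.017-second observation above.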

If we enable trace logging by adding the following line to
topology_experimental_raft/suite.yaml:

    extra_scylla_cmdline_options: ["--logger-log-level", "paxos=trace"]

we get the following TRACE-level message in the log:

    paxos - CAS[0] accept_proposal: proposal is partially rejected

This again shows the problem is "uncertainty" (partial rejection) and not
a timeout.

Refs #16261

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19445

2025-08-01 11:58:52 +03:00

auth_cluster

Merge 'qos: don't populate effective service level cache until auth is migrated to raft' from Piotr Dulikowski

2025-07-31 13:05:27 +03:00

dtest

test: audit: add cassandra user test case

2025-07-21 14:54:20 +02:00

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

object_store

sstables: Start using make_data_or_index_source in sstable

2025-07-15 10:10:23 +03:00

random_failures

random_failures: enable execute_lwt_transaction

2025-07-24 19:48:09 +02:00

tasks

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

__init__.py

…

conftest.py

test.py: add --run-internet-dependent-tests

2025-06-02 15:49:29 +02:00

suite.yaml

test.py: dtest: make auth_test.py run using test.py

2025-06-30 10:16:36 +00:00

test_aggregation.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_alternator.py

alternator, test: add reproducer for issue about immediate LWT timeout

2025-08-01 11:58:52 +03:00

test_automatic_cleanup.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_bad_initial_token.py

test: cluster: add test_bad_initial_token

2025-04-25 12:25:15 +02:00

test_batchlog_manager.py

test: test_batchlog_manager: batchlog replay includes cdc

2025-07-07 12:24:05 +03:00

test_blocked_bootstrap.py

…

test_boot_after_ip_change.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_boot_nodes.py

test: Add test_boot_nodes.py

2025-07-10 10:56:53 +08:00

test_cdc_generation_clearing.py

test_cdc_generation_clearing: wait for generations to propagate

2025-06-09 12:59:04 +02:00

test_cdc_generation_data.py

raft_group0: split shutdown into abort_and_drain and destroy

2025-07-25 17:16:14 +02:00

test_cdc_generation_publishing.py

test_cdc_generation_publishing: fix to read monotonically

2025-05-30 08:35:56 +02:00

test_cdc_with_alter.py

test: cdc: add test_cdc_with_alter

2025-07-17 17:16:17 +02:00

test_change_ip.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_change_replication_factor_1_to_0.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_change_rpc_address.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_cluster_features.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_commitlog_segment_data_resurrection.py

…

test_commitlog.py

…

test_compacting_reader_tombstone_gc.py

…

test_concurrent_schema.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_config_live_updates.py

test: add test for live updates of generic server config

2025-06-23 17:56:26 +02:00

test_config.py

…

test_conflicting_keys_read_repair.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_coordinator_queue_management.py

test.py: rework log_browsing for dtest migration

2025-05-19 11:50:55 +00:00

test_crash_coordinator_before_streaming.py

…

test_data_resurrection_after_cleanup.py

…

test_data_resurrection_in_memtable.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_decommission.py

…

test_deprecating_cluster_features.py

…

test_describe.py

cql3: Represent create_statement using managed_string

2025-07-01 12:58:02 +02:00

test_different_group0_ids.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_encryption.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_error_becoming_voter.py

…

test_fencing.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_global_ignore_nodes.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_gossip_boot.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_gossiper_orphan_remover.py

gossiper: move force_remove_endpoint to work on host id

2025-04-06 18:39:24 +03:00

test_gossiper.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_group0_schema_versioning.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_hints.py

test.py: rework log_browsing for dtest migration

2025-05-19 11:50:55 +00:00

test_initial_token.py

…

test_ip_mappings.py

…

test_long_join.py

…

test_long_query_timeout_erm.py

test: Set request_timeout_on_shutdown_in_seconds to request_timeout_in_ms,

2025-07-29 15:37:47 +02:00

test_lwt_semaphore.py

…

test_maintenance_mode.py

…

test_major_compaction.py

test.py: rework log_browsing for dtest migration

2025-05-19 11:50:55 +00:00

test_metadata_id.py

test: add tests for prepared statement metadata consistency corner cases

2025-05-14 09:59:19 +02:00

test_multidc.py

test/cluster/test_multidc.py: Adjust to RF-rack-validity

2025-05-10 16:30:23 +02:00

test_mutation_schema_change.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_mv.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_no_dc_rack_change.py

test: cluster: introduce test_no_dc_rack_change

2025-04-17 16:22:58 +02:00

test_no_removed_node_event_on_ip_change.py

…

test_node_isolation.py

…

test_node_ops_metrics.py

…

test_node_shutdown_waits_for_pending_requests.py

…

test_nodetool.py

…

test_not_enough_token_owners.py

test/cluster/test_not_enough_token_owners.py: Adjust to RF-rack-validity

2025-05-10 16:30:26 +02:00

test_query_rebounce.py

…

test_raft_cluster_features.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_raft_fix_broken_snapshot.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_raft_ignore_nodes.py

…

test_raft_no_quorum.py

raft: make group0 Raft operation timeout configurable

2025-04-15 10:57:39 +03:00

test_raft_recovery_basic.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_raft_recovery_during_join.py

test: add tests for the Raft-based recovery procedure

2025-03-14 13:53:05 +01:00

test_raft_recovery_entry_loss.py

test: add tests for the Raft-based recovery procedure

2025-03-14 13:53:05 +01:00

test_raft_recovery_majority_loss.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_raft_recovery_stuck.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_raft_recovery_user_data.py

test: test_raft_recovery_user_data: disable hinted handoff

2025-06-03 17:48:42 +02:00

test_raft_snapshot_request.py

…

test_raft_snapshot_truncation.py

…

test_raft_voters.py

test.py: rework log_browsing for dtest migration

2025-05-19 11:50:55 +00:00

test_random_tables.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_read_repair.py

test/cluster/test_read_repair: write 100 rows in trace test

2025-06-27 16:23:08 +03:00

test_refresh.py

Add nodetool refresh --scope option

2025-05-29 16:12:09 +03:00

test_remove_alive_node.py

…

test_remove_rpc_client_with_pending_requests.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_repair.py

api: repair_async: forbid repairing tablet keyspaces

2025-07-24 11:11:09 +02:00

test_replace_alive_node.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_replace_ignore_nodes.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_replace_with_encryption.py

…

test_replace_with_same_ip_twice.py

…

test_replace.py

test.py: rework log_browsing for dtest migration

2025-05-19 11:50:55 +00:00

test_restart_cluster.py

…

test_resurrection.py

test: port of test and reproducer for resurrection during file based streaming

2025-03-30 13:39:40 +03:00

test_reversed_queries_during_simulated_upgrade_process.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_rpc_compression.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_select_from_mutation_fragments.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_shutdown_hang.py

…

test_snapshot.py

test: add type creation to test_snapshot

2025-07-10 10:46:55 +02:00

test_sstable_compression_dictionaries_autotrain.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_sstable_compression_dictionaries_basic.py

test_sstable_compression_dictionaries_basic.py: fix a flaky check

2025-06-25 11:30:28 +03:00

test_sstable_compression_dictionaries_upgrade.py

test: add test_sstable_compression_dictionaries_upgrade.py

2025-06-02 15:49:29 +02:00

test_sstable_set.py

test: Verify partitioned set store split and unsplit correctly

2025-04-29 15:47:33 -03:00

test_start_bootstrapped_with_invalid_seed.py

…

test_streaming_deadlock.py

streaming: Avoid deadlock by running view checks in a separate scheduling group

2025-07-11 16:30:46 +02:00

test_table_desc_read_barrier.py

…

test_table_drop.py

test: test table drop during flush

2025-04-23 14:29:28 +02:00

test_tablet_repair_scheduler.py

repair: Avoid too many fragments in a single repair_row_on_wire

2025-07-29 13:43:53 +08:00

test_tablet_stats.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tablets2.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tablets_colocation.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tablets_cql.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_tablets_intranode.py

…

test_tablets_lwt.py

test_tablets_lwt: add test_paxos_state_table_permissions

2025-07-24 19:48:09 +02:00

test_tablets_merge.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tablets_migration.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tablets_removenode.py

test/cluster: Disable rf_rack_valid_keyspaces in problematic tests

2025-05-10 16:30:49 +02:00

test_tablets.py

topology_coordinator: Make tablet_load_stats_refresh_interval configurable

2025-07-31 14:31:55 +03:00

test_tls.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_tombstone_gc.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_topology_failure_recovery.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_topology_ops_encrypted.py

test/cluster: Disable rf_rack_valid_keyspaces in problematic tests

2025-05-10 16:30:49 +02:00

test_topology_ops.py

test/cluster: Disable rf_rack_valid_keyspaces in problematic tests

2025-05-10 16:30:49 +02:00

test_topology_recovery_basic.py

test: mark tests with the gossip-based recovery procedure

2025-03-14 13:53:05 +01:00

test_topology_recovery_majority_loss.py

…

test_topology_rejoin.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_topology_remove_decom.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_topology_remove_garbage_group0.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_topology_schema.py

test/pylib: servers_add: add auto_rack_dc parameter

2025-03-30 19:23:40 +03:00

test_topology_smp.py

test/cluster: Adjust simple tests to RF-rack-validity

2025-05-10 16:30:18 +02:00

test_topology_upgrade_not_stuck_after_recent_removal.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_topology_upgrade_stuck.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_topology_upgrade.py

db/config: add tablets_mode_for_new_keyspaces option

2025-03-24 14:54:45 +02:00

test_truncate_with_tablets.py

topology coordinator: allow running multiple global commands in parallel

2025-06-11 11:29:33 +03:00

test_unfinished_writes_during_shutdown.py

storage_service: Cancel all write requests on storage_proxy shutdown

2025-07-22 15:03:30 +02:00

test_view_build_status.py

group0: modify start_operation logic to account for synchronize phase race condition

2025-06-24 10:04:39 +02:00

test_write_query_during_cql_server_shutdown.py

generic_server: Two-step connection shutdown.

2025-07-28 10:08:06 +02:00

test_writes_to_previous_cdc_generations.py

…

test_zero_token_nodes_multidc.py

test: test_zero_token_nodes_multidc: properly handle reads with CL=ONE

2025-07-15 07:14:09 +03:00

test_zero_token_nodes_no_replication.py

test/cluster/test_zero_token_nodes_no_replication.py: Adjust to RF-rack-validity

2025-05-10 16:30:31 +02:00

test_zero_token_nodes_topology_ops.py

test/cluster/test_zero_token_nodes_topology_ops: Adjust to RF-rack-validity

2025-05-10 16:30:34 +02:00

util.py

Fix regexp in check_node_log_for_failed_mutations

2025-06-25 12:00:16 +03:00