scylladb/locator at copilot/code-review-alternator-code - scylladb - Anomalous Gitea

Tomasz Grabiec df949dc506 Merge 'topology_coordinator: make cleanup reliable on barrier failures' from Łukasz Paszkowski

Fix a subtle but damaging failure mode in the tablet migration state machine: when a barrier fails, the follow-up barrier is triggered asynchronously, and cleanup can get skipped for that iteration. On the next loop, the original failure may no longer be visible (because the failing node got excluded), so the tablet can incorrectly move forward instead of entering `cleanup_target`.

To make cleanup reliable, this PR:

- Adds an additional “fallback cleanup” stage, `write_both_read_old_fallback_cleanup`, that does not modify read/write selectors. This stage is safe to enter immediately after a barrier failure, and it funnels the tablet into cleanup with the required barriers.
- Avoids changing both read and write selectors in a single step when transitioning from `write_both_read_new` to `cleanup_target`. The fallback path updates selectors in a safe order: read first, then write.
- Allows a direct no-barrier transition from `allow_write_both_read_old` to `cleanup_target` after failure, because in that specific case `cleanup_target` doesn’t change selectors and the hop is safe.
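
The "one selector per step" rule described above can be illustrated with a small, self-contained model. This is only a sketch and not the code from this PR: the `replica_set` enum, the `selectors` struct, the per-stage selector values in `selectors_for()`, and the transition table in `next_on_failure()` are all assumptions made for illustration, based on one reading of the stage names and the description above. Only the stage names themselves come from the commit message.

```cpp
#include <cassert>

// Hypothetical, simplified model; these are NOT the ScyllaDB definitions.
enum class replica_set { previous, both, next };

enum class stage {
    allow_write_both_read_old,
    write_both_read_old,
    streaming,
    write_both_read_new,
    write_both_read_old_fallback_cleanup,
    cleanup_target,
};

struct selectors {
    replica_set read;
    replica_set write;
};

// Assumed read/write selectors for each stage (illustrative only).
constexpr selectors selectors_for(stage s) {
    switch (s) {
    case stage::allow_write_both_read_old:
    case stage::cleanup_target:
        return {replica_set::previous, replica_set::previous};
    case stage::write_both_read_old:
    case stage::streaming:
    case stage::write_both_read_old_fallback_cleanup:
        return {replica_set::previous, replica_set::both};
    case stage::write_both_read_new:
        return {replica_set::next, replica_set::both};
    }
    return {replica_set::previous, replica_set::previous}; // not reached
}

// Assumed failure path: write_both_read_new detours through the fallback
// stage; every other stage in this model goes straight to cleanup_target.
constexpr stage next_on_failure(stage s) {
    switch (s) {
    case stage::write_both_read_new:
        return stage::write_both_read_old_fallback_cleanup;
    default:
        return stage::cleanup_target;
    }
}

// Number of selectors (0, 1 or 2) that differ between two stages.
constexpr int selectors_changed(stage from, stage to) {
    const selectors a = selectors_for(from);
    const selectors b = selectors_for(to);
    return (a.read != b.read) + (a.write != b.write);
}

// Walks the failure path from every stage and checks that no single hop
// changes both the read and the write selector.
inline void verify_failure_path_changes_one_selector_at_a_time() {
    const stage all[] = {
        stage::allow_write_both_read_old, stage::write_both_read_old,
        stage::streaming, stage::write_both_read_new,
        stage::write_both_read_old_fallback_cleanup,
    };
    for (stage s : all) {
        while (s != stage::cleanup_target) {
            const stage n = next_on_failure(s);
            assert(selectors_changed(s, n) <= 1);
            s = n;
        }
    }
}
```

In this model, a direct hop from `write_both_read_new` to `cleanup_target` would change both selectors at once, while the detour through `write_both_read_old_fallback_cleanup` changes the read selector first and the write selector second. The hop from `allow_write_both_read_old` to `cleanup_target` changes neither.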

No need for backport. It's an improvement. Currently, tablets transition to `cleanup_target` eventually via failed streaming.

Closes scylladb/scylladb#28169

* github.com:scylladb/scylladb:
  topology_coordinator: add write_both_read_old_fallback_cleanup state
  topology_coordinator: allow cleanup_target transition from streaming/rebuild_repair without barrier
  topology_coordinator: allow cleanup_target transition without barrier after failure in write_both_read_old
  topology_coordinator: allow cleanup_target transition without barrier after failure in allow_write_both_read_old

2026-01-28 13:33:39 +01:00

abstract_replication_strategy.cc

config: add enforce_rack_list option

2026-01-20 09:58:51 +01:00

abstract_replication_strategy.hh

config: add enforce_rack_list option

2026-01-20 09:58:51 +01:00

azure_snitch.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

azure_snitch.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

CMakeLists.txt

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

ec2_multi_region_snitch.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

ec2_multi_region_snitch.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

ec2_snitch.cc

ec2_snitch: Fix indentation after previous patch

2025-10-28 19:31:08 +03:00

ec2_snitch.hh

ec2_snitch: Coroutinize the aws_api_call_once()

2025-10-28 19:29:25 +03:00

everywhere_replication_strategy.cc

locator: Pass topology to replication strategy constructor

2025-10-01 16:06:28 +02:00

everywhere_replication_strategy.hh

locator: Pass topology to replication strategy constructor

2025-10-01 16:06:28 +02:00

gce_snitch.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gce_snitch.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

gossiping_property_file_snitch.cc

snitch: Reindent after previous changes

2025-09-25 18:59:48 +03:00

gossiping_property_file_snitch.hh

snitch: Make periodic_reader_callback() a coroutine

2025-09-25 18:59:48 +03:00

host_id.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

load_sketch.hh

load_sketch: Allow populating load_sketch with normalized current load

2026-01-07 11:49:01 +01:00

local_strategy.cc

locator: Pass topology to replication strategy constructor

2025-10-01 16:06:28 +02:00

local_strategy.hh

locator: Pass topology to replication strategy constructor

2025-10-01 16:06:28 +02:00

network_topology_strategy.cc

Merge 'strongly consistent tables: basic implementation' from Petr Gusev

2026-01-23 09:52:33 +01:00

network_topology_strategy.hh

locator: network_topology_strategy: Respect rack list when reallocating tablets

2025-10-02 19:42:39 +02:00

production_snitch_base.cc

locator/production_snitch_base: Reduce log level when property file incomplete

2025-05-13 13:59:39 +03:00

production_snitch_base.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

rack_inferring_snitch.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

rack_inferring_snitch.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

simple_snitch.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

simple_snitch.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

simple_strategy.cc

locator, cql3: Support rack lists in replication options

2025-10-02 19:42:39 +02:00

simple_strategy.hh

locator: Pass topology to replication strategy constructor

2025-10-01 16:06:28 +02:00

snitch_base.cc

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

snitch_base.hh

code: Replace distributed<> with sharded<>

2025-09-19 12:22:51 +02:00

tablet_metadata_guard.hh

token_metadata_guard: a topology guard for a token

2025-06-18 11:51:48 +02:00

tablet_replication_strategy.hh

tablet_replication_strategy: add consistency field

2026-01-21 14:56:00 +01:00

tablet_sharder.hh

mv: generate view updates on both shards in intranode migration

2025-09-29 13:44:04 +02:00

tablets.cc

Merge 'topology_coordinator: make cleanup reliable on barrier failures' from Łukasz Paszkowski

2026-01-28 13:33:39 +01:00

tablets.hh

Merge 'topology_coordinator: make cleanup reliable on barrier failures' from Łukasz Paszkowski

2026-01-28 13:33:39 +01:00

token_metadata_fwd.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

token_metadata.cc

storage_proxy: move update_fence_version from shared_token_metadata

2025-10-22 16:31:43 +02:00

token_metadata.hh

locator/token_metadata: Remove get_host_id()

2025-12-15 10:36:52 +01:00

token_range_splitter.hh

treewide: relicense to ScyllaDB-Source-Available-1.0

2024-12-18 17:45:13 +02:00

topology.cc

locator: topology: Add "draining" flag to a node

2026-01-18 15:36:04 +01:00

topology.hh

locator: topology: Add "draining" flag to a node

2026-01-18 15:36:04 +01:00

types.hh

locator: Make hasher for endpoint_dc_rack globally accessible

2025-10-02 19:45:00 +02:00

util.cc

locator: utils: get_all_ranges, construct_range_to_endpoint_map: use end-bound ranges

2025-08-20 15:15:40 +02:00

util.hh

locator: util: optimize describe_ring

2025-08-13 12:42:25 +03:00