Files
scylladb/service
Botond Dénes a6367cc48c Merge '[Backport 2025.1] topology_coordinator: Add barrier to cleanup_target' from Scylladb[bot]
Consider the following scenario:
1. A table has RF=3 and writes use CL=QUORUM
2. One node is down
3. There is a pending tablet migration from the unavailable node
   that is reverted

During the revert, there can be a time window where the pending replica
being cleaned up still accepts writes. This leads to write failures,
as only two nodes (out of four) are able to acknowledge writes.

This patch fixes the issue by adding a barrier to the cleanup_target
tablet transition state, ensuring that the coordinator switches back to
the previous replica set before cleanup is triggered.

Fixes https://github.com/scylladb/scylladb/issues/26512

It's a pre existing issue. Backport is required to all recent 2025.x versions.

- (cherry picked from commit 669286b1d6)

- (cherry picked from commit 67f1c6d36c)

- (cherry picked from commit 6163fedd2e)

Parent PR: #27413

Closes scylladb/scylladb#27425

* github.com:scylladb/scylladb:
  topology_coordinator: Fix the indentation for the cleanup_target case
  topology_coordinator: Add barrier to cleanup_target
  test_node_failure_during_tablet_migration: Increase RF from 2 to 3
2026-01-08 18:09:37 +02:00
..