mirror of
https://github.com/scylladb/scylladb.git
synced 2026-04-24 18:40:38 +00:00
all_sibling_tablet_replicas_colocated was using the committed ti.replicas to decide whether sibling tablets are co-located and whether merge can be finalized. This caused a false non-co-located window when a co-located pair was moved by the load balancer: both tablets migrate together, but their del_transition commits may land in different Raft rounds. After the first commit, the siblings' ti.replicas diverge temporarily (one tablet shows the new position, the other the old), causing all_sibling_tablet_replicas_colocated to return false. This clears finalize_resize, allowing the load balancer to start new cascading migrations that delay merge finalization by tens of seconds.

Fix this by using the optimistic replica view (trinfo->next when transitioning, ti.replicas otherwise), which is the same view the load balancer uses for load accounting, so finalize_resize stays populated throughout an in-flight migration and no spurious cascades are triggered.

Steps that lead to the problem:

1. Merge is triggered. The load balancer generates co-location migrations for all sibling pairs that are not yet on the same shard. Some pairs finish co-location before others.

2. Once all pairs are co-located in committed state, all_sibling_tablet_replicas_colocated returns true and finalize_resize is set. Meanwhile the load balancer may have already started a regular LB migration on one co-located pair (both tablets are stable, so the load balancer is free to move them).

3. The LB migration moves both tablets together (colocated_tablets). Their two del_transition commits land in separate Raft rounds. After the first commit, ti.replicas[t1] = new position but ti.replicas[t2] = old position.

4. In this window, all_sibling_tablet_replicas_colocated sees the pair as NOT co-located, clears finalize_resize, and the load balancer generates new migrations for other tablets to rebalance the load that the pair move created.

5. Those new migrations can take tens of seconds to stream, keeping the coordinator in handle_tablet_migration mode and preventing maybe_start_tablet_resize_finalization from being called. The merge finalization is delayed until all those cascaded migrations complete.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-821.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1459.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes scylladb/scylladb#29465
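
For illustration only, the difference between the committed and the optimistic view in the window after the first del_transition commit can be sketched as below. This is a simplified model, not the code from this change: the types replica, tablet_info, transition_info, tablet_state and sibling_pair, and the helpers optimistic_replicas, same_replicas, siblings_colocated_committed, siblings_colocated_optimistic and after_first_del_transition_commit are hypothetical stand-ins that only borrow the field names ti.replicas and trinfo->next from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT ScyllaDB's real types.
struct replica {
    int host;
    int shard;
};

inline bool operator==(const replica& a, const replica& b) {
    return a.host == b.host && a.shard == b.shard;
}

inline bool operator<(const replica& a, const replica& b) {
    return std::tie(a.host, a.shard) < std::tie(b.host, b.shard);
}

using replica_set = std::vector<replica>;

struct tablet_info { replica_set replicas; };   // committed replicas
struct transition_info { replica_set next; };   // target of an in-flight migration

struct tablet_state {
    tablet_info ti;
    std::optional<transition_info> trinfo;      // engaged only while transitioning
};

// Optimistic view: trinfo->next when transitioning, ti.replicas otherwise.
const replica_set& optimistic_replicas(const tablet_state& t) {
    return t.trinfo ? t.trinfo->next : t.ti.replicas;
}

// Order-insensitive comparison of two replica sets (takes copies to sort them).
bool same_replicas(replica_set a, replica_set b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}

// Old behaviour in this model: compare committed replicas only.
bool siblings_colocated_committed(const tablet_state& t1, const tablet_state& t2) {
    return same_replicas(t1.ti.replicas, t2.ti.replicas);
}

// New behaviour in this model: compare the optimistic views.
bool siblings_colocated_optimistic(const tablet_state& t1, const tablet_state& t2) {
    return same_replicas(optimistic_replicas(t1), optimistic_replicas(t2));
}

struct sibling_pair {
    tablet_state t1;
    tablet_state t2;
};

// State from step 3: t1's del_transition has committed, t2's has not yet.
sibling_pair after_first_del_transition_commit() {
    const replica_set old_pos{replica{1, 0}};
    const replica_set new_pos{replica{2, 3}};
    sibling_pair p;
    p.t1 = tablet_state{tablet_info{new_pos}, std::nullopt};             // migration done
    p.t2 = tablet_state{tablet_info{old_pos}, transition_info{new_pos}}; // still in flight
    return p;
}
```

In this model, siblings_colocated_committed reports the pair from after_first_del_transition_commit() as not co-located, while siblings_colocated_optimistic still reports it as co-located, which is the property that keeps finalize_resize populated during the move.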