From ba10314b16437cabfe5d69fbab54e88d10d96afe Mon Sep 17 00:00:00 2001
From: Emil Maskovsky
Date: Tue, 17 Mar 2026 13:50:14 +0000
Subject: [PATCH] raft: abort stale snapshot transfers when term changes

**The Bug**

Assertion failure: `SCYLLA_ASSERT(res.second)` in `raft/server.cc`
when creating a snapshot transfer for a destination that already had a
stale in-flight transfer.

**Root Cause**

If a node loses leadership and later becomes leader again before the
next `io_fiber` iteration, the old transfer from the previous term can
remain in `_snapshot_transfers` while `become_leader()` resets progress
state. When the new term emits `install_snapshot(dst)`,
`send_snapshot(dst)` tries to create a new entry for the same
destination and can hit the assertion.

**The Fix**

Abort all in-flight snapshot transfers in `process_fsm_output()` when
`term_and_vote` is persisted. A term/vote change marks existing
transfers as stale, so we clean them up before dispatching messages
from that batch and before any new snapshot transfer is started.

With cross-term cleanup moved to the term-change path,
`send_snapshot()` now asserts the within-term invariant that there is
at most one in-flight transfer per destination.

Fixes: SCYLLADB-862

Backport: The issue is reproducible in master, but is present in all
active branches.

Closes scylladb/scylladb#29092

(cherry picked from commit 9dad68e58d3e0d4e339a1b97b8dc0612d1424ddf)

Closes scylladb/scylladb#29264
Closes scylladb/scylladb#29357
---
 raft/server.cc | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/raft/server.cc b/raft/server.cc
index 188dd87d53..5d1a1a65d1 100644
--- a/raft/server.cc
+++ b/raft/server.cc
@@ -1102,6 +1102,18 @@ future<> server_impl::process_fsm_output(index_t& last_stable, fsm_output&& batc
         // case.
         co_await _persistence->store_term_and_vote(batch.term_and_vote->first, batch.term_and_vote->second);
         _stats.store_term_and_vote++;
+
+        // When the term advances, any in-flight snapshot transfers
+        // belong to an outdated term: the progress tracker has been
+        // reset in become_leader() or we are now a follower.
+        // Abort them before we dispatch this batch's messages, which
+        // may start fresh transfers for the new term.
+        //
+        // A vote may also change independently of the term (e.g. a
+        // follower voting for a candidate at the same term), but in
+        // that case there are no in-flight transfers and the abort
+        // is a no-op.
+        abort_snapshot_transfers();
     }
 
     if (batch.snp) {
@@ -1211,8 +1223,6 @@ future<> server_impl::process_fsm_output(index_t& last_stable, fsm_output&& batc
                 // quickly) stop happening (we're outside the config after all).
                 co_await _apply_entries.push_eventually(removed_from_config{});
             }
-            // request aborts of snapshot transfers
-            abort_snapshot_transfers();
             // abort all read barriers
             for (auto& r : _reads) {
                 r.promise.set_value(not_a_leader{_fsm->current_leader()});
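
The ordering described in the commit message can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not the ScyllaDB implementation: `toy_server`, `toy_batch`, `process_output()`, `in_flight()`, `aborted()` and `transfer_term()` are hypothetical names invented for the illustration, a transfer is reduced to a map entry, and coroutines, persistence and real abort handling are left out. It shows why clearing the map when `term_and_vote` is present, before the batch's `install_snapshot` messages are dispatched, keeps the "at most one in-flight transfer per destination" assertion from firing across terms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; not the real raft types.
using server_id = int;
using term_t = std::uint64_t;

struct toy_batch {
    // Present when a new term/vote has to be persisted.
    std::optional<std::pair<term_t, server_id>> term_and_vote;
    // Destinations that need a snapshot in this batch.
    std::vector<server_id> install_snapshot;
};

class toy_server {
    // dst -> term in which the transfer was started.
    std::map<server_id, term_t> _snapshot_transfers;
    term_t _term = 0;
    std::size_t _aborted = 0;

    void abort_snapshot_transfers() {
        // The real code requests aborts; the model just drops the entries.
        _aborted += _snapshot_transfers.size();
        _snapshot_transfers.clear();
    }

    void send_snapshot(server_id dst) {
        auto res = _snapshot_transfers.emplace(dst, _term);
        // Within one term there is at most one transfer per destination.
        assert(res.second);
        (void)res;
    }

public:
    void process_output(const toy_batch& batch) {
        if (batch.term_and_vote) {
            _term = batch.term_and_vote->first;
            // Transfers from the previous term are stale: drop them
            // before any message of this batch is dispatched.
            abort_snapshot_transfers();
        }
        for (server_id dst : batch.install_snapshot) {
            send_snapshot(dst);
        }
    }

    std::size_t in_flight() const { return _snapshot_transfers.size(); }
    std::size_t aborted() const { return _aborted; }
    term_t transfer_term(server_id dst) const { return _snapshot_transfers.at(dst); }
};
```

In this model, feeding two batches that both carry a `term_and_vote` and an `install_snapshot` for the same destination leaves exactly one in-flight entry, tagged with the newer term. Without the call to `abort_snapshot_transfers()` in `process_output()`, the second `emplace` would fail and trip the assertion, which mirrors the failure described under "The Bug".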