raft: abort stale snapshot transfers when term changes

**The Bug**

Assertion failure: `SCYLLA_ASSERT(res.second)` in `raft/server.cc`
when creating a snapshot transfer for a destination that already had a
stale in-flight transfer.

**Root Cause**

If a node loses leadership and later becomes leader again before the next
`io_fiber` iteration, the old transfer from the previous term can remain
in `_snapshot_transfers` while `become_leader()` resets progress state.
When the new term emits `install_snapshot(dst)`, `send_snapshot(dst)`
tries to create a new entry for the same destination and can hit the
assertion.

**The Fix**

Abort all in-flight snapshot transfers in `process_fsm_output()` when
`term_and_vote` is persisted. A term/vote change marks existing transfers
as stale, so we clean them up before dispatching messages from that batch
and before any new snapshot transfer is started.

With cross-term cleanup moved to the term-change path, `send_snapshot()`
now asserts the within-term invariant that there is at most one in-flight
transfer per destination.

Fixes: SCYLLADB-862

Backport: The issue is reproducible in master, but is present in all
active branches.

Closes scylladb/scylladb#29092

(cherry picked from commit 9dad68e58d)

Closes scylladb/scylladb#29264
This commit is contained in:
Emil Maskovsky
2026-03-17 13:50:14 +00:00
committed by Avi Kivity
parent 70b7652e64
commit 666d0440f1

View File

@@ -1101,6 +1101,18 @@ future<> server_impl::process_fsm_output(index_t& last_stable, fsm_output&& batc
// case.
co_await _persistence->store_term_and_vote(batch.term_and_vote->first, batch.term_and_vote->second);
_stats.store_term_and_vote++;
// When the term advances, any in-flight snapshot transfers
// belong to an outdated term: the progress tracker has been
// reset in become_leader() or we are now a follower.
// Abort them before we dispatch this batch's messages, which
// may start fresh transfers for the new term.
//
// A vote may also change independently of the term (e.g. a
// follower voting for a candidate at the same term), but in
// that case there are no in-flight transfers and the abort
// is a no-op.
abort_snapshot_transfers();
}
if (batch.snp) {
@@ -1210,8 +1222,6 @@ future<> server_impl::process_fsm_output(index_t& last_stable, fsm_output&& batc
// quickly) stop happening (we're outside the config after all).
co_await _apply_entries.push_eventually(removed_from_config{});
}
// request aborts of snapshot transfers
abort_snapshot_transfers();
// abort all read barriers
for (auto& r : _reads) {
r.promise.set_value(not_a_leader{_fsm->current_leader()});