Files
scylladb/test/cluster
Petr Gusev d922c43358 strong consistency: abort raft server before gate close when dropping a table
When a strongly consistent table is dropped, schedule_raft_group_deletion()
used to call g->close() first, which waits for all in-flight operations to
release their gate holders. But other nodes may have already destroyed their
raft servers for this group, so an in-flight write on the leader cannot
reach quorum and hangs until the client timeout expires, unnecessarily
delaying group deletion.

Fix: initiate gate close (prevents new operations from entering), then
abort the raft server (causes in-flight add_entry/read_barrier to throw
raft::stopped_error, releasing their gate holders), then await the gate
future (resolves immediately since holders are now released).

Handle raft::stopped_error in the coordinator's top-level catch blocks
(both write and read paths): if the table no longer exists, return
no_such_column_family (which the CQL layer converts to InvalidRequest
'unconfigured table'). Otherwise fall through to the default timeout
handling.

Also replace gate->hold() with try_hold() + on_internal_error in
acquire_server, and handle the timeout exception in the wait-for-leader
loop in update() gracefully (log + break instead of propagating).

Fixes: SCYLLADB-2080
2026-05-27 12:06:46 +02:00
..
2026-05-18 12:23:40 +02:00