storage_proxy: log gate_closed_exception
A gate_closed_exception likely signals a shutdown-order issue. If we just swallow it, we lose the information about which component was shut down prematurely.

For example, we stopped local storage before group0 during shutdown in main.cc. If a group0 command arrives, topology_state_load might try to write something and get mutation_write_failure_exception, which results in 'applier fiber stopped because of the error'. In this case the logs contain nothing other than 'mutation_write_failure_exception', so it is not clear what the original problem is or which component triggered it.

In this commit we add a warning to the logs when gate_closed_exception is thrown from lmutate or rmutate.

Another option is to just remove the try_catch_nested line and let gate_closed_exception be logged as an error below. However, this might break tests that check for ERROR lines in the logs.
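To illustrate the idea of logging rather than swallowing, here is a minimal standalone sketch. It does not use Seastar or the storage_proxy code: gate_closed_error, find_nested, handle_write_failure, simulate_write_after_shutdown and log_lines are all hypothetical stand-ins invented for this example, and find_nested only approximates what a nested-exception lookup such as try_catch_nested might do.

```cpp
#include <exception>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for seastar::gate_closed_exception.
struct gate_closed_error : std::exception {
    const char* what() const noexcept override { return "gate closed"; }
};

// Hypothetical helper: look for an exception of type T in eptr or in anything
// nested inside it (std::nested_exception). Returns its message if found.
template <typename T>
std::optional<std::string> find_nested(std::exception_ptr eptr) {
    while (eptr) {
        try {
            std::rethrow_exception(eptr);
        } catch (const T& e) {
            return std::string(e.what());
        } catch (const std::nested_exception& n) {
            eptr = n.nested_ptr();  // descend one level and retry
        } catch (...) {
            return std::nullopt;    // unrelated exception, nothing nested
        }
    }
    return std::nullopt;
}

// Hypothetical stand-in for the logger: collected lines instead of real output.
std::vector<std::string> log_lines;

// Instead of silently ignoring a closed gate, record a warning that names the
// target of the write, so a premature shutdown leaves a trace in the logs.
void handle_write_failure(std::exception_ptr eptr, const std::string& target) {
    if (auto msg = find_nested<gate_closed_error>(eptr)) {
        log_lines.push_back("WARN gate_closed_exception during mutation write to "
                            + target + ": " + *msg);
    } else {
        log_lines.push_back("ERROR mutation write to " + target + " failed");
    }
}

// Simulates a write that hits a closed gate and is then wrapped in a more
// generic failure, which hides the original cause from a simple type check.
std::exception_ptr simulate_write_after_shutdown() {
    try {
        try {
            throw gate_closed_error{};
        } catch (...) {
            std::throw_with_nested(std::runtime_error("mutation write failed"));
        }
    } catch (...) {
        return std::current_exception();
    }
    return nullptr;  // not reached
}
```

Calling handle_write_failure(simulate_write_after_shutdown(), "replica-1") appends a WARN line that mentions the closed gate, whereas an unrelated exception falls through to the ERROR branch. A warning keeps the shutdown-order hint visible without adding ERROR lines that tests might reject.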
@@ -4378,8 +4378,10 @@ void storage_proxy::send_to_live_endpoints(storage_proxy::response_id_type respo
     msg = stale->what();
 } else if (try_catch_nested<rpc::closed_error>(eptr)) {
     // ignore, disconnect will be logged by gossiper
-} else if (try_catch_nested<seastar::gate_closed_exception>(eptr)) {
-    // may happen during shutdown, ignore it
+} else if (const auto* e = try_catch_nested<seastar::gate_closed_exception>(eptr)) {
+    // may happen during shutdown, log and ignore it
+    slogger.warn("gate_closed_exception during mutation write to {}: {}",
+        coordinator, e->what());
 } else if (try_catch<timed_out_error>(eptr)) {
     // from lmutate(). Ignore so that logs are not flooded
     // database total_writes_timedout counter was incremented.