The handler is intended to be called when internal invariants are
violated and the operation cannot safely continue. The handler either
throws (default) or aborts, depending on configuration option.
Passing --abort-on-internal-error on the command line will switch to
aborting.
The reason we don't abort by default is that it may bring the whole
cluster down and cause unavailability, while it may not be necessary
to do so. It's safer to fail just the affected operation,
e.g. repair. However, failing the operation with an exception leaves
little information for debugging the root cause. So the idea is that the
user would enable aborts on only one of the nodes in the cluster to
get a core dump and not bring the whole cluster down.
There are various call-sites that explicitly check for EEXIST and
ENOENT:
$ git grep "std::error_code(E"
database.cc: if (e.code() != std::error_code(EEXIST, std::system_category())) {
database.cc: if (e.code() != std::error_code(ENOENT, std::system_category())) {
database.cc: if (e.code() != std::error_code(ENOENT, std::system_category())) {
database.cc: if (e.code() != std::error_code(ENOENT, std::system_category())) {
sstables/sstables.cc: if (e.code() == std::error_code(ENOENT, std::system_category())) {
sstables/sstables.cc: if (e.code() == std::error_code(ENOENT, std::system_category())) {
Commit 961e80a ("Be more conservative when deciding when to shut down
due to disk errors") turned these errors into a storage_io_exception
that is not expected by the callers, which causes 'nodetool snapshot'
functionality to break, for example.
Whitelist the two error codes to revert back to the old behavior of
io_check().
Message-Id: <1465454446-17954-1-git-send-email-penberg@scylladb.com>
Currently we only shut down on EIO. Expand this to shut down on any
system_error.
This may cause us to shut down prematurely due to a transient error,
but this is better than not shutting down due to a permanent error
(such as ENOSPC or EPERM). We may whitelist certain errors in the future
to improve the behavior.
Fixes#1311.
Message-Id: <1465136956-1352-1-git-send-email-avi@scylladb.com>