This avoids potential use-after-move, since undefined c++ sequencing order
may std::move(f) in the lambda capture before evaluating f.stat().
Also, this makes use of a more generic library function that doesn't
require to open and hold on to the file in the application.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200514152054.162168-1-bhalevy@scylladb.com>
This removes the need to include reactor.hh, a source of compile
time bloat.
In some places, the call is qualified with seastar:: in order
to resolve ambiguities with a local name.
Includes are adjusted to make everything compile. We end up
having 14 translation units including reactor.hh, primarily for
deprecated things like reactor::at_exit().
Ref #1
This allows us to drop a #include <reactor.hh>, reducing compile time.
Several translation units that lost access to required declarations
are updated with the required includes (this can be an include of
reactor.hh itself, in case the translation unit that lost it got it
indirectly via logalloc.hh)
Ref #1.
The disk-error-handler is purely auxiliary thing that helps
propagating IO errors to the rest of the code. It well
deserves not sitting in the root namespace.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200207112443.18475-1-xemul@scylladb.com>
By default, semaphore exceptions bring along very little context:
either that a semaphore was broken or that it timed out.
In order to make debugging easier without introducing significant
runtime costs, a notion of named semaphore is added.
A named semaphore is simply a semaphore with statically defined
name, which is present in its errors, bringing valuable context.
A semaphore defined as:
auto sem = semaphore(0);
will present the following message when it breaks:
"Semaphore broken"
However, a named semaphore:
auto named_sem = named_semaphore(0, named_semaphore_exception_factory{"io_concurrency_sem"});
will present a message with at least some debugging context:
"Semaphore broken: io_concurrency_sem"
It's not much, but it would really help in pinpointing bugs
without having to inspect core dumps.
At the same time, it does not incur any costs for normal
semaphore operations (except for its creation), but instead
only uses more CPU in case an error is actually thrown,
which is considered rare and not to be on the hot path.
Refs #4999
Tests: unit(dev), manual: hardcoding a failure in view building code
The endpoint directories scanned by space_watchdog may get deleted
by the manager::drain_for().
If a deleted directory is given to a lister::scan_dir() this will end up
in an exception and as a result a space_watchdog will skip this round
and hinted handoff is going to be disabled (for all agents including MVs)
for the whole space_watchdog round.
Let's make sure this doesn't happen by serializing the scanning and deletion
using end_point_hints_manager::file_update_mutex.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
end_point_hints_manager::file_update_mutex is taken by space_watchdog
but while space_watchdog is waiting for it the corresponding
end_point_hints_manager instance may get destroyed by manager::drain_for()
or by manager::stop().
This will end up in a use-after-free event.
Let's change the end_point_hints_manager's API in a way that would prevent
such an unsafe locking:
- Introduce the with_file_update_mutex().
- Make end_point_hints_manager::file_update_mutex() method private.
Fixes#4685Fixes#4836
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Rather than {std::experimental,boost,seastar::compat}::filesystem
On Sat, 2019-03-23 at 01:44 +0200, Avi Kivity wrote:
> The intent for seastar::compat was to allow the application to choose
> the C++ dialect and have seastar follow, rather than have seastar choose
> the types and have the application follow (as in your patch).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We would like to get rid of boost::filesystem and gradually replace it with
std::experimental::filesystem.
TODO: using namespace fs = std::experimental::filesystem,
use fs::path directly, rather than lister::path
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
When messaging_service is started we may immediately receive a mutation
from another node (e.g. in the MV update context). If hinted handoff is not
ready to store hints at that point we may fail some of MV updates.
We are going to resolve this by start()ing hints::managers before we
start messaging_service and blocking hints replaying until all relevant
objects are initialized.
Refs #3828
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
The space_watchdog enables or disables hints for the managers
associated with a particular device. We encapsulate this decision
inside the hints::managers by introducing the update_backlog()
function.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
A db::hints::resource_manager manages the resources for one or two
db::hints::managers. Each of these can be using the same or different
devices. The db::hints::space_watchdog periodically checks whether
each manager is within their resource allocation, and if not disables
it.
The watchdog iterates over the managers and accounts for the total
size they are using. This is wrong, since it can account in the same
variable the size consumed by managers using different devices.
We fix this while taking advantage of the fact that on_timer is now
called in the context of a seastar::thread, instead of using future
combinators.
Fixes#3821
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Registering a manager for a new device used
std::unordered_map::emplace(), which may not insert the specified
value if one with the same key has already been added. This could
happen if both managers were using the same device and the fiber
deferred in-between adding them.
Found during code reading. Could cause hints to not be disabled for an
overloaded manager.
Fixes#3822
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Reserving 10% of space for hints managers makes sense if the device
is shared with other components (like /data or /commitlog).
But, if hints directory is mounted on a dedicated storage, it makes
sense to reserve much more - 90% was chosen as a sane limit.
Whether storage is 'dedicated' or not is based on a simple check
if given hints directory is a mount point.
Fixes#3516
Signed-off-by: Piotr Sarna <sarna@scylladb.com>
Instead of having one static space limit for all directories,
space_watchdog now keeps a per-device limit, shared among
hints managers residing on the same disks.
References #3516
Signed-off-by: Piotr Sarna <sarna@scylladb.com>
In order to distinguish which directories reside on which devices,
get_device_id function is added to resource manager.
Signed-off-by: Piotr Sarna <sarna@scylladb.com>
Previously max_shard_disk_space_size was unconditionally initialized
with the capacity of hints_directory. But, it's likely that
hints_directory doesn't exist at all if hinted handoff is not enabled,
which results in Scylla failing to boot.
So, max_shard_disk_space_size is now initialized with the capacity
of hints_for_views directory, which is always present.
This commit also moves max_shard_disk_space_size to the .cc file
where it belongs - resource_manager.cc.
Tests: unit (release)
Message-Id: <9f7b86b6452af328c05c5c6c55bfad3382e12445.1528977363.git.sarna@scylladb.com>