The logger is proof against allocation failures, except if
--abort-on-seastar-bad-alloc is specified. If it is, it will crash.
The reclaim stall report is likely to be called in low memory conditions
(reclaim's job is to alleviate these conditions after all), so we're
likely to crash here if we're reclaiming under a very low memory
condition and have a large stall simultaneously (AND we're running in a
debug environment).
Prevent all this by disabling --abort-on-seastar-bad-alloc temporarily.
Fixes #11549
Closes #11555
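The idea of temporarily disabling the abort behaviour can be sketched as a scoped guard. This is a minimal standalone illustration, not the Seastar implementation: the `sketch` namespace, the two thread-local variables, `disable_abort_temporarily` and `report_reclaim_stall()` are all hypothetical names introduced for the example.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-ins for the allocator state. The real state lives in
// Seastar and is not modelled here.
static thread_local bool abort_on_bad_alloc = false;  // the command-line setting
static thread_local unsigned abort_suppressed = 0;    // >0 while a guard is alive

// True if an allocation failure would abort the process right now.
inline bool would_abort_on_alloc_failure() noexcept {
    return abort_on_bad_alloc && abort_suppressed == 0;
}

// RAII guard (assumed name): suppresses abort-on-bad-alloc for its lifetime.
// Guards nest, because they only bump a counter.
class disable_abort_temporarily {
public:
    disable_abort_temporarily() noexcept { ++abort_suppressed; }
    ~disable_abort_temporarily() { --abort_suppressed; }
    disable_abort_temporarily(const disable_abort_temporarily&) = delete;
    disable_abort_temporarily& operator=(const disable_abort_temporarily&) = delete;
};

// Stand-in for the stall report. Returns whether logging inside the report
// could still abort on an allocation failure.
inline bool report_reclaim_stall() {
    disable_abort_temporarily guard;
    // ... format and log the stall report here ...
    return would_abort_on_alloc_failure();
}

} // namespace sketch
```

With the guard in place, the setting is restored as soon as the report returns, so allocation failures elsewhere still abort as requested.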
We want to consolidate all the logalloc state into a single object: the
shard tracker. Replacing this global with a member in said object is
part of this effort.
These are pretend free functions, accessing globals in the background.
Make them members of the tracker instead, which has everything needed
locally to compute them. Callers still have to access these stats
through the global tracker instance, but this can be changed to happen
through a local instance. Soon.
Instead, get the tracker instance from the region. This requires adding
a `region&` parameter to `with_reserve()`.
This brings us one step closer to eliminating the global tracker.
Instead of a separate global segment pool instance, make it a member of
the already global tracker. Most users are inside the tracker instance
anyway. Outside users can access the pool through the global tracker
instance.
For now this member is initialized from the global tracker instance. But
it allows the members of region impl to be detached from said global,
making a step towards removing it.
segment has some members, which simply forward the call to a
segment_pool method, via the global segment_pool instance. Remove these
and make the callers use the segment pool directly instead.
Don't open-code calling the region_impl
_listeners->moved() in region move-constructor
and move-assignment op.
The other._impl->_region might be different than &other
post region::merge, so let the region_impl
decide which region* is moved from.
The new_region is also set to region_impl->_region,
so there is no need to open-code that either at said call sites.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The other _impl is presumed to be engaged already,
so just call other.get_impl() once for both use cases.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
We can't be sure that the other_impl->_region == &other
since it could be a result of a previous merge,
so don't decide for it which region to unlisten to,
let it use its current _region.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Current ~region and region::operator= open-code
region_impl::unlisten. Just call it so it will be
easier to maintain.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
First check if _impl is engaged before accessing it
to set its _region = this in the move constructor and
move assignment operator.
Add a unit test for these odd corner cases.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
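The engaged check described above can be sketched as follows. This is a simplified illustration under assumptions: the real `region` and `region_impl` carry much more state, and the use of `std::unique_ptr` plus the `engaged()` and `owner()` helpers are hypothetical choices made for the example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

namespace sketch {

class region;

// Simplified stand-in: the impl keeps a back-pointer to its owning region.
struct region_impl {
    region* _region = nullptr;
};

class region {
    std::unique_ptr<region_impl> _impl;
public:
    region() : _impl(std::make_unique<region_impl>()) { _impl->_region = this; }

    region(region&& other) noexcept : _impl(std::move(other._impl)) {
        // other may have been moved from already, leaving _impl disengaged.
        if (_impl) {
            _impl->_region = this;
        }
    }

    region& operator=(region&& other) noexcept {
        if (this != &other) {
            _impl = std::move(other._impl);
            if (_impl) {
                _impl->_region = this;
            }
        }
        return *this;
    }

    // Hypothetical helpers, used only to observe the state in this sketch.
    bool engaged() const noexcept { return static_cast<bool>(_impl); }
    const region* owner() const noexcept { return _impl ? _impl->_region : nullptr; }
};

} // namespace sketch
```

Moving from an already moved-from region then yields another disengaged region instead of dereferencing a null `_impl`.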
It was added in d20fae96a2
as a precaution not to invalidate iterators while
traversing _regions. However it is not required as no allocation
is done on this synchronous path - therefore there is no
point in preventing reclaim.
This will allow making the respective functions const
as they merely return stats and do not modify the tracker impl.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To make the implementation inline and to prepare
for the next patch that adds a const overload of
this method.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Some methods were also marked inline when declared in the class
definition and at their definition site to provide a hint to
the compiler to inline them.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Maintain the const chain by returning a const segment*
from segment_from_idx() const overload.
And add a respective mutable overload to return a mutable segment*.
This is done for a similar change in idx_from_segment.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
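A minimal sketch of the const/mutable overload pair, assuming a simplified store backed by a `std::vector`. The class name `segment_store` and the `payload` member are hypothetical; only `segment_from_idx` and `idx_from_segment` are names taken from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

namespace sketch {

struct segment {
    int payload = 0;  // placeholder content for the example
};

class segment_store {
    std::vector<segment> _segments;
public:
    explicit segment_store(std::size_t n) : _segments(n) {}

    // Const overload: a const store only hands out const segments,
    // so constness propagates to the caller.
    const segment* segment_from_idx(std::size_t idx) const noexcept {
        return &_segments[idx];
    }

    // Mutable overload for callers that need to modify the segment.
    segment* segment_from_idx(std::size_t idx) noexcept {
        return &_segments[idx];
    }

    std::size_t idx_from_segment(const segment* seg) const noexcept {
        return static_cast<std::size_t>(seg - _segments.data());
    }
};

} // namespace sketch
```

Overload resolution picks the const version whenever the store is accessed through a const reference, which is what lets stats-only callers become const themselves.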
Add a const noexcept overload of `find_empty()` so that
can_allocate_more_segments can be const noexcept as well.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
With the region heap handle removed from logalloc::region, there is
nothing remaining there that needs violation of the abstraction
boundary, so we can drop these hacks.
The region_group mechanism used an intrusive heap handle embedded in
logalloc::region to allow region_group:s to track the largest region. But
with region_group moved out of logalloc, the handle is out of place.
Move it out, introducing a new intermediate class size_tracked_region
to hold the heap handle. We might eventually merge the new class into
memtable (which derives from it), but that requires a large rearrangement
of unit tests, so defer that.
Currently, a region_listener is added during construction and removed
during destruction. This was done to mimic the old region(region_group&)
constructor, as region_listener replaces region_group.
However, this makes moving the binomial heap handle outside logalloc
difficult. The natural place for the handle is in a derived class
of logalloc::region (e.g. memtable), but members of this derived class
will be destroyed earlier than the logalloc::region here. We could play
tricks with an earlier base class but it's better to just decouple
region lifecycle from listener lifecycle.
Do that by adding listen()/unlisten() methods. Some small awkwardness
remains in that merge() implicitly unlistens (see comment in
region::unlisten).
Unit tests are adjusted.
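The destruction-order problem and the explicit listen()/unlisten() remedy can be sketched like this. It is an illustration under assumptions, not the actual code: `group`, `handle_linked` and the shape of `size_tracked_region` are hypothetical, and a boolean stands in for the binomial heap handle.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

class region;

class region_listener {
public:
    virtual ~region_listener() = default;
    virtual void add(region* r) = 0;
    virtual void del(region* r) = 0;
};

// Base class: the listener is no longer tied to construction/destruction.
class region {
    region_listener* _listener = nullptr;
public:
    void listen(region_listener* l) { _listener = l; _listener->add(this); }
    void unlisten() {
        if (_listener) {
            _listener->del(this);
            _listener = nullptr;
        }
    }
};

class size_tracked_region;

// Hypothetical listener that touches state stored in the derived class.
class group final : public region_listener {
    std::size_t _linked = 0;
public:
    void add(region* r) override;
    void del(region* r) override;
    std::size_t linked() const noexcept { return _linked; }
};

// Derived class holding the handle (here just a flag).
class size_tracked_region : public region {
public:
    bool handle_linked = false;  // stand-in for the heap handle
    explicit size_tracked_region(group& g) { listen(&g); }
    // unlisten() runs while handle_linked is still alive. Doing this from
    // ~region() would be too late: derived members are already destroyed.
    ~size_tracked_region() { unlisten(); }
};

inline void group::add(region* r) {
    static_cast<size_tracked_region*>(r)->handle_linked = true;
    ++_linked;
}

inline void group::del(region* r) {
    auto* tracked = static_cast<size_tracked_region*>(r);
    assert(tracked->handle_linked);
    tracked->handle_linked = false;
    --_linked;
}

} // namespace sketch
```

Because the derived destructor body runs before its members are destroyed, the listener can still safely reach the handle when it is told to forget the region.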
region_group is an abstraction that allows accounting for groups of
regions, but the cost/benefit ratio of maintaining the abstraction
is poor. Each time we need to change the decision algorithm of memtable
flushing (admittedly rarely), we need to distill that into an abstraction
for region_groups and then use it. An example is virtual region groups;
we wanted to account for the partially flushed memtables and had to
invent region groups to stand in their place.
Rather than continuing to invest in the abstraction, break it now
and move it to the memtable dirty memory manager which is responsible
for making those decisions. The relevant code is moved to
dirty_memory_manager.hh and dirty_memory_manager.cc (new file), and
a new unit test file is added as well.
A downside of the change is that unit testing will be more difficult.
Right now tracker_reclaim_lock uses tracker::impl::reclaiming_lock,
which won't be visible if we want to expose tracker_reclaim_lock and
use it from another translation unit. However, it's simple to switch
to an implementation that doesn't require an unknown-size data member,
and instead increments a counter via a pointer, so do that.
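A lock of this shape can be sketched as below. The sketch assumes a simplified `tracker` with a single depth counter; the accessor `reclaiming_disabled_depth()` and `can_reclaim()` are hypothetical names chosen for the example.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical tracker: reclaim is allowed only while the counter is zero.
class tracker {
    unsigned _reclaiming_disabled_depth = 0;
public:
    unsigned& reclaiming_disabled_depth() noexcept { return _reclaiming_disabled_depth; }
    bool can_reclaim() const noexcept { return _reclaiming_disabled_depth == 0; }
};

// The lock stores only a pointer, so its size is known without seeing the
// tracker's private implementation. It can therefore be declared in a
// header and used from other translation units.
class tracker_reclaim_lock {
    unsigned* _depth;
public:
    explicit tracker_reclaim_lock(tracker& t) noexcept
        : _depth(&t.reclaiming_disabled_depth()) {
        ++*_depth;
    }
    ~tracker_reclaim_lock() { --*_depth; }
    tracker_reclaim_lock(const tracker_reclaim_lock&) = delete;
    tracker_reclaim_lock& operator=(const tracker_reclaim_lock&) = delete;
};

} // namespace sketch
```

Using a counter rather than a flag keeps nested locks correct: reclaim is re-enabled only when the outermost lock is released.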
- add conversions between region and region_impl
- add accessor for the binomial heap handle
- add accessor for region_impl::id()
- remove friend declarations
This helps in moving region_group to a different source file, where
the definitions of region_impl will not be visible.
As a first step in moving region_group away from logalloc, decouple
communications between region and region_group. We introduce region_listener,
which listens for the events that region previously passed directly to
region_group. A region_group now installs a region_listener in a region,
instead of having region know about the region_group directly.
This decoupling is still leaky:
- merge() chooses to forget the merged-from region's region_listener.
This happens to be suitable for the only user of merge().
- We're still embedding the binomial heap handle, used by region_group
to keep track of region sizes, in regions. A complete decoupling would
transfer that responsibility to region_group.
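The decoupling, including the merge() behaviour noted in the first bullet, can be sketched as follows. The event names (`add`, `del`, `usage_changed`) and the accounting in `region_group` are assumptions made for the example and may not match the real interface.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

class region;

// The region reports events to an abstract listener and knows nothing
// about region_group.
class region_listener {
public:
    virtual ~region_listener() = default;
    virtual void add(region* r) = 0;
    virtual void del(region* r) = 0;
    virtual void usage_changed(region* r, std::ptrdiff_t delta) = 0;
};

class region {
    region_listener* _listener = nullptr;
    std::size_t _used = 0;
public:
    void listen(region_listener* l) {
        _listener = l;
        _listener->add(this);
        _listener->usage_changed(this, static_cast<std::ptrdiff_t>(_used));
    }
    void unlisten() {
        if (_listener) {
            _listener->usage_changed(this, -static_cast<std::ptrdiff_t>(_used));
            _listener->del(this);
            _listener = nullptr;
        }
    }
    void allocate(std::size_t n) {
        _used += n;
        if (_listener) {
            _listener->usage_changed(this, static_cast<std::ptrdiff_t>(n));
        }
    }
    // Absorb other's memory. The merged-from region forgets its listener.
    void merge(region& other) {
        std::size_t moved = other._used;
        other.unlisten();
        other._used = 0;
        _used += moved;
        if (_listener) {
            _listener->usage_changed(this, static_cast<std::ptrdiff_t>(moved));
        }
    }
};

// region_group is now just one implementation of the listener.
class region_group final : public region_listener {
    std::size_t _regions = 0;
    std::ptrdiff_t _total = 0;
public:
    void add(region*) override { ++_regions; }
    void del(region*) override { --_regions; }
    void usage_changed(region*, std::ptrdiff_t delta) override { _total += delta; }
    std::size_t regions() const noexcept { return _regions; }
    std::ptrdiff_t total() const noexcept { return _total; }
};

} // namespace sketch
```

In this sketch the group's total is unchanged by a merge between two of its own regions, while the number of tracked regions drops by one.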