Commit Graph

325 Commits

Author SHA1 Message Date
Benny Halevy
888e225113 logalloc: tracker::impl: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:39:16 +03:00
Benny Halevy
f0027f60d4 logalloc: segment: mark functions const / noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:34:56 +03:00
Benny Halevy
830912cfa0 logalloc: segment_pool: add const variant of descriptor method
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:34:48 +03:00
Benny Halevy
f318d1664e logalloc: segment_pool: move descriptor method to class definition
To make the implementation inline and to prepare
for the next patch that adds a const overload of
this method.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:34:37 +03:00
Benny Halevy
35899463d4 logalloc: segment_pool: mark functions const/noexcept
Some methods were also marked inline, both where they are declared in
the class definition and at their definition site, to give the
compiler a hint to inline them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:33:47 +03:00
Benny Halevy
02e74696f2 logalloc: segment_pool: delete unused free_or_restore_to_reserve method
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:33:21 +03:00
Benny Halevy
da87a4a248 logalloc: segment_store, segment_pool: idx_from_segment: get a const segment* in const overload
To maintain the const chain from segment via segment_store to
segment_pool.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:28:21 +03:00
Benny Halevy
947f71ce91 logalloc: segment_store, segment_pool: return const segment* from segment_from_idx() const
Maintain the const chain by returning a const segment*
from segment_from_idx() const overload.

And add a respective mutable overload to return a mutable segment*.

This is done for a similar change in idx_from_segment.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:27:30 +03:00
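The const chain described in 947f71ce91 and da87a4a248 can be sketched roughly as follows. This is a simplified, hypothetical `segment_store` backed by a `std::vector`; the storage layout and the `payload` member are assumptions for illustration and do not reflect the real logalloc classes.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in; the real logalloc segment is more involved.
struct segment { int payload = 0; };

class segment_store {
    std::vector<segment> _segments;
public:
    explicit segment_store(std::size_t n) : _segments(n) {}

    // const overload: a const store only hands out const segments.
    const segment* segment_from_idx(std::size_t idx) const noexcept {
        return &_segments[idx];
    }
    // Mutable overload: only reachable through a non-const store.
    segment* segment_from_idx(std::size_t idx) noexcept {
        return &_segments[idx];
    }
    // Accepts a const segment*, so it is callable from const contexts.
    std::size_t idx_from_segment(const segment* seg) const noexcept {
        return static_cast<std::size_t>(seg - _segments.data());
    }
};

// The overload chosen depends on the constness of the store.
static_assert(std::is_same_v<
    decltype(std::declval<const segment_store&>().segment_from_idx(0)),
    const segment*>);
static_assert(std::is_same_v<
    decltype(std::declval<segment_store&>().segment_from_idx(0)),
    segment*>);
```

With both overloads in place, code holding a `const segment_store&` cannot obtain a mutable `segment*`, while non-const callers keep their existing behavior.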
Benny Halevy
17902da66c logalloc: segment_store: make can_allocate_more_segments const
Add a const noexcept overload of `find_empty()` so that
can_allocate_more_segments can be const noexcept as well.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:26:07 +03:00
Benny Halevy
2ae61d5209 logalloc: segment_store: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:19:58 +03:00
Benny Halevy
852c23b97a logalloc: segment_descriptor: mark functions noexcept
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-27 13:18:15 +03:00
Avi Kivity
5b541bed72 logalloc: drop region_impl public accessors
With the region heap handle removed from logalloc::region, there is
nothing remaining there that needs violation of the abstraction
boundary, so we can drop these hacks.
2022-07-26 11:12:10 +03:00
Avi Kivity
2cb5f79e9d logalloc, dirty_memory_manager: move size-tracking binomial heap out of logalloc
The region_group mechanism used an intrusive heap handle embedded in
logalloc::region to allow region_group:s to track the largest region. But
with region_group moved out of logalloc, the handle is out of place.

Move it out, introducing a new intermediate class size_tracked_region
to hold the heap handle. We might eventually merge the new class into
memtable (which derives from it), but that requires a large rearrangement
of unit tests, so defer that.
2022-07-26 11:12:10 +03:00
Avi Kivity
ee720fa23b logalloc: relax lifetime rules around region_listener
Currently, a region_listener is added during construction and removed
during destruction. This was done to mimic the old region(region_group&)
constructor, as region_listener replaces region_group.

However, this makes moving the binomial heap handle outside logalloc
difficult. The natural place for the handle is in a derived class
of logalloc::region (e.g. memtable), but members of this derived class
will be destroyed earlier than the logalloc::region here. We could play
tricks with an earlier base class but it's better to just decouple
region lifecycle from listener lifecycle.

Do that by adding listen()/unlisten() methods. Some small awkwardness
remains in that merge() implicitly unlistens (see comment in
region::unlisten).

Unit tests are adjusted.
2022-07-26 11:12:10 +03:00
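The decoupling of listener lifetime from region lifetime described in ee720fa23b might look roughly like the sketch below. Only the names `region_listener`, `listen()` and `unlisten()` come from the commit message; the `on_size_change` event, the `grow()` method and `recording_listener` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical listener interface; the event name is an assumption.
class region_listener {
public:
    virtual ~region_listener() = default;
    virtual void on_size_change(std::size_t new_size) = 0;
};

class region {
    region_listener* _listener = nullptr;
    std::size_t _size = 0;
public:
    // Listening is no longer tied to construction and destruction.
    void listen(region_listener* l) noexcept { _listener = l; }
    void unlisten() noexcept { _listener = nullptr; }

    // Hypothetical event source: notifies the listener, if one is attached.
    void grow(std::size_t bytes) {
        _size += bytes;
        if (_listener) {
            _listener->on_size_change(_size);
        }
    }
};

// Hypothetical listener that remembers the last size it was told about.
struct recording_listener : region_listener {
    std::size_t last_size = 0;
    void on_size_change(std::size_t new_size) override { last_size = new_size; }
};
```

Because attachment is explicit, a class derived from `region` can call `unlisten()` in its own destructor, before the members the listener depends on are destroyed.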
Avi Kivity
fbe8ea7727 logalloc, dirty_memory_manager: move region_group and associated code
region_group is an abstraction that allows accounting for groups of
regions, but the cost/benefit ratio of maintaining the abstraction
is poor. Each time we need to change the decision algorithm of memtable
flushing (admittedly rarely), we need to distill that into an abstraction
for region_groups and then use it. An example is virtual region groups;
we wanted to account for the partially flushed memtables and had to
invent region groups to stand in their place.

Rather than continuing to invest in the abstraction, break it now
and move it to the memtable dirty memory manager which is responsible
for making those decisions. The relevant code is moved to
dirty_memory_manager.hh and dirty_memory_manager.cc (new file), and
a new unit test file is added as well.

A downside of the change is that unit testing will be more difficult.
2022-07-26 11:12:10 +03:00
Avi Kivity
bffee2540f logalloc: expose tracker_reclaimer_lock
tracker_reclaimer_lock is used by region_group, which is being moved
out of logalloc, so expose it.
2022-07-26 11:12:10 +03:00
Avi Kivity
4ba0658670 logalloc: reimplement tracker_reclaim_lock to avoid using hidden classes
Right now tracker_reclaim_lock uses tracker::impl::reclaiming_lock,
which won't be visible if we want to expose tracker_reclaim_lock and
use it from another translation unit. However, it's simple to switch
to an implementation that doesn't require an unknown-size data member,
and instead increment a counter via a pointer, so do that.
2022-07-26 11:12:10 +03:00
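A guard that increments a counter through a pointer, as 4ba0658670 describes, could be sketched like this. The `tracker` class here is a hypothetical stand-in, and the counter name `_reclaiming_disabled_depth` and the `reclaiming_enabled()` accessor are assumptions; the guard uses the `tracker_reclaimer_lock` spelling from the adjacent commit.

```cpp
#include <cassert>

// Hypothetical tracker; only the counter matters for this sketch.
class tracker {
    unsigned _reclaiming_disabled_depth = 0;
    friend class tracker_reclaimer_lock;
public:
    bool reclaiming_enabled() const noexcept {
        return _reclaiming_disabled_depth == 0;
    }
};

// The guard holds only a pointer to the counter, so its size is known
// to other translation units without exposing any hidden class.
class tracker_reclaimer_lock {
    unsigned* _depth;
public:
    explicit tracker_reclaimer_lock(tracker& t) noexcept
        : _depth(&t._reclaiming_disabled_depth) {
        ++*_depth;
    }
    ~tracker_reclaimer_lock() { --*_depth; }
    tracker_reclaimer_lock(const tracker_reclaimer_lock&) = delete;
    tracker_reclaimer_lock& operator=(const tracker_reclaimer_lock&) = delete;
};
```

Since the counter is a depth rather than a flag, nested guards compose: reclamation is enabled again only when the outermost guard goes out of scope.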
Avi Kivity
652ab6f4a2 logalloc: reduce friendship between region and region_group
- add conversions between region and region_impl
 - add accessor for the binomial heap handle
 - add accessor for region_impl::id()
 - remove friend declarations

This helps in moving region_group to a different source file, where
the definitions of region_impl will not be visible.
2022-07-26 11:12:10 +03:00
Avi Kivity
c91ee9d04e logalloc: decouple region_group from region
As a first step in moving region_group away from logalloc, decouple
communications between region and region_group. We introduce region_listener,
which listens for the events that region used to pass directly to region_group.
A region_group now installs a region_listener in a region, instead of
having region know about the region_group directly.

This decoupling is still leaky:
 - merge() chooses to forget the merged-from region's region_listener.
  This happens to be suitable for the only user of merge().
 - We're still embedding the binomial heap handle, used by region_group
   to keep track of region sizes, in regions. A complete decoupling would
   transfer that responsibility to region_group.
2022-07-26 11:12:03 +03:00
Michael Livshin
ca21ce8e6f utils: logalloc: fix indentation
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
bcb7404a0e utils: logalloc: split the reclaim_timer in compact_and_evict_locked()
(Into one for the compact part and one for the evict part)

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
007d8fb5c9 utils: logalloc: report segment stats if reclaim_segments() times out
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
1d700442ae utils: logalloc: reclaim_timer: add optional extra log callback
The idea is to let the caller add arbitrary extra info to the timeout
report.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
abd7b9f01c utils: logalloc: reclaim_timer: report non-decreasing durations
The hope is that this reduces logspam without losing utility.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
07fdcb268e utils: logalloc: have reclaim_timer print reserve limits
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
256b911fbd utils: logalloc: move reclaim timer destructor for more readability
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
c15e384507 utils: logalloc: define a proper bundle type for reclaim_timer stats
And define/use arithmetic on it.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
Michael Livshin
0eefbfa3cc utils: logalloc: add arithmetic operations to segment_pool::stats
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
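Arithmetic on a stats bundle, as introduced in c15e384507 and 0eefbfa3cc, lets a timer compute "after minus before" deltas in one expression. The sketch below uses a hypothetical `stats` struct; the field names are assumptions and the real `segment_pool::stats` may contain different counters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counters; the real segment_pool::stats fields may differ.
struct stats {
    std::size_t segments_compacted = 0;
    std::size_t memory_freed = 0;

    stats& operator+=(const stats& o) noexcept {
        segments_compacted += o.segments_compacted;
        memory_freed += o.memory_freed;
        return *this;
    }
    stats& operator-=(const stats& o) noexcept {
        segments_compacted -= o.segments_compacted;
        memory_freed -= o.memory_freed;
        return *this;
    }
};

// Binary forms are built on the compound assignments.
inline stats operator+(stats a, const stats& b) noexcept { return a += b; }
inline stats operator-(stats a, const stats& b) noexcept { return a -= b; }
```

A caller can snapshot the counters on entry and report `current - snapshot` on exit, instead of subtracting each field by hand.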
Michael Livshin
3fced65542 utils: logalloc: have reclaim timers detect being nested
Make sure that inner timers don't waste CPU measuring anything.

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-07-14 19:40:09 +03:00
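Nesting detection of the kind 3fced65542 describes can be sketched with a static pointer to the outermost active timer. This is a hypothetical, single-threaded simplification: the `_active` and `_measuring` members and the `measuring()` accessor are assumptions, and reporting is left out.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical timer; the real reclaim_timer reports far more detail.
class reclaim_timer {
    using clock = std::chrono::steady_clock;
    static inline reclaim_timer* _active = nullptr;  // outermost timer, if any
    bool _measuring;
    clock::time_point _start{};
public:
    reclaim_timer() noexcept : _measuring(_active == nullptr) {
        if (_measuring) {
            _active = this;
            _start = clock::now();  // only the outermost timer reads the clock
        }
    }
    ~reclaim_timer() {
        if (_measuring) {
            _active = nullptr;
            // A real timer would compute clock::now() - _start and report here.
        }
    }
    reclaim_timer(const reclaim_timer&) = delete;
    reclaim_timer& operator=(const reclaim_timer&) = delete;

    bool measuring() const noexcept { return _measuring; }
};
```

An inner timer constructed while another is active skips the clock reads entirely, so adding timers to more call sites does not multiply the measurement cost.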
Benny Halevy
76ca93b779 utils: logalloc: add more reclaim_timers
Measure stalls at higher resolution.

Refs #6189

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
42db63d012 utils: logalloc: move reclaim_timer to compact_and_evict_locked
Track compact_and_evict_locked timing from
all call paths, not only from compact_and_evict.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
fd2b4a4b7d utils: logalloc: pull reclaim_timer definition forward
So it can be used in functions defined earlier in the source file
in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
33785d261e utils: logalloc: reclaim_timer make tracker optional
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
acd82d3b25 utils: logalloc: reclaim_timer: print backtrace if stall detected
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
239992f16c utils: logalloc: reclaim_timer: get call site name
Before adding even more call sites, print the call site
name in the report.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
c4d64c3bf7 utils: logalloc: reclaim_timer: rename set_result
Rename set_result to set_memory_released
to make it clearer what the result means.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
5ce0038e6a utils: logalloc: reclaim_timer: rename _reserve_segments member
Rename reclaim_timer::_reserve_segments to _segments_to_release
as it is clearer and more suitable for later patches
that will add reclaim_timers in more functions.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
Benny Halevy
c34d1a7705 utils: logalloc: reclaim_timer round up microseconds
It is better to report 29000 us than 28999 us.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-07-14 19:40:09 +03:00
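Rounding a duration up to whole microseconds, as c34d1a7705 describes, can be done with `std::chrono::ceil` (C++17). Whether the commit uses this function is an assumption; the helper name `round_up_to_us` is hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Round a nanosecond-resolution duration up to whole microseconds.
// duration_cast would truncate toward zero instead.
inline std::chrono::microseconds round_up_to_us(std::chrono::nanoseconds d) {
    return std::chrono::ceil<std::chrono::microseconds>(d);
}
```

A stall that is just short of 29 ms then shows up as 29000 us in the report rather than 28999 us.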
Avi Kivity
528ab5a502 treewide: change metric calls from make_derive to make_counter
make_derive was recently deprecated in favor of make_counter, so
make the change throughout the codebase.

Closes #10564
2022-05-14 12:53:55 +02:00
Botond Dénes
f8da0a8d1e Merge "Conceptualize some static assertions" From Pavel Emelyanov
"
Some templates put constraints onto the involved types with the help of
static assertions. Having them in form of concepts is much better.

tests: unit(dev)
"

* 'br-static-assert-to-concept' of https://github.com/xemul/scylla:
  sstables: Remove excessive type-match assertions
  mutation_reader: Sanitize invocable assertion and concept
  code: Convert is_future result_of assertions into invoke_result concept
  code: Convert is_same+result_of assertions into invocable concepts
  code: Convert nothrow construction assertions into concepts
  code: Convert is_integral assertions to concepts
2022-02-28 13:58:01 +02:00
Tomasz Grabiec
1d75a8c843 lsa: Abort when trying to free a standard allocator object not
allocated through the region

It indicates an alloc-dealloc mismatch, and can cause other problems in
the system, such as being unable to reclaim memory. We want to catch this
at the deallocation site to be able to quickly identify the offender.

Misbehavior of this sort can cause fake OOMs due to underflow of
_non_lsa_memory_in_use. When it underflows enough,
shard_segment_pool.total_memory() will become 0 and memory reclamation
will stop doing anything.

Refs #10056
2022-02-25 01:42:15 +01:00
Tomasz Grabiec
9dd4153c16 lsa: Abort when _non_lsa_memory_in_use goes negative
It indicates an alloc-dealloc mismatch, and can cause other problems in
the system, such as being unable to reclaim memory. Catch early.

Refs #10056
2022-02-25 01:42:15 +01:00
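The underflow hazard behind 1d75a8c843 and 9dd4153c16 comes from subtracting more from an unsigned counter than was ever added. The sketch below is a hypothetical accounting helper, not the real LSA code; the class name and its methods are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper; the real counter lives inside the LSA segment pool.
class non_lsa_accounting {
    std::size_t _in_use = 0;
public:
    void add(std::size_t n) noexcept { _in_use += n; }

    // True if subtracting n would wrap the unsigned counter around.
    bool would_underflow(std::size_t n) const noexcept { return n > _in_use; }

    void subtract(std::size_t n) noexcept {
        if (would_underflow(n)) {
            // Alloc-dealloc mismatch: fail at the offending call site
            // instead of letting a huge bogus value skew later decisions.
            std::abort();
        }
        _in_use -= n;
    }
    std::size_t in_use() const noexcept { return _in_use; }
};
```

Checking before the subtraction pins the failure to the deallocation that caused it, which is much easier to debug than a fake OOM observed later.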
Pavel Emelyanov
645896335d code: Convert is_same+result_of assertions into invocable concepts
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:46:10 +03:00
Tomasz Grabiec
7ee79fa770 logalloc: Add more logging
Message-Id: <20220127232009.314402-1-tgrabiec@scylladb.com>
2022-01-28 14:12:33 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
4118f2d8be treewide: replace deprecated seastar::later() with seastar::yield()
seastar::later() was recently deprecated and replaced with two
alternatives: a cheap seastar::yield() and an expensive (but more
powerful) seastar::check_for_io_immediately(), that corresponds to
the original later().

This patch replaces all later() calls with the weaker yield(). In
all cases except one, it's unambiguously correct. In one case
(test/perf scheduling_latency_measurer::stop()) it's not so unambiguous,
since check_for_io_immediately() will additionally force a poll and
so will cause more work to be done (but no additional tasks to be
executed). However, I think that any measurement that relies on
measuring the work on the last tick is so inaccurate (you need
thousands of ticks to get any amount of confidence in the
measurement) that in the end it doesn't matter what we pick.

Tests: unit (dev)

Closes #9904
2022-01-12 12:19:19 +01:00
Tomasz Grabiec
7038dc7003 lsa: Fix segment leak on memory reclamation during alloc_buf
alloc_buf() calls new_buf_active() when there is no active segment to
allocate a new active segment. new_buf_active() allocates memory
(e.g. a new segment) so may cause memory reclamation, which may cause
segment compaction, which may call alloc_buf() and re-enter
new_buf_active(). The first call to new_buf_active() would then
override _buf_active and cause the segment allocated during segment
compaction to be leaked.

This then causes abort when objects from the leaked segment are freed
because the segment is expected to be present in _closed_segments, but
isn't. boost::intrusive::list::erase() will fail on assertion that the
object being erased is linked.

Introduced in b5ca0eb2a2.

Fixes #9821
Fixes #9192
Fixes #9825
Fixes #9544
Fixes #9508
Refs #9573

Message-Id: <20211229201443.119812-1-tgrabiec@scylladb.com>
2021-12-30 11:02:08 +02:00
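The re-entrancy described in 7038dc7003 can be modeled with a toy region in which allocating a segment may itself call back into `alloc_buf()`. This is a hypothetical model with integer segment ids, and the guard shown (closing the segment installed by the nested call instead of overwriting it) is one possible way to avoid the leak, not necessarily what the commit does. Only `alloc_buf`, `new_buf_active`, `_buf_active` and `_closed_segments` are names from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model: segments are plain integer ids.
class toy_region {
    std::optional<int> _buf_active;
    std::vector<int> _closed_segments;
    int _next_id = 0;
    bool _reclaim_on_next_allocation = false;  // hypothetical test hook

    int allocate_segment() {
        if (_reclaim_on_next_allocation) {
            _reclaim_on_next_allocation = false;
            alloc_buf();  // stands in for reclamation -> compaction -> alloc_buf()
        }
        return _next_id++;
    }

    void new_buf_active() {
        int seg = allocate_segment();  // a nested call may set _buf_active here
        if (_buf_active) {
            // Keep tracking the segment installed by the nested call;
            // blindly overwriting _buf_active would lose it.
            _closed_segments.push_back(*_buf_active);
        }
        _buf_active = seg;
    }
public:
    void simulate_reclaim_on_next_allocation() noexcept {
        _reclaim_on_next_allocation = true;
    }
    void alloc_buf() {
        if (!_buf_active) {
            new_buf_active();
        }
    }
    std::size_t allocated() const noexcept {
        return static_cast<std::size_t>(_next_id);
    }
    std::size_t tracked() const noexcept {
        return _closed_segments.size() + (_buf_active ? 1 : 0);
    }
};
```

Without the `if (_buf_active)` branch, a nested allocation would leave two segments allocated but only one tracked, which is the leak the commit message describes.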
Avi Kivity
f907205b92 utils: logalloc: correct and adjust timing unit in stall report
The stall report uses the millisecond unit, but actually reports
nanoseconds.

Switch to microseconds (milliseconds are a bit too coarse) and
use the safer "duration / 1us" style rather than "duration::count()"
that leads to unit confusion.

Fixes #9733.

Closes #9734
2021-12-06 09:51:57 +02:00
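The "duration / 1us" style mentioned in f907205b92 divides one duration by another, which yields a plain number in the unit of the divisor regardless of the resolution of the left-hand side. A minimal sketch, where the helper name `in_microseconds` is hypothetical:

```cpp
#include <cassert>
#include <chrono>

using namespace std::chrono_literals;

// Dividing by 1us gives the number of whole microseconds, whatever
// unit the argument was originally expressed in.
inline long long in_microseconds(std::chrono::nanoseconds d) {
    return d / 1us;
}
```

By contrast, `count()` returns the raw tick count in whatever unit the duration type happens to use, so printing it next to a hard-coded unit label is where the confusion arises.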
Tomasz Grabiec
bf6898a5a0 lsa: Add sanity checks around lsa_buffer operations
We've been observing hard-to-explain crashes recently around
lsa_buffer destruction, where the containing segment is absent in
_segment_descs which causes log_heap::adjust_up to abort. Add more
checks to catch certain impossible scenarios which can lead to this
sooner.

Refs #9192.
Message-Id: <20211116122346.814437-1-tgrabiec@scylladb.com>
2021-11-16 14:25:02 +02:00
Tomasz Grabiec
4d627affc3 lsa: Mark compact_segment_locked() as noexcept
We cannot recover from a failure in this method. The implementation
makes sure it never happens. Invariants will be broken if this
throws. Detect violations early by marking as noexcept.

We could make it exception safe and try to leave the data structures
in a consistent state but the reclaimer cannot make progress if this throws, so
it's pointless.

Refs #9192
Message-Id: <20211116122019.813418-1-tgrabiec@scylladb.com>
2021-11-16 14:23:10 +02:00
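Marking a function noexcept, as 4d627affc3 does, means that an exception escaping it calls std::terminate() rather than propagating, so an invariant violation is caught at its source. The `noexcept` operator lets this be checked at compile time. The function bodies below are hypothetical placeholders, and `compact_may_throw` exists only for contrast.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical placeholder for contrast: a function that may throw.
void compact_may_throw() { throw std::runtime_error("broken invariant"); }

// If an exception escaped this function, std::terminate() would be
// called immediately instead of unwinding into the caller.
void compact_segment_locked() noexcept { /* must not throw */ }

// The noexcept operator inspects the declaration without calling anything.
static_assert(noexcept(compact_segment_locked()));
static_assert(!noexcept(compact_may_throw()));
```

Callers, and other noexcept functions built on top, can then rely on the guarantee and assert it at compile time.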