Commit Graph

28 Commits

Author SHA1 Message Date
Botond Dénes
a14bb4ba94 reader_permit: add inactive state
This state will be used for permits that are not in admitted state when
registered as inactive. We can have such reads if a read can be served
entirely from cache/memtables and it doesn't have to go to disk and
hence doesn't go through admission. These permits currently don't
forward their cost to the semaphore so they won't prevent their own
admission creating a deadlock. However, when in inactive state, we do
want to keep tabs on their resource consumption so we don't accumulate
too much of these inactive reads. So introduce a new state for these
non-admitted inactive reads. When entering the inactive state, the
permit registers its cost with the semaphore, and when unregistered as
inactive, it retracts it. This is a workaround (khm hack) until #4758 is
solved and all permits will be admitted on creation.
2021-03-18 14:58:21 +02:00
Botond Dénes
18454e4a80 reader_concurrency_semaphore: dump permit diagnostics on timeout or queue overflow
The reader concurrency semaphore timing out or its queue being overflown
are fairly common events both in production and in testing. At the same
time it is a hard to diagnose problem that often has a benign cause
(especially during testing), but it is equally possible that it points
to something serious. So when this error starts to appear in logs,
usually we want to investigate and the investigation is lengthy...
either involves looking at metrics or coredumps or both.

This patch intends to jumpstart this process by dumping a diagnostics on
semaphore timeout or queue overflow. The diagnostics is printed to the
log with debug level to avoid excessive spamming. It contains a
histogram of all the permits associated with the problematic semaphore
organized by table, operation and state.

Example:

DEBUG 2020-10-08 17:05:26,115 [shard 0] reader_concurrency_semaphore -
Semaphore _read_concurrency_sem: timed out, dumping permit diagnostics:
Permits with state admitted, sorted by memory
memory  count   name
3499M   27      ks.test:data-query

3499M   27      total

Permits with state waiting, sorted by count
count   memory  name
1       0B      ks.test:drain
7650    0B      ks.test:data-query

7651    0B      total

Permits with state registered, sorted by count
count   memory  name

0       0B      total

Total: permits: 7678, memory: 3499M

This allows determining several things at glance:
* What are the tables involved
* What are the operations involved
* Where is the memory

This can speed up a follow-up investigation greatly, or it can even be
enough on its own to determine that the issue is benign.
2020-10-13 12:32:14 +03:00
Botond Dénes
70fa543c31 reader_concurrency_semaphore: add state to permits
Instead of a simple boolean, designating whether the permit was already
admitted or not, add a proper state field with a value for all the
different states the permit can be in. Currently there are three such
states:
* registered - the permit was created and started accounting resource
  consumption.
* waiting - the permit was queued to wait for admission.
* admitted - the permit was successfully admitted.

The state will be used for debugging purposes, both during coredump
debugging as well as for dumping diagnostics data about permits.
2020-10-13 12:32:13 +03:00
Botond Dénes
ff623e70b3 reader_concurrency_semaphore: name permits
Require a schema and an operation name to be given to each permit when
created. The schema is of the table the read is executed against, and
the operation name, which is some name identifying the operation the
permit is part of. Ideally this should be different for each site the
permit is created at, to be able to discern not only different kind of
reads, but different code paths the read took.

As not all read can be associated with one schema, the schema is allowed
to be null.

The name will be used for debugging purposes, both for coredump
debugging and runtime logging of permit-related diagnostics.
2020-10-13 12:32:13 +03:00
Botond Dénes
73a6b97c75 reader_permit: add consumed_resources() accessor
That allows querying he amount of resources accounted though this
permit, and by extension by this logical read.
2020-10-06 08:18:42 +03:00
Botond Dénes
63578bf0a7 reader_permit: reader_resources: add operator== 2020-09-28 11:27:49 +03:00
Botond Dénes
52662f17ea reader_permit: resource_units: add permit() and resources() accessors 2020-09-28 11:27:29 +03:00
Botond Dénes
c1215592da reader_permit: introduce tracking_allocator
This can be used with standard containers and other containers that use
the std::allocator interface to track the allocations made by them via a
reader_permit.
2020-09-28 08:46:22 +03:00
Botond Dénes
f10abf6e35 reader_permit: reader_resources: add with_memory() factory function
To make creating reader resource with just memory more convenient and
more readable at the same time.
2020-09-28 08:46:22 +03:00
Botond Dénes
4c8ab10563 reader_permit: only forward resource consumption to semaphore after admission
In the next patches we plan to start tracking the memory consumption of
the actual allocations made by the circular_buffer<mutation_fragment>,
as well as the memory consumed by the mutation fragments.
This means that readers will start consuming memory off the permit right
after being constructed. Ironically this can prevent the reader from
being admitted, due to its own pre-admission memory consumption. To
prevent this hold on forwarding the memory consumption to the semaphore,
until the permit is actually admitted.
2020-09-28 08:46:22 +03:00
Botond Dénes
cd953a36fd reader_permit: move internals to impl
In the next patches the reader permit will gain members that are shared
across all instances of the same permit. To facilitate this move all
internals into an impl class, of which the permit stores a shared
pointer. We use a shared_ptr to avoid defining `impl` in the header.

This is how the reader permit started in the beginning. We've done a
full circle. :)
2020-09-28 08:46:22 +03:00
Botond Dénes
12372731cb reader_permit: add consume()/signal()
And do all consuming and signalling through these methods. These
operations will soon be more involved than the simple forwarding they do
today, so we want to centralize them to a single method pair.
2020-09-28 08:46:22 +03:00
Botond Dénes
375815e650 reader_permit::resource_units: store permit instead of semaphore
In the next patches we want to introduce per-permit resource tracking --
that is, have each permit track the amount of resource consumed through
it. For this, we need all consumption to happen through a permit, and
not directly with the semaphore.
2020-09-28 08:46:22 +03:00
Botond Dénes
04d83f6678 reader_permit: move resource_units declaration outside the reader_permit class
In the next patch we want to store a `reader_permit` instance inside
`resource_units` so a full definition of the former must be available.
2020-09-28 08:46:22 +03:00
Botond Dénes
3bb25eefb6 reader_permit: remove unused release() method
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200924090040.240906-1-bdenes@scylladb.com>
2020-09-24 12:28:00 +03:00
Botond Dénes
e5db1ce785 reader_permit: reader_resources: add operator- and operator+
In addition to the already available operator+= and operator-=.
2020-07-20 11:23:39 +03:00
Botond Dénes
3cd2598ab3 reader_permit: forbid empty permits
Remove `no_reader_permit()` and all ways to create empty (invalid)
permits. All permits are guaranteed to be valid now and are only
obtainable from a semaphore.

`reader_permit::semaphore()` now returns a reference, as it is
guaranteed to always have a valid semaphore reference.
2020-05-28 11:34:35 +03:00
Botond Dénes
e40b1fc3c8 reader_permit: fix reader_resources::operator bool 2020-05-28 11:34:35 +03:00
Botond Dénes
f417b9a3ea reader_concurrency_semaphore: remove wait_admission and consume_resources()
Permits are now created with `make_permit()` and code is using the
permit to do all resource consumption tracking and admission waiting, so
we can remove these from the semaphore. This allows us to remove some
now unused code from the permit as well, namely the `base_cost` which
was used to track the resource amount the permit was created with. Now
this amount is also tracked with a `resource_units` RAII object, returned
from `reader_permit::wait_admission()`, so it can be removed. Curiously,
this reduces the reader permit to be glorified semaphore pointer. Still,
the permit abstraction is worth keeping, because it allows us to make
changes to how the resource tracking part of the semaphore works,
without having to change the huge amount of code sites passing around
the permit.
2020-05-28 11:34:35 +03:00
Botond Dénes
bf4ade8917 reader_permit: resource_units: introduce add()
Allows merging two resource_units into one.
2020-05-28 11:34:35 +03:00
Botond Dénes
4d7250d12b reader_permit: add wait_admission
We want to make `read_permit` the single interface through which reads
interact with the concurrency limiting mechanism. So far it was only
usable to track memory consumption. Add the missing `wait_admission()`
and `consume_resources()` to the permit API. As opposed to
`reader_concurrency_semaphore::` equivalents which returned a
permit, the `reader_permit::` variants jut return
`reader_permit::resource_units` which is an RAII holder for the acquired
units. This also allows for the permit to be created earlier, before the
reader is admitted, allowing for tracking pre-admission memory usage as
well. In fact this is what we are going to do in the next patches.

This patch also introduces a `broken()` method on the reader concurrency
semaphore which resolves waiters with an exception. This method is also
called internally from the semaphore's destructor. This is needed
because the semaphore can now have external waiters, who has to be
resolved before the semaphore itself is destroyed.
2020-05-28 11:34:35 +03:00
Botond Dénes
bd793d6e19 reader_permit: resource_units: work in terms of reader_resources
Refactor resource_units semantically as well to work in terms of
reader_resources, instead of just memory.
2020-05-28 11:34:35 +03:00
Botond Dénes
0f9c24631a reader_permit: s/memory_units/resource_units/
We want to refactor reader_permit::memory_units to work in terms of
reader_resources, as we are planning to use it for guarding count
resources as well. This patch makes the first step: renames it from
memory_units to resources_units. Since this is a very noisy change, we
do it in a separate patch, the semantic change is in the next patch.
2020-05-28 11:34:35 +03:00
Botond Dénes
434d32befe reader_permit: tidy up reader_permit::memory_units
This patch is a bag of fixes/cleanups that were omitted from the reader
memory tracking series due to contributor error. It contains the
following changes:
* Get rid of unused `increase()` and `decrease()` methods.
* Make all constructors and assignment operators `noexcept`.
* Make move assignment operator safe w.r.t. self assignment.
* `reset()`: consume the new amount before releasing the old amount,
  to prevent a transient window where new readers might be admitted.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200206143007.633069-1-bdenes@scylladb.com>
2020-02-06 16:35:07 +02:00
Botond Dénes
dea24ca859 reader_permit: expose make_tracked_temporary_buffer()
Previously `tracking_file_impl::make_tracked_buf()`. In the next patches
we plan on using this outside `tracking_file_impl`, so make it public
and templatize on the char type.
2020-01-28 08:13:16 +02:00
Botond Dénes
16cea36a94 reader_permit: introduce make_tracked_file()
Free function equivalent of `reader_resource_tracker::track_file()`,
using a `reader_permit` directly.
2020-01-28 08:13:16 +02:00
Botond Dénes
1859a03629 reader_permit: introduce memory_units
Similar to `seastar::semaphore_units`, this allows consuming and
releasing memory via an RAII object. In addition to that, it also allows
tracking changing values. This feature was designed to be used for
tracking the ever changing memory consumption of the buffers of
`flat_mutation_reader`:s.
This is now the only supported way of consuming memory from a permit.
2020-01-28 08:13:16 +02:00
Botond Dénes
c0f96db2d9 reader_concurrency_semaphore: mv reader_resources and reader_permit to reader_permit.hh
In the next patches we will replace `reader_resource_tracker` and have
code use the `reader_permit` directly. In subsequent patches, the
`reader_permit` will get even more usages as we attempt to make the
tracking of reader resource more accurate by tracking more parts of it.
So the grand plan is that the current `reader_concurrency_semaphore.hh`
is split into two headers:
* `reader_concurrency_semaphore.hh` - containing the semaphore proper.
* `reader_permit.hh` - a very lightweight header, to be used by
  components which only want to track various parts of the resource
  consumption of reads.
2020-01-28 08:13:16 +02:00