Commit Graph

97 Commits

Author SHA1 Message Date
Mikołaj Sielużycki
1d84a254c0 flat_mutation_reader: Split readers by file and remove unnecessary includes.
The flat_mutation_reader files were conflated and contained multiple
readers, which were not strictly necessary. Splitting optimizes both
iterative compilation times, as touching rarely used readers doesn't
recompile large chunks of codebase. Total compilation times are also
improved, as the size of flat_mutation_reader.hh and
flat_mutation_reader_v2.hh have been reduced and those files are
included by many file in the codebase.

With changes

real	29m14.051s
user	168m39.071s
sys	5m13.443s

Without changes

real	30m36.203s
user	175m43.354s
sys	5m26.376s

Closes #10194
2022-03-14 13:20:25 +02:00
Avi Kivity
8f2bc838af Update seastar submodule
* seastar ea6a6820ed...2849a8a8ba (1):
  > Merge "make semaphore and shared_promise abortable" from Gleb

include fixup from Gleb added.
2022-02-25 07:26:11 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Nadav Har'El
6012f6f2b6 build performance: do not include <seastar/net/ip.hh>
In a previous patch, we noticed that the header file <gm/inet_address.hh>,
which is included, directly or indirectly, by most source files,
includes <seastar/net/ip.hh> which is very slow to compile, and
replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>.

However, we also included <seastar/net/ip.hh> in types.hh - and that
too is included by almost every file, so the actual saving from the
above patch was minimal. So in this patch we replace this include too.
After this patch Scylla does not include <seastar/net/ip.hh> at all.

According to ClangBuildAnalyzer, this reduces the average time to include
types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds,
and reduces total build time (dev mode) by about 3%.

Some of the source files were now missing some include directives, that
were previously included in ip.hh - so we need to add those explicitly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-01-05 17:29:21 +02:00
Michael Livshin
a1b8ba23d2 reader_concurrency_semaphore: convert to flat_mutation_reader_v2
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-12-21 11:26:17 +02:00
Eliran Sinvani
0b2861d014 Prepare for inheriting from reader_concurrency_semaphore
Some future and enterprise features requires us to inherit from
reader_concurrency_semaphore, this might require additional
"wrap up" operations to be done on stop which serves as a barrier
for the semaphore. Here we simply make stop virtual so it is
inherited and can be augmented.
This change have no significant impact on  performance since stop
can get called once in a lifetime of a semaphore.
The approach is to add two extenction points to the
reader_concurrency_semaphore class, one just before the stop code is
executed and one just after.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>

Closes #9373
2021-09-26 12:57:48 +03:00
Kamil Braun
fbb83dd5ca reader_concurrency_semaphore: remove default parameter values from constructors
It's easy to forget about supplying the correct value for a parameter
when it has a default value specified. It's safer if 'production code'
is forced to always supply these parameters manually.

The default values were mostly useful in tests, where some parameters
didn't matter that much and where the majority of uses of the class are.
Without default values adding a new parameter is a pain, forcing one to
modify every usage in the tests - and there are a bunch of them. To
solve this, we introduce a new constructor which requires passing the
`for_tests` tag, marking that the constructor is only supposed to be
used in tests (and the constructor has an appropriate comment). This
constructor uses default values, but the other constructors - used in
'production code' - do not.
2021-09-14 12:20:28 +02:00
Benny Halevy
4e3dcfd7d6 reader_concurrency_semaphore: use permit timeout for admission
Now that the timeout is stored in the reader
permit use it for admission rather than a timeout
parameter.

Note that evictable_reader::next_partition
currently passes db::no_timeout to
resume_or_create_reader, which propagated to
maybe_wait_readmission, but it seems to be
an oversight of the f_m_r api that doesn't
pass a timeout to next_partition().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:51 +03:00
Benny Halevy
fe479aca1d reader_permit: add timeout member
To replace the timeout parameter passed
to flat_mutation_reader methods.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 14:29:44 +03:00
Botond Dénes
8fc55fa5bf reader_concurrency_semaphore: get rid of struct permit_list
struct permit_list exists so the intrusive list declaration which needs
the definition of reader_permit can be hidden in the .cc. But it turns
out that if the hook type is fully spelled out, the intrusive list
declaration doesn't need T to be defined. Exploit this to get rid of
this extra indirection.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210720073121.63027-2-bdenes@scylladb.com>
2021-07-20 10:35:12 +03:00
Botond Dénes
11b39cbc23 reader_concurrency_semaphore: merge permit_stats into stats
If there was any reason to have them separate when permit_stats was
conceived, it is gone now, so merge the two.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210720073121.63027-1-bdenes@scylladb.com>
2021-07-20 10:35:12 +03:00
Botond Dénes
27fbca84f6 reader_concurrency_semaphore: remove prethrow_action
The semaphore accepts a functor as in its constructor which is run just
before throwing on wait queue overload. This is used exclusively to bump
a counter in the database::stats, which counts queue overloads. However,
there is now an identical counter in
reader_concurrency_semaphore::stats, so the database can just use that
directly and we can retire the now unused prethrow action.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210716111105.237492-1-bdenes@scylladb.com>
2021-07-19 15:47:37 +03:00
Botond Dénes
1666ad078a reader_concurrency_semaphore: add reads_{admitted,enqueued} stats
Primarily for tests, but we could also export these, should we want to.
2021-07-14 17:19:02 +03:00
Botond Dénes
c86573813f reader_concurrency_semaphore: remove now unused obtain_permit_nowait() 2021-07-14 17:19:02 +03:00
Botond Dénes
1b7eea0f52 reader_concurrency_semaphore: admission: flip the switch
This patch flips two "switches":
1) It switches admission to be up-front.
2) It changes the admission algorithm.

(1) by now all permits are obtained up-front, so this patch just yanks
out the restricted reader from all reader stacks and simultaneously
switches all `obtain_permit_nowait()` calls to `obtain_permit()`. By
doing this admission is now waited on when creating the permit.

(2) we switch to an admission algorithm that adds a new aspect to the
existing resource availability: the number of used/blocked reads. Namely
it only admits new reads if in addition to the necessary amount of
resources being available, all currently used readers are blocked. In
other words we only admit new reads if all currently admitted reads
requires something other than CPU to progress. They are either waiting
on I/O, a remote shard, or attention from their consumers (not used
currently).

We flip these two switches at the same time because up-front admission
means cache reads now need to obtain a permit too. For cache reads the
optimal concurrency is 1. Anything above that just increases latency
(without increasing throughput). So we want to make sure that if a cache
reader hits it doesn't get any competition for CPU and it can run to
completion. We admit new reads only if the read misses and has to go to
disk.

Another change made to accommodate this switch is the replacement of the
replica side read execution stages which the reader concurrency
semaphore as an execution stage. This replacement is needed because with
the introduction of up-front admission, reads are not independent of
each other any-more. One read executed can influence whether later reads
executed will be admitted or not, and execution stages require
independent operations to work well. By moving the execution stage into
the semaphore, we have an execution stage which is in control of both
admission and running the operations in batches, avoiding the bad
interaction between the two.
2021-07-14 17:19:02 +03:00
Botond Dénes
79fefc490c reader_concurrency_semaphore: add set_max_queue_size() 2021-07-14 17:19:02 +03:00
Botond Dénes
00511100a4 reader_concurrency_semaphore: remove now unused make_permit() 2021-07-14 17:19:02 +03:00
Botond Dénes
af8f39a775 reader_concurrency_semaphore: make it an execution stage
The execution stage functionality is exposed via two new member
functions, `with_permit()` and `with_ready_permit()`. Both accept a
function to be run. The former obtains a permit then runs the passed in
function through the execution stage. The latter allows an already
obtained permit to be passed in.
2021-07-14 16:48:43 +03:00
Botond Dénes
5d3ddba2c7 reader_concurrency_semaphore: make_permit(): add up-front admission variants
Three new methods are added for creating permits:
1) obtain_permit()
2) obtain_permit_nowait()
3) make_tracking_only_permit()

(1) is meant to replace `make_permit()` + `wait_admission()`, by
integrating the waiting for admission into the process of creating the
permit. This is the method meant to be used to create permits from here
on, ensuring that each read passes admission before even being started.
(2) is a bridge between the old and new world. Up-front admission cannot
coexist with the restricted reader in the same read, so those reads that
have a restricted reader in their stack can use this method to create a
non-admitted permit to be admitted by the restricted reader later. Once
we have migrated all reads to (1) or (2), we can get rid of the
restricted reader and just replace (1) with (2) in the codebase. (2)
returns a future to make this a simple rename, the churn of dealing with
a future<reader_permit> return type already having been dealt with by
then.
(3) is for reads that bypass admission, yet their resource usage does
participate in the admission of other reads. This is the equivalent of
reads that don't pass admission at all.

The following patches will gradually transition the codebase away from
the old permit API, and once the transition is complete, we can switch
over to do the admission up-front at once.
2021-07-14 16:48:43 +03:00
Botond Dénes
844a99a91a reader_concurrency_semaphore: prepare for up-front admission
We want to make permits be admitted up-front, before even being created.
As part of this change, we will get rid of the `wait_admission()`
method on the permit, instead, the permit will be created as a result of
waiting for admission (just like back some time ago).
To allow evicted readers to wait for re-admission, a new method
`maybe_wait_readmission()` is created, which waits for readmission if
the permit is in evicted state.

Also refactor the internals of the semaphore to support and favor
up-front admission code. As up-front admission is the future we want the
permit code to be organized in such a way that it is natural to use with
it. This means that the "old-style" admission code might suffer but we
tolerate this as it is on its way out. To this end the following changes
were done:
* Add a _base_resources field to reader_permit which tracks the base
  cost of said permit. This is passed in the constructor and is used in
  the first and subsequent admissions.
* The base cost is now managed internally by the permit, instead of
  relying on an external `resource_units` instance, though the old way
  is still supported temporarily.
* Change the admission pipeline to favor the new permit-internally
  managed base cost variant.
* Compatibility with old-style admission: permits are created with 0
  base resources, base resources are set with the compatibility method
  `set_base_resources()` right before admission, then externalized again
  after admission with `base_resource_as_resource_units()`. These
  methods will be gone when the old style admission is retired (together
  with `wait_admission()`).
2021-07-14 16:48:43 +03:00
Botond Dénes
aa480fa3f9 reader_permit: allow marking blocked
Distinguish between permits that are blocked and those that are not.
Conceptually a blocked permit is one that needs to wait on either I/O or
a remote shard to proceed.
This information will be used by admission, which will only admit new
reads when all currently used ones are blocked. More on that in the
commit introducing this new admission type.
This patch only adds the infrastructure, block sites are not marked yet.
2021-07-14 16:48:43 +03:00
Botond Dénes
a5dc48b4b1 reader_permit: allow marking it as used
Distinguish between permits that are used and those that are not. These
are two subtypes of the current 'active' state (and replace it).
Conceptually a permit is used when any readers associated with it have a
pending call to any of their async methods, i.e. the consumer is
actively consuming from them.
This information will be used for admission, together with a new blocked
state introduced by a future patch.
This patch only adds the infrastructure, use sites are not marked yet.
2021-07-14 16:48:43 +03:00
Botond Dénes
5416fc6d1b reader_concurrency_semaphore: add current_permits to permit_stats 2021-07-14 16:48:43 +03:00
Botond Dénes
c97fc16105 reader_concurrency_semaphore: extract waiter admission into separate function
Because soon we will have more than one place to trigger waiter
admission from.
2021-07-14 16:48:43 +03:00
Botond Dénes
03959a332b reader_concurrency_semaphore: add permit-stats
Which stores permit related stats. For now only total number of permits
is maintained which is useful to determine whether the semaphore was
used already or not.
2021-07-12 15:53:00 +03:00
Botond Dénes
c4e71fb9b8 reader_concurrency_semaphore: remove default name parameter
Naming the concurrency semaphore is currently optional, unnamed
semaphores defaulting to "Unnamed semaphore". Although the most
important semaphores are named, many still aren't, which makes for a
poor debugging experience when one of these times out.
To prevent this, remove the name parameter defaults from those
constructors that have it and require a unique name to be passed in.
Also update all sites creating a semaphore and make sure they use a
unique name.
2021-07-08 12:31:36 +03:00
Botond Dénes
09309f5dbf reader_concurrency_semaphore: on_permit_created(): remove noexcept
The permit creation path enters the semaphore's permit gate in
on_permit_created(). Entering this gate can throw so this method is not
noexcept. Remove the noexcept specifier accordingly.
Also enter the gate before adding the permit to the permit list, to save
some work when this fails.

Fixes: #8933

Tests: unit(dev)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210628074941.32878-1-bdenes@scylladb.com>
2021-06-28 11:04:38 +03:00
Botond Dénes
578a092e4a reader_concurrency_semaphore: wait for all permits to be destroyed in stop()
To prevent use-after-free resulting from any permit out-living the
semaphore.
2021-06-16 11:29:36 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Botond Dénes
0a908a47d6 reader_concurrency_semaphore: dump_reader_diagnostics(): cap number of printed lines
This report is logged, so we don't want huge printouts, cap the table at
20 lines, and print only a summary for the rest.
For manual dumps, allow the limit to be set to a custom value, including
no limit at all.
2021-05-10 10:15:47 +03:00
Botond Dénes
d246e2df0a reader_concurrency_semaphore: add dump_diagnostics()
Allow semaphore related tests to include a diagnostics printout in error
messages to help determine why the test failed.
2021-04-26 15:56:56 +03:00
Benny Halevy
c8e30db5db reader_concurrency_semaphore: close evicted reader
Close readers in the background:
- evicted based on ttl, or
- those that weren't admitted by register_inactive_read
- those that are destoryed in clear_inactive_reads.

Use a gate for waiting on these background closes in stop().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
be1cafc1a5 reader_concurrency_semaphore: do_wait_admission: close evicted readers
enqueue_waiter before evicting readers and start a loop in the
background to dequeue and close inactive_readers until
either the _wait_list is empty or there are no more inactive_readers
to evict.

We admit the read synchronously only if the wait_list is empty
and the semaphore has_available_units to statisfy admission.

We need to enqueue the reader before starting to evict readers
to make sure any evicted resources are assigned to the
waiter at the head of the queue and not "stolen"
in case we yield and some other caller grabs them.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
43bf0f9356 reader_concurrency_semaphore: add stop method
In addition to clear_inactive_reads, that's currently called when
the database object is destroyed, introduce a stop() method that will:
1. wait on all background closes of inactive_reads.
2. close all present inactive_reads and waits on their close.
3. signal waiters on the wait_list via broken() with a proper
   exception indicating that the semaphore was closed.

In addition, assert in the semaphore's destructor
that it has no remaining inactive reads.

Stop must be called from whoever owns the r_c_s.
Mainly, from database::stop.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
2f4134e1cc reader_concurrency_semaphore: broken: make broken_semaphore the default exception
Rather than explcitily generating it by all callers
and then not using the argument at all.

Prepare for providing a different exception_ptr
from a stop() path to be introduced in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Botond Dénes
4762b84b44 reader_concurrency_semaphore: remove now unused may_proceed() 2021-03-30 17:54:34 +03:00
Botond Dénes
d1dd55d98f reader_concurrency_semaphore: extract enqueueing logic into enqueue_waiter()
Besides making the code more readable, this also enables restructuring
`do_wait_admission()`, without moving too much code around.
As a bonus, queue length is now only checked when the permit actually
has to be enqueued.
2021-03-30 17:49:30 +03:00
Botond Dénes
1a337d0ec1 reader_concurrency_semaphore: fix clear_inactive_reads()
Broken by the move to an intrusive container (9cbbf40), which caused
said method to only clear the container but not destroy the inactive
reads contained therein. This patch restores the previous behaviour and
also adds a call the destructor (to ensure inactive reads are cleaned up
under any circumstances), as well as a unit test.
2021-03-18 14:57:57 +02:00
Botond Dénes
581edc4e4e reader_concurrency_semaphore: make inactive_read_handle a weak reference
Having the handle keep an owning reference to the inactive read lead to
awkward situations, where the inactive read is destroyed during eviction
in certain situations only (querier cache) and not in other cases.
Although the users didn't notice anything from this, it lead to very
brittle code inside the reader concurrency semaphore. Among others, the
inactive read destructor has to be open coded in evict() which already
lead to mistakes.
This patch goes back to the weak pointer paradigm used a while ago,
which is a much more natural fit for this. Inactive reads are still kept
in an intrusive list in the semaphore but the handle now keeps a weak
pointer to them. When destroyed the handler will destroy the inactive
read if it is still alive. When evicting the inactive read, it will
set the pointer in the handle to null.
2021-03-18 14:57:57 +02:00
Botond Dénes
cbc83b8b1b reader_concurrency_semaphore: make evict() noexcept
In the next patch it will be called from a destructor.
2021-03-18 14:57:57 +02:00
Botond Dénes
2d348e0211 reader_concurrency_semaphore: update out-of-date comments 2021-03-18 14:57:57 +02:00
Benny Halevy
6e92f07630 reader_concurrency_semaphore: register_inactive_read: make noexcept
Catch error to allocate an inactive_read and just log them.
Return an empty inactive_read_handle in
this case, as if the inactive reader was evicted due to
lack of resources.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 22:31:01 +02:00
Benny Halevy
46c2229b78 reader_concurrency_semaphore: separate set_notify_handler from register_inactive_reader
Register the inactive reader first with no
evict_notify_handler and ttl.

Those can be set later, only if registration succeeded.
Otherwise, as in the querier example, there is no need
to to place the querier in the index and erase it
on eviction.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 22:31:01 +02:00
Benny Halevy
d752ea7e91 reader_concurrency_semaphore: inactive_read: make ttl_timer non-optional
By default it will be unarmed and with no callback
so there's no need to wrap it in a std::optional.

This saves an allocation and another potential
error case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 22:31:01 +02:00
Benny Halevy
a12c9638b6 reader_concurrency_semaphore: inactive_read: use intrusive list
To simplify insertion and eviction into the inactive_reads container,
use an intrusive list thta requires a single allocation for the
inactive_read object itself.

This allows passing a reference to the inactive_read
to evict it.

Note that the reader will be unlinked automatically from
the inactive_readers list if the inactive_read_handle is destroyed.
This is okay since there is no need to track the inactive_read
if the caller loses the i_r_h (e.g. if an error is thrown).

It is also safe to evict the inactive_reader while the
i_r_h is alive.  In this case the i_r will be unlinked
after the flat_mutation_reader it holds is moved out of it.

bi::auto_unlink will detect that it's alredy unlinked
when destroyed and do nothing.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 22:31:01 +02:00
Benny Halevy
81cd3d0c51 reader_concurrency_semaphore: try_evict_one_inactive_read: pass evict_reason
So try_evict_one_inactive_read could be used also in do_wait_admission
in the next patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 20:32:40 +02:00
Benny Halevy
769dff6c54 reader_concurrency_semaphore: inactive_read_handle: swap definition order
For using boost::intrusive::list for _inactive_reads.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 20:32:40 +02:00
Benny Halevy
4e8f29ef14 reader_concurrency_semaphore: inactive_read: keep a flat_mutation_reader
There's no need to hold a unique_ptr<flat_mutation_reader> as
flat_mutation_reader itself holds a unique_ptr<flat_mutation_reader::impl>
and functions as a unique ptr via flat_mutation_reader_opt.

With that, unregister_inactive_read was modified to return a
flat_mutation_reader_opt rather than a std::unique_ptr<flat_mutation_reader>,
keeping exactly the same semantics.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 20:32:40 +02:00
Benny Halevy
338c190842 reader_concurrency_semaphore: inactive_read_handle: mark methods noexcept
All are trivially noexcept.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210204113327.1027792-1-bhalevy@scylladb.com>
2021-02-04 13:57:42 +02:00
Avi Kivity
913d970c64 Merge "Unify inactive readers" from Botond
"
Currently inactive readers are stored in two different places:
* reader concurrency semaphore
* querier cache
With the latter registering its inactive readers with the former. This
is an unnecessarily complex (and possibly surprising) setup that we want
to move away from. This series solves this by moving the responsibility
if storing of inactive reads solely to the reader concurrency semaphore,
including all supported eviction policies. The querier cache is now only
responsible for indexing queriers and maintaining relevant stats.
This makes the ownership of the inactive readers much more clear,
hopefully making Benny's work on introducing close() and abort() a
little bit easier.

Tests: unit(release, debug:v1)
"

* 'unify-inactive-readers/v2' of https://github.com/denesb/scylla:
  reader_concurrency_semaphore: store inactive readers directly
  querier_cache: store readers in the reader concurrency semaphore directly
  querier_cache: retire memory based cache eviction
  querier_cache: delegate expiry to the reader_concurrency_semaphore
  reader_concurrency_semaphore: introduce ttl for inactive reads
  querier_cache: use new eviction notify mechanism to maintain stats
  reader_concurrency_semaphore: add eviction notification facility
  reader_concurrency_semaphore: extract evict code into method evict()
2021-02-03 10:59:04 +02:00