Commit Graph

226 Commits

Author SHA1 Message Date
Ernest Zaslavsky
d624413ddd treewide: Move query related files to a new query directory
As requested in #22120, moved the files and fixed other includes and build system.

Moved files:
- query.cc
- query-request.hh
- query-result.hh
- query-result-reader.hh
- query-result-set.cc
- query-result-set.hh
- query-result-writer.hh
- query_id.hh
- query_result_merger.hh

Fixes: #22120

This is a cleanup, no need to backport

Closes scylladb/scylladb#25105
2025-09-16 23:40:47 +03:00
Benny Halevy
679e73053f reader_concurrency_semaphore: use named gate
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-04-12 11:28:48 +03:00
Botond Dénes
7ba29ec46c reader_concurrency_semaphore: register_inactive_read(): handle aborted permit
It is possible that the permit handed in to register_inactive_read() is
already aborted (currently only possible if permit timed out).
If the permit also happens to have wait for memory, the current code
will attempt to call promise<>::set_exception() on the permit's promise
to abort its waiters. But if the permit was already aborted via timeout,
this promise will already have an exception and this will trigger an
assert. Add a separate case for checking if the permit is aborted
already. If so, treat it as immediate eviction: close the reader and
clean up.

Fixes: scylladb/scylladb#22919
2025-02-28 01:32:46 -05:00
Botond Dénes
b87f5a0b58 reader_concurrency_semaphore: remove reduntant inactive_read::ttl_timer
It is redundant with reader_permit::impl::_ttl_timer. Use the latter for
TTL of inactive reads too. The usage of the two exclude each other, at
any point in time, either one or the other is used, so no reason to keep
both.

Closes scylladb/scylladb#22863
2025-02-17 11:41:16 +03:00
Botond Dénes
15126e4c9f reader_concurrency_semaphore: use std::ranges::for_each()
Instead of boost::for_each().

Closes scylladb/scylladb#22862
2025-02-17 11:35:32 +03:00
Botond Dénes
9174f27cc8 reader_concurrency_semaphore: set_notify_handler(): disable timeout
set_notify_handler() is called after a querier was inserted into the
querier cache. It has two purposes: set a callback for eviction and set
a TTL for the cache entry. This latter was not disabling the
pre-existing timeout of the permit (if any) and this would lead to
premature eviction of the cache entry if the timeout was shorter than
TTL (which his typical).
Disable the timeout before setting the TTL to prevent premature
eviction.

Fixes: #scylladb/scylladb#22629
2025-02-07 02:31:01 -05:00
Botond Dénes
a3ae0c7cee reader_permit: mark check_abort() as const
All it does is read one field, making it const makes using it easier.
2025-02-07 01:32:35 -05:00
Botond Dénes
f2d5819645 reader_concurrency_semaphore: with_permit(): proper clean-up after queue overload
with_permit() creates a permit, with a self-reference, to avoid
attaching a continuation to the permit's run function. This
self-reference is used to keep the permit alive, until the execution
loop processes it. This self reference has to be carefully cleared on
error-paths, otherwise the permit will become a zombie, effectively
leaking memory.
Instead of trying to handle all loose ends, get rid of this
self-reference altogether: ask caller to provide a place to save the
permit, where it will survive until the end of the call. This makes the
call-site a little bit less nice, but it gets rid of a whole class of
possible bugs.

Fixes: #22588

Closes scylladb/scylladb#22624
2025-02-04 21:27:16 +02:00
Botond Dénes
e1b1a2068a reader_concurrency_semaphore: foreach_permit(): include _inactive_reads
So inactive reads show up in semaphore diagnostics dumps (currently the
only non-test user of this method).

Fixes: #22574

Closes scylladb/scylladb#22575
2025-01-30 22:46:57 +02:00
Łukasz Paszkowski
aad46bd6f3 reader_concurrency_semaphore: do_wait_admission(): remove dumping diagnostics
The commit b39ca29b3c introduced detection of admission-waiter
anomaly and dumps permit diagnostics as soon as the semaphore did
not admit readers even though it could.

Later on, the commit bf3d0b3543 introduces the optimization where
the admission check is moved to the fiber processing the _read_list.
Since the semaphore no longer admits readers as soon as it can,
dumping diagnostic errors is not necessary as the situation is not
abnormal.

Closes scylladb/scylladb#22344
2025-01-16 19:23:43 -05:00
Tomasz Grabiec
bf3d0b3543 reader_concurrency_semaphore: Optimize resource_units destruction by postponing wait list processing
Observed 3% throughput improvement in sstable-heavy workload bounded by CPU.

SStable parsing involves lots of buffer operations which obtain and
destroy resource_units. Before the patch, reosurce_unit destruction
invoked maybe_admit_waiters(), which performs some computations on
waiting permits. We don't really need to admit on each change of
resources, since the CPU is used by other things anyway. We can batch
the computation. There is already a fiber which does this for
processing the _ready_list. We can reuse it for processing _wait_list
as well.

The changes violate an assumption made by tests that releasing
resources immediately triggers an admission check. Therefore, some of
the BOOST_REQUIRE_EQUAL needs to be replaced with REQUIRE_EVENTUALLY_EQUAL
as the admision check is now done in the fiber processing the _ready_list.

`perf-simple-query` --tablets --smp 1 -m 1G results obtained for
fixed 400MHz frequency:

Before:
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...

112590.60 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41353 insns/op,   17992 cycles/op,        0 errors)
122620.68 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41310 insns/op,   17713 cycles/op,        0 errors)
118169.48 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41353 insns/op,   17857 cycles/op,        0 errors)
120634.65 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41328 insns/op,   17733 cycles/op,        0 errors)
117317.18 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41347 insns/op,   17822 cycles/op,        0 errors)

         throughput: mean=118266.52 standard-deviation=3797.81 median=118169.48 median-absolute-deviation=2368.13 maximum=122620.68 minimum=112590.60
instructions_per_op: mean=41337.86 standard-deviation=18.73 median=41346.89 median-absolute-deviation=14.64 maximum=41352.53 minimum=41309.83
  cpu_cycles_per_op: mean=17823.50 standard-deviation=111.75 median=17821.97 median-absolute-deviation=90.45 maximum=17992.04 minimum=17713.00
```

After
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, frontend=cql, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...

123689.63 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40997 insns/op,   17384 cycles/op,        0 errors)
129643.24 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40997 insns/op,   17325 cycles/op,        0 errors)
128907.27 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   41009 insns/op,   17325 cycles/op,        0 errors)
130342.56 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40993 insns/op,   17286 cycles/op,        0 errors)
130294.09 tps ( 63.1 allocs/op,   0.0 logallocs/op,  14.1 tasks/op,   40972 insns/op,   17336 cycles/op,        0 errors)

         throughput: mean=128575.36 standard-deviation=2792.75 median=129643.24 median-absolute-deviation=1718.73 maximum=130342.56 minimum=123689.63
instructions_per_op: mean=40993.51 standard-deviation=13.23 median=40996.73 median-absolute-deviation=3.30 maximum=41008.86 minimum=40972.48
  cpu_cycles_per_op: mean=17331.16 standard-deviation=35.02 median=17324.84 median-absolute-deviation=6.49 maximum=17383.97 minimum=17286.33
```

Closes scylladb/scylladb#21918

[avi: patch was co-authored by Łukasz Paszkowski <lukasz.paszkowski@scylladb.com>]
2024-12-30 23:37:46 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Kefu Chai
00810e6a01 treewide: include seastar/core/format.hh instead of seastar/core/print.hh
The later includes the former and in addition to `seastar::format()`,
`print.hh` also provides helpers like `seastar::fprint()` and
`seastar::print()`, which are deprecated and not used by scylladb.

Previously, we include `seastar/core/print.hh` for using
`seastar::format()`. and in seastar 5b04939e, we extracted
`seastar::format()` into `seastar/core/format.hh`. this allows us
to include a much smaller header.

In this change, we just include `seastar/core/format.hh` in place of
`seastar/core/print.hh`.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#21574
2024-11-14 17:45:07 +02:00
Avi Kivity
075b13597d serializer: drop dependency on boost ranges
The call to boost::range::for_each is easily replaced with ranged for.

Closes scylladb/scylladb#21422
2024-11-04 17:48:17 +02:00
Michał Chojnowski
c2ba300f1c reader_concurrency_semaphore: in stats, fix swapped count_resources and memory_resources
can_admit_read() returns reason::memory_resources when the permit is queued due
to lack of count resources, and it returns reason::count_resources when the
permit is queued due to lack of memory resources. It's supposed to be the other
way around.

This bug is causing the two counts to be swapped in the stat dumps printed to
the logs when semaphores time out.

Closes scylladb/scylladb#20714
2024-10-09 14:12:01 +03:00
Botond Dénes
40b6616d3d reader_concurrency_semaphore: add bottleneck self-diagnosis to diagnosis dump
There are a few typical cases of bottlenecks, which can be easily
identified when dumping the semaphore diagnostics. Identify and print
these to fast-track investigations.
2024-09-12 08:31:25 -04:00
Botond Dénes
7d2b931619 reader_concurrency_semaphore: include trigger permit in diagnostic dump
In the previous patch, we provided an opportunity for callers to provide
a trigger permit, when calling `maybe_dump_reader_permit_diagnostics()`.
If the caller provided the trigger permit, include its details in the
dump, allowing the identification of the table and code-path of the
permit which triggered the dump.
2024-09-12 08:30:50 -04:00
Botond Dénes
c044904f07 reader_concurrency_semaphore: propagate permit to do_dump_reader_permit_diagnostics()
Will be used in the next patch.
2024-09-12 00:51:56 -04:00
Botond Dénes
67565a5eee reader_concurrency_semaphore: use consistent exception type for timeout
When a read times out, we use different exception types for the permit's
future (if the permit is waiting), or the permit's abort exception _ex
(which is used to abort ongoing reads). This patch changes both to use
named_semaphore_timed_out, which is the more verbose of the two.
2024-09-12 00:51:03 -04:00
Botond Dénes
036d27dc1b reader_concurrency_semaphore: dump diagnostics when non-waiting reader times out
Currently the semaphore only dumps diagnostics when a waiting reader
times out. The diagnostics are also useful when a non-waiting reader
(which is in the process of reading) times out, so also dump diagnostics
in this case.
Change the code to use a switch statement, so future addition of states
don't miss updating this logic.
2024-09-12 00:51:03 -04:00
Kefu Chai
3e84d43f93 treewide: use seastar::format() or fmt::format() explicitly
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.

that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:

```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
  265 |     return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
      |            ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
 4290 |     format(format_string<_Args...> __fmt, _Args&&... __args)
      |     ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
  143 | format(fmt::format_string<A...> fmt, A&&... a) {
      | ^
```

in this change, we

change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
  `seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
  because, `sstring::operator std::basic_string` would incur a deep
  copy.

we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-09-11 23:21:40 +03:00
Avi Kivity
aa1270a00c treewide: change assert() to SCYLLA_ASSERT()
assert() is traditionally disabled in release builds, but not in
scylladb. This hasn't caused problems so far, but the latest abseil
release includes a commit [1] that causes a 1000 insn/op regression when
NDEBUG is not defined.

Clearly, we must move towards a build system where NDEBUG is defined in
release builds. But we can't just define it blindly without vetting
all the assert() calls, as some were written with the expectation that
they are enabled in release mode.

To solve the conundrum, change all assert() calls to a new SCYLLA_ASSERT()
macro in utils/assert.hh. This macro is always defined and is not conditional
on NDEBUG, so we can later (after vetting Seastar) enable NDEBUG in release
mode.

[1] 66ef711d68

Closes scylladb/scylladb#20006
2024-08-05 08:23:35 +03:00
Botond Dénes
155acbb306 reader_concurrency_semaphore: execution_loop(): move maybe_admit_waiters() to the inner loop
Now that the CPU concurency limit is configurable, new reads might be
ready to execute right after the current one was executed. So move the
poll for admitting new reads into the inner loop, to prevent the
situation where the inner loop yields and a concurrent
do_wait_admission() finds that there are waiters (queued because at the
time they arrived to the semaphore, the _ready_list was not empty) but it
is is possible to admit a new read. When this happens the semaphore will
dump diagnostics to help debug the apparent contradiction, which can
generate a lot of log spam. Moving the poll into the inner loop prevents
the false-positive contradiction detection from firing.

Refs: scylladb/scylladb#19017

Closes scylladb/scylladb#19600
2024-07-04 17:47:52 +03:00
Botond Dénes
07c0a8a6f8 reader_concurrency_semaphore: wire in the configurable cpu concurrency
Before this patch, the semaphore was hard-wired to stop admission, if
there is even a single permit, which is in the need_cpu state.
Therefore, keeping the CPU concurrency at 1.
This patch makes use of the new cpu_concurrency parameter, which was
wired in in the last patches, allowing for a configurable amount of
concurrent need_cpu permits. This is to address workloads where some
small subset of reads are expected to be slow, and can hold up faster
reads behind them in the semaphore queue.
2024-06-27 09:57:11 -04:00
Botond Dénes
59faa6d4ff reader_concurrency_semaphore: add cpu_concurrency constructor parameter
In the case of the user semaphore, this receives the new
reader_concurrency_semaphore_cpu_limit config item.
Not used yet.
2024-06-27 09:57:11 -04:00
Avi Kivity
fdc1449392 treewide: rename flat_mutation_reader_v2 to mutation_reader
flat_mutation_reader_v2 was introduced in a pair of commits in 2021:

  e3309322c3 "Clone flat_mutation_reader related classes into v2 variants"
  08b5773c12 "Adapt flat_mutation_reader_v2 to the new version of the API"

as a replacement for flat_mutation_reader, using range_tombstone_change
instead of range_tombstone to represent represent range tombstones. See
those commits for more information.

The transition was incremental; the last use of the original
flat_mutation_reader was removed in 2022 in commit

  026f8cc1e7 "db: Use mutation_partition_v2 in mvcc"

In turn, flat_mutation_reader was introduced in 2017 in commit

  748205ca75 "Introduce flat_mutation_reader"

To transition from a mutation_reader that nested rows within
a partition in a separate stream, to a flat reader that streamed
partitions and rows in the same stream.

Here, we reclaim the original name and rename the awkward
flat_mutation_reader_v2 to mutation_reader.

Note that mutation_fragment_v2 remains since we still use the original
for compatibilty, sometimes.

Some notes about the transition:

 - files were also renamed. In one case (flat_mutation_reader_test.cc), the
   rename target already existed, so we rename to
    mutation_reader_another_test.cc.

 - a namespace 'mutation_reader' with two definitions existed (in
   mutation_reader_fwd.hh). Its contents was folded into the mutation_reader
   class. As a result, a few #includes had to be adjusted.

Closes scylladb/scylladb#19356
2024-06-21 07:12:06 +03:00
Botond Dénes
ba0cc29d82 reader_concurrency_semaphore: make count parameter live-update
So that the amount of count resources can be changed at run-time,
triggered by a e.g. a config change.
Previous constant-count based constructor is left intact, to avoid
patching all clients, as only a small subset will want the new
functionality.
2024-06-13 01:59:21 -04:00
Botond Dénes
3c813fbb99 reader_concurrency_semaphore: add range param to evict_inactive_reads_for_table()
When the new optional parameter has a value, evict only inactive reads,
whose ranges overlap with the provided range. The range for the inactive
read is provided in `register_inactive_read()`. If the inactive read has
no range, ovarlap is assumed and the read is evicted.
This will be used to evict all inactive reads that could potentially use
a cleaned-up tablet.
2024-04-30 01:31:08 -04:00
Botond Dénes
9e7a957ffb reader_concurrency_semaphore: allow storing a range with the inactive reader
This allows specifying the range the inactive read is reading from. To
be used in the next patch to selectively evict inactive reads whose
range overlaps with a certain (tablet) range.
2024-04-30 01:31:08 -04:00
Botond Dénes
67684308d1 reader_concurrency_semaphore: avoid detach() in inactive_read_handle::abandon()
inactive_read_handle::abandon() evicts and destroyes the inactive-read,
so it is not left behind. Currently, while doing so, it triggers the
inactive_read's own version of abandon(): detach(). The two has bad
interaction when the inactive_read_handle stores the last permit
instance, causing (so far benign) use-after-free. Prevent triggering
detach() to avoid this bad interaction altogether.
2024-04-30 01:31:08 -04:00
Kefu Chai
168ade72f8 treewide: replace formatter<std::string_view> with formatter<string_view>
in in {fmt} before v10, it provides the specialization of `fmt::formatter<..>`
for `std::string_view` as well as the specialization of `fmt::formatter<..>`
for `fmt::string_view` which is an implementation builtin in {fmt} for
compatibility of pre-C++17. and this type is used even if the code is
compiled with C++ stadandard greater or equal to C++17. also, before v10,
the `fmt::formatter<std::string_view>::format()` is defined so it accepts
`std::string_view`. after v10, `fmt::formatter<std::string_view>` still
exists, but it is now defined using `format_as()` machinery, so it's
`format()` method does not actually accept `std::string_view`, it
accepts `fmt::string_view`, as the former can be converted to
`fmt::string_view`.

this is why we can inherit from `fmt::formatter<std::string_view>` and
use `formatter<std::string_view>::format(foo, ctx);` to implement the
`format()` method with {fmt} v9, but we cannot do this with {fmt} v10,
and we would have following compilation failure:

```
FAILED: service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o
/home/kefu/.local/bin/clang++ -DFMT_DEPRECATED_OSTREAM -DFMT_SHARED -DSCYLLA_BUILD_MODE=release -DSEASTAR_API_LEVEL=7 -DSEASTAR_LOGGER_COMPILE_TIME_FMT -DSEASTAR_LOGGER_TYPE_STDOUT -DSEASTAR_SCHEDULING_GROUPS_COUNT=16 -DSEASTAR_SSTRING -DXXH_PRIVATE_API -DCMAKE_INTDIR=\"RelWithDebInfo\" -I/home/kefu/dev/scylladb -I/home/kefu/dev/scylladb/build/gen -I/home/kefu/dev/scylladb/seastar/include -I/home/kefu/dev/scylladb/build/seastar/gen/include -I/home/kefu/dev/scylladb/build/seastar/gen/src -ffunction-sections -fdata-sections -O3 -g -gz -std=gnu++20 -fvisibility=hidden -Wall -Werror -Wextra -Wno-error=deprecated-declarations -Wimplicit-fallthrough -Wno-c++11-narrowing -Wno-deprecated-copy -Wno-mismatched-tags -Wno-missing-field-initializers -Wno-overloaded-virtual -Wno-unsupported-friend -Wno-enum-constexpr-conversion -Wno-unused-parameter -ffile-prefix-map=/home/kefu/dev/scylladb=. -march=westmere -mllvm -inline-threshold=2500 -fno-slp-vectorize -U_FORTIFY_SOURCE -Werror=unused-result -MD -MT service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -MF service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o.d -o service/CMakeFiles/service.dir/RelWithDebInfo/topology_state_machine.cc.o -c /home/kefu/dev/scylladb/service/topology_state_machine.cc
/home/kefu/dev/scylladb/service/topology_state_machine.cc:254:41: error: no matching member function for call to 'format'
  254 |     return formatter<std::string_view>::format(it->second, ctx);
      |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
/usr/include/fmt/core.h:2759:22: note: candidate function template not viable: no known conversion from 'seastar::basic_sstring<char, unsigned int, 15>' to 'const fmt::basic_string_view<char>' for 1st argument
 2759 |   FMT_CONSTEXPR auto format(const T& val, FormatContext& ctx) const
      |                      ^      ~~~~~~~~~~~~
```

because the inherited `format()` method actually comes from
`fmt::formatter<fmt::string_view>`. to reduce the confusion, in this
change, we just inherit from `fmt::format<string_view>`, where
`string_view` is actually `fmt::string_view`. this follows
the document at
https://fmt.dev/latest/api.html#formatting-user-defined-types,
and since there is less indirection under the hood -- we do not
use the specialization created by `FMT_FORMAT_AS` which inherit
from `formatter<fmt::string_view>`, hopefully this can improve
the compilation speed a little bit. also, this change addresses
the build failure with {fmt} v10.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#18299
2024-04-19 07:44:07 +03:00
Botond Dénes
c6cff53771 reader_concurrency_semaphore: use variable reference for metrics
Instead of a functor, for those metrics that just return the value of an
existing member variable. This is ever so slightly more efficient than a
functor.

Closes scylladb/scylladb#17726
2024-03-11 20:47:04 +02:00
Kefu Chai
38ae52d5cd add fmt::formatter for reader_permit::state and reader_resources
before this change, we rely on the default-generated fmt::formatter created
from operator<<, but fmt v10 dropped the default-generated formatter.

in this change, we define formatters for

* reader_permit::state
* reader_resources

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#17707
2024-03-11 09:55:51 +02:00
Patryk Wrobel
1b6ab65c51 reader_concurrency_semaphore.cc: move stringstream content instead of copying it
C++20 introduced a new overload of std::stringstream::str()
that is selected when the mentioned member function is called
on r-value.

The new overload returns a string, that is move-constructed
from the underlying string instead of being copy-constructed.

This change applies std::move() on stringstream objects before
calling str() member function to avoid copying of the underlying
buffer.

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#17064
2024-01-31 09:31:50 +02:00
Michał Jadwiszczak
49544c47a1 reader_concurrency_semaphore: add name of semaphore in tracing messages 2024-01-23 10:25:34 +01:00
Lakshmi Narayanan Sreethar
76f0d5e35b reader_permit: store schema_ptr instead of raw schema pointer
Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.

Fixes #16180

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16658
2024-01-11 08:37:56 +02:00
Avi Kivity
7fce057cda database, reader_concurrency_sempaphore: deduplicate reader_concurrency_sempaphore metrics
reader_concurrency_sempaphore are triplicated: each metrics is registered
for streaming, user, and system classes.

To fix, just move the metrics registration from database to
reader_concurrency_sempaphore, so each reader_concurrency_sempaphore
instantiated will register its metrics (if its creator asked for it).

Adjust the names given to reader_concurrency_sempaphore so we don't
change the labels.

scylla-gdb is adjusted to support the new names.
2023-12-13 09:16:18 -05:00
Botond Dénes
e1b30f50be reader_concurrency_semaphore: add register_metrics constructor parameter
To be used in the next patch to control whether the semaphore registers
and exports metrics or not. We want to move metric registration to the
semaphore but we don't want all semaphores to export metrics. The
decision on whether a semaphore should or shouldn't export metrics
should be made on a case-by-case basis so this new parameter has no
default value (except for the for_tests constructor).
2023-12-13 06:25:45 -05:00
Botond Dénes
6829eaad39 reader_concurrency_semaphore: use utils::memory_limit_reached exception
When the kill limit is triggered.
2023-09-27 10:27:32 -04:00
Raphael S. Carvalho
914cbc11cf reader_concurrency_semaphore: Fix stop() in face of evictable reads becoming inactive
Scylla can crash due to a complicated interaction of service level drop,
evictable readers, inactive read registration path.

1) service level drop invoke stop of reader concurrency semaphore, which will
wait for in flight requests

2) turns out it stops first the gate used for closing readers that will
become inactive.

3) proceeds to wait for in-flight reads by closing the reader permit gate.

4) one of evictable reads take the inactive read registration path, and
finds the gate for closing readers closed.

5) flat mutation reader is destroyed, but finds the underlying reader was
not closed gracefully and triggers the abort.

By closing permit gate first, evictable readers becoming inactive will
be able to properly close underlying reader, therefore avoiding the
crash.

Fixes #15534.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#15535
2023-09-25 08:55:50 +03:00
Michał Chojnowski
2000a09859 reader_concurrency_semaphore: fix a deadlock between stop() and execution_loop()
Permits added to `_ready_list` remain there until
executed by `execution_loop()`.
But `execution_loop()` exits when `_stopped == true`,
even though nothing prevents new permits from being added
to `_ready_list` after `stop()` sets `_stopped = true`.

Thus, if there are reads concurrent with `stop()`,
it's possible for a permit to be added to `_ready_list`
after `execution_loop()` has already quit. Such a permit will
never be destroyed, and `stop()` will forever block on
`_permit_gate.close()`.

A natural solution is to dismiss `execution_loop()` only after
it's certain that `_ready_list` won't receive any new permits.
This is guaranteed by `_permit_gate.close()`. After this call completes,
it is certain that no permits *exist*.

After this patch, `execution_loop()` no longer looks at `_stopped`.
It only exits when `_ready_list_cv` breaks, and this is triggered
by `stop()` right after `_permit_gate.close()`.

Fixes #15198

Closes #15199
2023-08-29 08:18:49 +03:00
Kefu Chai
bab16eb30e treewide: remove #includes not use directly
for faster build times and clear inter-module dependencies, we
should not #includes headers not directly used. instead, we should
only #include the headers directly used by a certain compilation
unit.

in this change, the source files under "/compaction" directories
are checked using clangd, which identifies the cases where we have
an #include which is not directly used. all the #includes identified
by clangd are removed. because some source files rely on the incorrectly
included header file, those ones are updated to #include the header
file they directly use.

if a forward declaration suffice, the declaration is added instead.

see also https://clangd.llvm.org/guides/include-cleaner#unused-include-warning

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-07-18 17:36:31 +08:00
Botond Dénes
c4faa05888 reader_concurrency_semaphore: s/description/operation/ in diagnostics dumps
"description" is not the respective column contains, so fix the header.
2023-06-07 14:21:48 +03:00
Pavel Emelyanov
66e43912d6 code: Switch to seastar API level 7
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).

So the change consists of
- removing all usage of io_priority_class
- patching file_impl's inheritants to updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command

The first change is huge and was made semi-autimatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields

Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicatble)

Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile

The scylla-gdb.py update is a bit hairry -- it needs to use task queues
list for IO classes names and shares, but to detect it should it checks
for the "commitlog" group is present.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes #13963
2023-06-06 13:29:16 +03:00
Avi Kivity
97694d26c4 Merge 'reader_permit: minor improvements to resource consume/release safety' from Botond Dénes
This PR contains some small improvements to the safety of consuming/releasing resources to/from the semaphore:
* reader_permit: make the low-level `consume()/signal()` API private, making the only user (an RAII class) friend.
* reader_resources: split `reset()` into `noexcept` and potentially throwing variant.
* reader_resources::reset_to(): try harder to avoid calling `consume()` (when the new resource amount is smaller then the previous one)

Closes #13678

* github.com:scylladb/scylladb:
  reader_permit: resource_units::reset_to(): try harder to avoid calling consume()
  reader_permit: split resource_units::reset()
  reader_permit: make consume()/signal() API private
2023-05-14 14:14:23 +03:00
Botond Dénes
b790f14456 reader_concurrency_semaphore: execution_loop(): trigger admission check when _ready_list is empty
The execution loop consumes permits from the _ready_list and executes
them. The _ready_list usually contains a single permit. When the
_ready_list is not empty, new permits are queued until it becomes empty.
The execution loops relies on admission checks triggered by the read
releasing resouces, to bring in any queued read into the _ready_list,
while it is executing the current read. But in some cases the current
read might not free any resorces and thus fail to trigger an admission
check and the currently queued permits will sit in the queue until
another source triggers an admission check.
I don't yet know how this situation can occur, if at all, but it is
reproducible with a simple unit test, so it is best to cover this
corner-case in the off-chance it happens in the wild.
Add an explicit admission check to the execution loop, after the
_ready_list is exhausted, to make sure any waiters that can be admitted
with an empty _ready_list are admitted immediately and execution
continues.

Fixes: #13540

Closes #13541
2023-05-08 17:11:41 +03:00
Botond Dénes
c1e8e86637 reader_concurrency_semaphore: reader_permit: clean-up after failed memory requests
When requesting memory via `reader_permit::request_memory()`, the
requested amount is added to `_requested_memory` member of the permit
impl. This is because multiple concurrent requests may be blocked and
waiting at the same time. When the requests are fulfilled, the entire
amount is consumed and individual requests track their requested amount
with `resource_units` to release later.
There is a corner-case related to this: if a reader permit is registered
as inactive while it is waiting for memory, its active requests are
killed with `std::bad_alloc`, but the `_requested_memory` fields is not
cleared. If the read survives because the killed requests were part of
a non-vital background read-ahead, a later memory request will also
include amount from the failed requests. This extra amount wil not be
released and hence will cause a resource leak when the permit is
destroyed.
Fix by detecting this corner case and clearing the `_requested_memory`
field. Modify the existing unit test for the scenario of a permit
waiting on memory being registered as inactive, to also cover this
corner case, reproducing the bug.

Fixes: #13539

Closes #13679
2023-05-07 14:06:51 +03:00
Avi Kivity
f125a3e315 Merge 'tree: finish the reader_permit state renames' from Botond Dénes
In https://github.com/scylladb/scylladb/pull/13482 we renamed the reader permit states to more descriptive names. That PR however only covered only the states themselves and their usages, as well as the documentation in `docs/dev`.
This PR is a followup to said PR, completing the name changes: renaming all symbols, names, comments etc, so all is consistent and up-to-date.

Closes #13573

* github.com:scylladb/scylladb:
  reader_concurrency_semaphore: misc updates w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update permit members w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update RAII state guard classes w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update API w.r.t. recent permit state name changes
  reader_concurrency_semaphore: update stats w.r.t. recent permit state name changes
2023-05-04 18:29:04 +03:00
Kefu Chai
48387a5a9a reader_concurrency_semaphore: fix signed/unsigned comparision
a signed/unsigned comparsion can overflow. and GCC-13 rightly points
this out. so let's use `std::cmp_greater_equal()` when comparing
unsigned and signed for greater-or-equal.

```
/home/kefu/dev/scylladb/reader_concurrency_semaphore.cc:931:76: error: comparison of integer expressions of different signedness: ‘long int’ and ‘uint64_t’ {aka ‘long unsigned int’} [-Werror=sign-compare]
  931 |     if (_resources.memory <= 0 && (consumed_resources().memory + r.memory) >= get_kill_limit()) [[unlikely]] {
      |                                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
```

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-04-29 17:02:25 +08:00
Botond Dénes
88c19b23dc reader_permit: resource_units::reset_to(): try harder to avoid calling consume()
Currently, the `reset_to()` implementation calls `consume(new_amount)` (if
not zero), then calls `signal(old_amount)`. This means that even if
`reset_to()` is a net reduction in the amount of resources, there is a
call to `consume()` which can now potentially throw.
Add a special case for when the new amount of resources is strictly
smaller than the old amount. In this case, just call `signal()` with the
difference. This not just avoids a potential `std::bad_alloc`, but also
helps relieving memory pressure when this is most needed, by not failing
calls to release memory.
2023-04-26 07:41:57 -04:00