Commit Graph

1122 Commits

Author SHA1 Message Date
Tomasz Grabiec
8fa704972f loading_cache: Make invalidation take immediate effect
There are two issues with current implementation of remove/remove_if:

  1) If it happens concurrently with get_ptr(), the latter may still
  populate the cache using value obtained from before remove() was
  called. remove() is used to invalidate caches, e.g. the prepared
  statements cache, and the expected semantic is that values
  calculated from before remove() should not be present in the cache
  after invalidation.

  2) As long as there is any active pointer to the cached value
  (obtained by get_ptr()), the old value from before remove() will be
  still accessible and returned by get_ptr(). This can make remove()
  have no effect indefinitely if there is persistent use of the cache.

One of the user-perceived effects of this bug is that some prepared
statements may not get invalidated after a schema change and still use
the old schema (until next invalidation). If the schema change was
modifying UDT, this can cause statement execution failures. CQL
coordinator will try to interpret bound values using old set of
fields. If the driver uses the new schema, the coordinaotr will fail
to process the value with the following exception:

  User Defined Type value contained too many fields (expected 5, got 6)

The patch fixes the problem by making remove()/remove_if() erase old
entries from _loading_values immediately.

The predicate-based remove_if() variant has to also invalidate values
which are concurrently loading to be safe. The predicate cannot be
avaluated on values which are not ready. This may invalidate some
values unnecessarily, but I think it's fine.

Fixes #10117

Message-Id: <20220309135902.261734-1-tgrabiec@scylladb.com>
2022-03-09 16:13:07 +02:00
Piotr Grabowski
0544973b15 utils/rjson.cc: ignore buggy GCC warning
When compiling utils/rjson.cc on GCC, the compilation triggers the
following warning (which becomes a compilation error):

utils/rjson.cc: In function ‘seastar::future<> rjson::print(const value&, seastar::output_stream<char>&, size_t)’:
utils/rjson.cc:239:15: error: typedef ‘using Ch = char’ locally defined but not used [-Werror=unused-local-typedefs]
  239 |         using Ch = char;
      |               ^~

This warning is a false positive. 'using Ch' is actually used internally
by rapidjson::Writer. This is a known GCC bug
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61596), which has not been
fixed since 2014.

I disabled this warning only locally as other code is not affected by
this warning and no other code already disables this warning.

Note that there are some GCC compilation problems still left apart
from this one.

Closes #10158
2022-03-02 19:10:58 +02:00
Nadav Har'El
fa7a302130 cross-tree: split coordinator_result from exceptions.hh
Recently, coordinator_result was introduced as an alternative for
exceptions. It was placed in the main "exceptions/exceptions.hh" header,
which virtually every single source file in Scylla includes.
But unfortunately, it brings in some heavy header files and templates,
leading to a lot of wasted build time - ClangBuildAnalyzer measured that
we include exceptions.hh in 323 source files, taking almost two seconds
each on average.

In this patch, we split the coordinator_result feature into a separate
header file, "exceptions/coordinator_result", and only the few places
which need it include the header file. Unfortunately, some of these
few places are themselves header, so the new header file ends up being
included in 100 source files - but 100 is still much less than 323 and
perhaps we can reduce this number 100 later.

After this patch, the total Scylla object-file size is reduced by 6.5%
(the object size is a proxy for build time, which I didn't directly
measure). ClangBuildAnalyzer reports that now each of the 323 includes
of exceptions.hh only takes 80ms, coordinator_result.hh is only included
100 times, and virtually all the cost to include it comes from Boost's
result.hh (400ms per inclusion).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220228204323.1427012-1-nyh@scylladb.com>
2022-03-02 10:12:57 +02:00
Nadav Har'El
7cf2e5ee5c Merge 'directory_lister: drop abort method and simplify close semantics' from Benny Halevy
This series contains:
- lister: move to utils
  - tidy up the clutter in the root dir
Based on Avi's feedback to `[PATCH 1/1] utils: directory_lister: close: always abort queue` that was sent to the mailing list:
  - directory_lister: drop abort method
  - lister: do not require get after close to fail
- test: lister_test: test_directory_lister_close simplify indentation
  - cosmetic cleanup

Closes #10142

* github.com:scylladb/scylla:
  test: lister_test: test_directory_lister_close simplify indentation
  lister: do not require get after close to fail
  directory_lister: drop abort method
  lister: move to utils
2022-03-01 16:23:47 +02:00
Botond Dénes
f8da0a8d1e Merge "Conceptualize some static assertions" From Pavel Emelyanov
"
Some templates put constraints onto the involved types with the help of
static assertions. Having them in form of concepts is much better.

tests: unit(dev)
"

* 'br-static-assert-to-concept' of https://github.com/xemul/scylla:
  sstables: Remove excessive type-match assertions
  mutation_reader: Sanitize invocable asserion and concept
  code: Convert is_future result_of assertions into invoke_result concept
  code: Convert is_same+result_of assertions into invocable concepts
  code: Convert nothrow construction assertions into concepts
  code: Convert is_integral assertions to concepts
2022-02-28 13:58:01 +02:00
Benny Halevy
41d097ef47 lister: do not require get after close to fail
Currently, the lister test expected get() to
always fail after close(), but it unexpectedly
succeeded if get() was never called before close,
as seen in https://jenkins.scylladb.com/view/master/job/scylla-master/job/next/4587/artifact/testlog/x86_64_debug/lister_test.test_directory_lister_close.4001.log
```
random-seed=1475104835
Generated 719 dir entries
Getting 565 dir entries
Closing directory_lister
Getting 0 dir entries
Closing directory_lister
test/boost/lister_test.cc(190): fatal error: in "test_directory_lister_close": exception std::exception expected but not raised
```

This change relaxes this requirement to keep
close() simple, based on Avi's feedback:

> The user should call close(), and not do it while get() is running, and
> that's it.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-02-28 12:59:08 +02:00
Benny Halevy
00327bfae3 directory_lister: drop abort method
Based on Avi's feedback:
> We generally have a public abort() only if we depend on an external
> event (like data from a tcp socket) that we don't control. But here
> there are no such external events. So why have a public abort() at all?

If needed in the future, we can consider adding
get(abort_source&) to allow aborting get() via
an external event.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-02-28 12:52:47 +02:00
Benny Halevy
ebbbf1e687 lister: move to utils
There's nothing specific to scylla in the lister
classes, they could (and maybe should) be part of
the seastar library.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2022-02-28 12:36:03 +02:00
Tomasz Grabiec
1d75a8c843 lsa: Abort when trying to free a standard allocator object not
allocated through the region

It indicates alloc-dealloc mismatch, and can cause other problems in
the systems like unable to reclaim memory. We want to catch this at
the deallocation site to be able to quickly indentify the offender.

Misbehavior of this sort can cause fake OOMs due to underflow of
_non_lsa_memory_in_use. When it underflows enough,
shard_segment_pool.total_memory() will become 0 and memory reclamation
will stop doing anything.

Refs #10056
2022-02-25 01:42:15 +01:00
Tomasz Grabiec
9dd4153c16 lsa: Abort when _non_lsa_memory_in_use goes negative
It indicates alloc-dealloc mismatch, and can cause other problems in
the systems like unable to reclaim memory. Catch early.

Refs #10056
2022-02-25 01:42:15 +01:00
Pavel Emelyanov
ffbf19ee3c code: Convert is_future result_of assertions into invoke_result concept
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:47:32 +03:00
Pavel Emelyanov
645896335d code: Convert is_same+result_of assertions into invocable concepts
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:46:10 +03:00
Pavel Emelyanov
063da81ab7 code: Convert nothrow construction assertions into concepts
The small_vector also has N>0 constraint that's also converted

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:44:50 +03:00
Pavel Emelyanov
b8401f2ddd code: Convert is_integral assertions to concepts
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-02-24 19:44:29 +03:00
Tomasz Grabiec
e68cf55514 utils: cached_file: Fix alloc-dealloc mismatch during eviction
on_evicted() is invoked in the LSA allocator context, set in the
reclaimer callback instaled by the cache_tracker. However,
cached_pages are allocated in the standard allocator context (note:
page content is allocated inside LSA via lsa_buffer). The LSA region
will happilly deallocate these, thinking that they these are large
objects which were delegated to the standard allocator. But the
_non_lsa_memory_in_use metric will underflow. When it underflows
enough, shard_segment_pool.total_memory() will become 0 and memory
reclamation will stop doing anything, leading to apparent OOM.

The fix is to switch to the standard allocator context inside
cached_page::on_evicted(). evict_range() was also given the same
treatment as a precaution, it currently is only invoked in the
standard allocator context.

Fixes #10056
2022-02-23 18:38:05 +01:00
Avi Kivity
75fb45df1b Merge 'Propagate CQL coordinator timeouts and failures for reads' from Piotr Dulikowski
This PR propagates the read coordinator logic so that read timeout and read failure exceptions are propagated without throwing on the coordinator side.

This PR is only concerned with exceptions which were originally thrown by the coordinator (in read resolvers). Exceptions propagated through RPC and RPC timeouts will still throw, although those exceptions will be caught and converted into exceptions-as-values by read resolvers.

This is a continuation of work started in #10014.

Results of `perf_simple_query --smp 1 --operations-per-shard 1000000` (read workload), compared with merge base (10880fb0a7):

```
BEFORE:
125085.13 tps ( 80.2 allocs/op,  12.2 tasks/op,   49010 insns/op)
125645.88 tps ( 80.2 allocs/op,  12.2 tasks/op,   49008 insns/op)
126148.85 tps ( 80.2 allocs/op,  12.2 tasks/op,   49005 insns/op)
126044.40 tps ( 80.2 allocs/op,  12.2 tasks/op,   49005 insns/op)
125799.75 tps ( 80.2 allocs/op,  12.2 tasks/op,   49003 insns/op)

AFTER:
127557.21 tps ( 80.2 allocs/op,  12.2 tasks/op,   49197 insns/op)
127835.98 tps ( 80.2 allocs/op,  12.2 tasks/op,   49198 insns/op)
127749.81 tps ( 80.2 allocs/op,  12.2 tasks/op,   49202 insns/op)
128941.17 tps ( 80.2 allocs/op,  12.2 tasks/op,   49192 insns/op)
129276.15 tps ( 80.2 allocs/op,  12.2 tasks/op,   49182 insns/op)
```

The PR does not introduce additional allocations on the read happy-path. The number of instructions used grows by about 200 insns/op. The increase in TPS is probably just a measurement error.

Closes #10092

* github.com:scylladb/scylla:
  indexed_table_select_statement: return some exceptions as exception messages
  result_combinators: add result_wrap_unpack
  select_statement: return exceptions as errors in execute_without_checking_exception_message
  select_statement: return exceptions without throwing in do_execute
  select_statement: implement execute_without_checking_exception_message
  select_statement: introduce helpers for working with failed results
  query_pager: resultify relevant methods
  storage_proxy: resultify (do_)query
  storage_proxy: resultify query_singular
  storage_proxy: propagate failed results through query_partition_key_range
  storage_proxy: resultify query_partition_key_range_concurrent
  storage_proxy: modify handle_read_error to also handle exception containers
  abstract_read_executor: return result from execute()
  abstract_read_executor: return and handle result from has_cl()
  storage_proxy: resultify handling errors from read-repair
  abstract_read_executor::reconcile: resultise handling of data_resolver->done()
  abstract_read_executor::execute: resultify handling of data_resolver->done()
  result_combinators: add result_discard_value
  abstract_read_executor: resultify _result_promise
  abstract_read_executor: return result from done()
  abstract_read_resolver: fail promises by passing exception as value
  abstract_read_resolver: resultify promises
  exceptions: make it possible to return read_{timeout,failure}_exception as value
  result_try: add as_inner/clone_inner to handle types
  result_try: relax ConvertWithTo constraint
  exception_container: switch impl to std::shared_ptr and make copyable
  result_loop: add result_repeat
  result_loop: add result_do_until
  result_loop: add result_map_reduce
  utils/result: add utilities for checking/creating rebindable results
2022-02-22 20:58:25 +03:00
Piotr Dulikowski
091b20019b result_combinators: add result_wrap_unpack
Adds a helper combinator utils::result_wrap_unpack which, in contrast to
utils::result_wrap, uses futurize_apply instead of futurize_invoke to
call the wrapped callable.

In short, if utils::result_wrap is used to adapt code like this:

    f.then([] {})
      ->
    f_result.then(utils::result_wrap([] {}))

Then utils::result_wrap_unpack works for the following case:

    f.then_unpack([] (arg1, arg2) {})
      ->
    f_result.then(utils::result_wrap_unpack([] (arg1, arg2) {}))
2022-02-22 16:25:21 +01:00
Piotr Dulikowski
a304fbfed3 result_combinators: add result_discard_value
Adds a utils::result_discard_value, which is an alternative to
future::discard_result which just ignores the "success" value of the
provided result and does not ignore the exception.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
1e1f5b4a48 result_try: add as_inner/clone_inner to handle types
Adds two methods to result_try's exception handles:

- as_inner: returns a {l,r}-value reference either to the exception
  container, or the exception_ptr. This allows to use them in operations
  which work on both types, e.g. logging.
- clone_inner: returns a copy of the underlying exception container or
  exception ptr.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
1ed416906e result_try: relax ConvertWithTo constraint
Currently, the catch handlers in result_futurize_try are required to
return a future, although they are always being called with
seastar::futurize_invoke, so if their result is not future it could be
converted to one anyway. This commit relaxes the ConvertsWithTo
constraint in order to allow this conversion.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
e87cf08591 exception_container: switch impl to std::shared_ptr and make copyable
The exception_container is supposed to be a cheaper, but possibly harder
to use alternative to std::exception_ptr. Before this commit, the
exception was kept behind foreign_ptr<std::unique_ptr<>> so that moving
the container is very cheap. However, the original std::exception_ptr
supports copying in a thread-safe manner, and it turns out that some of
the read coordinator logic intentionally copies the pointer in order to
be able to fail two different promises with the same exception.

The pointer type is changed to std::shared_ptr. Although it uses atomics
for reference counting, this is also probably what std::exception_ptr
does, so the performance should not be worse. The exception stored
inside the container is immutable, so this allows for a non-throwing
implementation of copying.

To encourage moves instead of copying, the copy constructor is deleted
and instead the `clone()` method should be used if it is really
necessary.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
7afea88dfc result_loop: add result_repeat
Adds a result-aware counterpart to seastar::repeat. The new function
does not base on seastar::repeat, but rather is a rewrite of the
original (using a coroutine instead of an open-coded task). The main
consequence of using a coroutine is that exceptions from AsyncAction
need to be thrown once more.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
32cbc89779 result_loop: add result_do_until
Adds a result-aware counterpart to seastar::do_until. The new function
does not base on seastar::do_until, but rather is a rewrite of the
original (using a coroutine instead of an open-coded task). The main
consequence of using a coroutine is that exceptions from StopCondition
or AsyncAction need to be thrown once more.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
4f0a98a829 result_loop: add result_map_reduce
Adds result-aware counterparts to all seastar::map_reduce overloads.

Fortunately, it was possible to implement the functions by basing them
on seastar::map_reduce and get the same number of allocation. The only
exception happens when reducer::get() returns a non-ready future, which
doesn't seem to happen on the read coordinator path.
2022-02-22 16:08:52 +01:00
Piotr Dulikowski
b3a0480439 utils/result: add utilities for checking/creating rebindable results
Adds:

- ResultRebindableTo<L, R>: concept which is satisfied by a pair of
  results which do not necessarily share the same value, but have the
  same error and policy types; a failed result L can be converted to a
  failed result R.
- rebind_result<T, R>: given a value type T and another result R,
  returns a result which can hold T as value and both the same error and
  policy as R.
2022-02-22 16:08:45 +01:00
Avi Kivity
d1a394fd97 loading_cache: fix indentation of timestamped_val and two nested type aliases
timestamped_val (and two other type aliases) are nested inside loading_cache,
but indented as if they were top-level names. Adjust the indent to
avoid confusion.

Closes #10118
2022-02-22 12:20:36 +02:00
Benny Halevy
489e50ef3a utils: uuid: make operator bool explicit
Following up on 69fcc053bb

To prevent unintentional implicit conversions
e.g. to a number.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220216081623.830627-1-bhalevy@scylladb.com>
2022-02-16 10:19:47 +02:00
Piotr Dulikowski
742f2abfd8 exception_container: do not throw in accept
This commit changes the behavior of `exception_container::accept`. Now,
instead of throwing an `utils::bad_exception_container_access` exception
when the container is empty, the provided visitor is invoked with that
exception instead. There are two reasons for this change:

- The exception_container is supposed to allow handling exceptions
  without using the costly C++'s exception runtime. Although an empty
  container is an edge case, I think it the new behavior is more aligned
  with the class' purpose. The old behavior can be simulated by
  providing a visitor which throws when called with bad access
  exception.

- The new behavior fixes a bug in `result_try`/`result_futurize_try`.
  Before the change, if the `try` block returned a failed result with an
  empty exception container, a bad access exception would either be
  thrown or returned as an exceptional future without being handled by
  the `catch` clauses. Although nobody is supposed to return such
  result<>s on purpose, a moved out result can be returned by accident
  and it's important for the exception handling logic to be correct in
  such a situation.

Tests: unit(dev)

Closes #10086
2022-02-16 10:06:10 +02:00
Benny Halevy
69fcc053bb utils: uuid: add null_uuid
and respective bool predecate and operator
and unit test.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220215113438.473400-1-bhalevy@scylladb.com>
2022-02-15 18:02:54 +02:00
Avi Kivity
7cc43f8aa8 Merge 'utils: add result_try and result_futurize_try' from Piotr Dulikowski
Adds `utils::result_try` and `utils::result_futurize_try` - functions which allow to convert existing try..catch blocks into a version which handles C++ exceptions, failed results with exception containers and, depending on the function variant, exceptional futures using the same exception handling logic.

For example, you can convert the following try..catch block:

    try {
        return a_function_that_may_throw();
    } catch (const my_exception& ex) {
        return 123;
    } catch (...) {
        throw;
    }

...to this:

    return utils::result_try([&] {
        return a_function_that_may_throw_or_return_a_failed_result();
    },  utils::result_catch<my_exception>([&] (const Ex&) {
        return 123;
    }), utils::result_catch_dots([&] (auto&& handle) {
        return handle.into_result();
    });

Similarly, `utils::result_futurize_try` can be used to migrate `then_wrapped` or `f.handle_exception()` constructs.

As an example of the usability of the new constructs, two places in the current code which need to simultaneously handle exceptions and failed results are converted to use `result_try` and `result_futurize_try`.

Results of `perf_simple_query --smp 1 --operations-per-shard 1000000 --write`:

```
127041.61 tps ( 67.2 allocs/op,  14.2 tasks/op,   52422 insns/op)
126958.60 tps ( 67.2 allocs/op,  14.2 tasks/op,   52409 insns/op)
127088.37 tps ( 67.2 allocs/op,  14.2 tasks/op,   52411 insns/op)
127560.84 tps ( 67.2 allocs/op,  14.2 tasks/op,   52424 insns/op)
127826.61 tps ( 67.2 allocs/op,  14.2 tasks/op,   52406 insns/op)

126801.02 tps ( 67.2 allocs/op,  14.2 tasks/op,   52420 insns/op)
125371.51 tps ( 67.2 allocs/op,  14.2 tasks/op,   52425 insns/op)
126498.51 tps ( 67.2 allocs/op,  14.2 tasks/op,   52427 insns/op)
126359.41 tps ( 67.2 allocs/op,  14.2 tasks/op,   52423 insns/op)
126298.27 tps ( 67.2 allocs/op,  14.2 tasks/op,   52410 insns/op)
```

The number of tasks and allocations is unchanged. The number of instructions per operations seems similar, it may have increased slightly (by 10-20) but it's hard to tell for sure because of the noisiness of the results.

Tests: unit(dev)

Closes #10045

* github.com:scylladb/scylla:
  transport: use result_try in process_request_one
  storage_proxy: use result_futurize_try in mutate_end
  storage_proxy: temporarily throw exception from result in mutate_end
  utils: add result_try and result_futurize_try
2022-02-13 19:38:13 +02:00
Piotr Dulikowski
dd3284ec38 utils/result: optimize result_parallel_for_each
It now resembles the original parallel_for_each more, but uses a
coroutine instead of a custom `task` to collect not-ready futures.
Although the usage of a coroutine saves on allocations, the drawback is
that there is currently no way to co_await on a future and handle its
exception without throwing or without unconditionally allocating a
then_wrapped or handle_exception continuation - so it introduces a
rethrow.

Furthermore, now failed results and exceptions are treated as equals.
Previously, in case one parallel invocation returned failed result and
another returned an exception, the exception would always be returned.
Now, the failed result/exception of the invocation with the lowest index
is always preferred, regardless of the failure type.

The reimplementation manages to save about 350-400 instructions, one
task and one allocation in the perf_simple_query benchmark in write
mode.

Results from `perf_simple_query --smp 1 --operations-per-shard 1000000
--write` (before vs. after):

```
126872.54 tps ( 67.2 allocs/op,  14.2 tasks/op,   52404 insns/op)
126532.13 tps ( 67.2 allocs/op,  14.2 tasks/op,   52408 insns/op)
126864.99 tps ( 67.2 allocs/op,  14.2 tasks/op,   52428 insns/op)
127073.10 tps ( 67.2 allocs/op,  14.2 tasks/op,   52404 insns/op)
126895.85 tps ( 67.2 allocs/op,  14.2 tasks/op,   52411 insns/op)

127894.02 tps ( 66.2 allocs/op,  13.2 tasks/op,   52036 insns/op)
127671.51 tps ( 66.2 allocs/op,  13.2 tasks/op,   52042 insns/op)
127541.42 tps ( 66.2 allocs/op,  13.2 tasks/op,   52044 insns/op)
127409.10 tps ( 66.2 allocs/op,  13.2 tasks/op,   52052 insns/op)
127831.30 tps ( 66.2 allocs/op,  13.2 tasks/op,   52043 insns/op)
```

Test: unit(dev), unit(result_utils_test, debug)
2022-02-10 18:19:08 +01:00
Piotr Dulikowski
6abeec6299 utils/result: split into combinators and loop file
Segregates result utilities into:

- result.hh - basic definitions related to results with exception
  containers,
- result_combinators.hh - combinators for working with results in
  conjunction with futures,
- result_loop.hh - loop-like combinators, currently has only
  result_parallel_for_each.

The motivation for the split is:

1. In headers, usually only result.hh will be needed, so no need to
   force most .cc files to compile definitions from other files,
2. Less files need to be recompiled when a combinator is added to
   result_combinators or result_loop.

As a bonus, `result_with_exception` was moved from `utils::internal` to
just `utils`.
2022-02-10 18:19:05 +01:00
Piotr Dulikowski
8d52ceca50 utils: add result_try and result_futurize_try
Adds result_try and result_futurize_try - functions which allow to
convert existing try..catch blocks into a version which handles C++
exceptions, failed results with exception containers and, depending on
the function variant, exceptional futures.
2022-02-10 17:35:32 +01:00
Piotr Dulikowski
11cb670881 utils: add result utils
Adds a number of utilities for working with boost::outcome::result
combined with exception_container. The utilities are meant to help with
migration of the existing code to use the boost::outcome::result:

- `exception_container_throw_policy` - a NoValuePolicy meant to be used
  as a template parameter for the boost::outcome::result. It protects
  the caller of `result::value()` and `result::error()` methods - if the
  caller wishes to get a value but the result has an error
  (exception_container in our case), the exception in the container will
  be thrown instead. In case it's the other way around,
  boost::outcome::bad_result_access is thrown.
- `result_parallel_for_each` - a version of `parallel_for_each` which is
  aware of results and returns a failed result in case any of the
  parallel invocations return a failed result.
- `result_into_future` - converts a result into a future. If the result
  holds a value, converts it into make_ready_future; if it holds an
  exception, the exception is returned as make_exception_future.
- `then_ok_result` takes a `future<T>` and converts it into
  a `future<result<T>>`.
- `result_wrap` adapts a callable of type `T -> future<result<T>>` and
  returns a callable of type `result<T> -> future<result<T>>`.
2022-02-08 11:08:42 +01:00
Piotr Dulikowski
80f6224959 utils: add exception_container
Adds `exception_container` - a helper type used to hold exceptions as a
value, without involving the std::exception_ptr.

The motivation behind this type is that it allows inspecting exception's
type and value without having to rethrow that exception and catch it,
unlike std::exception_ptr. In our current codebase, some exception
handling paths need to rethrow the exception multiple times in order to
account it into metrics or encode it as an error response to the CQL
client. Some types of exceptions can be thrown very frequently in case
of overload (e.g. timeouts) and inspecting those exceptions with
rethrows can make the overload even worse. For those kinds of exceptions
it is important to handle them as cheaply as possible, and
exception_container used with conjunction with boost::outcome::result
can help achieve that.
2022-02-04 20:18:00 +01:00
Tomasz Grabiec
b734615f51 util: cached_file: Fix corruption after memory reclamation was triggered from population
If memory reclamation is triggered inside _cache.emplace(), the _cache
btree can get corrupted. Reclaimers erase from it, and emplace()
assumes that the tree is not modified during its execution. It first
locates the target node and then does memory allocation.

Fix by running emplace() under allocating section, which disables
memory reclamation.

The bug manifests with assert failures, e.g:

./utils/bptree.hh:1699: void bplus::node<unsigned long, cached_file::cached_page, cached_file::page_idx_less_comparator, 12, bplus::key_search::linear, bplus::with_debug::no>::refill(Less) [Key = unsigned long, T = cached_file::cached_page, Less = cached_file::page_idx_less_comparator, NodeSize = 12, Search = bplus::key_search::linear, Debug = bplus::with_debug::no]: Assertion `p._kids[i].n == this' failed.

Fixes #9915

Message-Id: <20220130175639.15258-1-tgrabiec@scylladb.com>
2022-01-30 19:57:35 +02:00
Tomasz Grabiec
7ee79fa770 logalloc: Add more logging
Message-Id: <20220127232009.314402-1-tgrabiec@scylladb.com>
2022-01-28 14:12:33 +02:00
Nadav Har'El
1ce73c2ab3 Merge 'utils::is_timeout_exception: Ensure we handle nested exception types' from Calle Wilund
Fixes #9922

storage proxy uses is_timeout_exception to traverse different code paths.
a6202ae079 broke this (because bit rot and
intermixing), by wrapping exception for information purposes.

This adds check of nested types in exception handling, as well as a test
for the routine itself.

Closes #9932

* github.com:scylladb/scylla:
  database/storage_proxy: Use "is_timeout_exception" instead of catch match
  utils::is_timeout_exception: Ensure we handle nested exception types
2022-01-18 23:49:41 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Calle Wilund
97bb1be6f7 utils::is_timeout_exception: Ensure we handle nested exception types
Fixes #9922

storage proxy uses is_timeout_exception to traverse different code paths.
a6202ae079 broke this (because bit rot and
intermixing), by wrapping exception for information purposes.

This adds check of nested types in exception handling, as well as a test
for the routine itself.
2022-01-17 08:43:41 +00:00
Avi Kivity
63d254a8d2 Merge 'gms, service: futurize and coroutinize gossiper-related code' from Pavel Solodovnikov
This series greatly reduces gossipers' dependence on `seastar::async` (yet, not completely).

`i_endpoint_state_change_subscriber` callbacks are converted to return futures (again, to get rid of `seastar::async` dependency), all users are adjusted appropriately (e.g. `storage_service`, `cdc::generation_service`, `streaming::stream_manager`, `view_update_backlog_broker` and `migration_manager`).
This includes futurizing and coroutinizing the whole function call chain up to the `i_endpoint_state_change_subscriber` callback functions.

To aid the conversion process, a non-`seastar::async` dependent variant of `utils::atomic_vector::for_each` is introduced (`for_each_futurized`). A different name is used to clearly distinguish converted and non-converted code, so that the last step (remove `seastar::async()` wrappers around callback-calling code in gossiper) is easier. This is left for a follow-up series, though.

Tests: unit(dev)

Closes #9844

* github.com:scylladb/scylla:
  service: storage_service: coroutinize `set_gossip_tokens`
  service: storage_service: coroutinize `leave_ring`
  service: storage_service: coroutinize `handle_state_left`
  service: storage_service: coroutinize `handle_state_leaving`
  service: storage_service: coroutinize `handle_state_removing`
  service: storage_service: coroutinize `do_drain`
  service: storage_service: coroutinize `shutdown_protocol_servers`
  service: storage_service: coroutinize `excise`
  service: storage_service: coroutinize `remove_endpoint`
  service: storage_service: coroutinize `handle_state_replacing`
  service: storage_service: coroutinize `handle_state_normal`
  service: storage_service: coroutinize `update_peer_info`
  service: storage_service: coroutinize `do_update_system_peers_table`
  service: storage_service: coroutinize `update_table`
  service: storage_service: coroutinize `handle_state_bootstrap`
  service: storage_service: futurize `notify_*` functions
  service: storage_service: coroutinize `handle_state_replacing_update_pending_ranges`
  repair: row_level_repair_gossip_helper: coroutinize `remove_row_level_repair`
  locator: reconnectable_snitch_helper: coroutinize `reconnect`
  gms: i_endpoint_state_change_subscriber: make callbacks to return futures
  utils: atomic_vector: introduce future-returning `for_each` function
  utils: atomic_vector: rename `for_each` to `thread_for_each`
  gms: gossiper: coroutinize `start_gossiping`
  gms: gossiper: coroutinize `force_remove_endpoint`
  gms: gossiper: coroutinize `do_status_check`
  gms: gossiper: coroutinize `remove_endpoint`
2022-01-13 23:09:02 +02:00
Nadav Har'El
23e93a26b3 Merge 'Alternator: stream results + chunk results to remove large allocations' from Calle Wilund
Refs: #9555

When running the "Kraken" dynamodb streams test to provoke the issued observed by QA, I noticed on my setup mainly two things: Large allocation stalls (+ warnings) and timeouts on read semaphores in DB.

This tries to address the first issue, partly by making query_result_view serialization using chunked vector instead of linear one, and by introducing a streaming option for json return objects, avoiding linearizing to string before wire.
Note that the latter has some overhead issues of its own, mainly data copying, since we essentially will be triple buffering (local, wrapped http stream, and final output stream). Still, normal string output will typically do a lot of realloc which is potential extra copies as well, so...

This is not really performance tested, but with these tweaks I no longer get large alloc stalls at least, so that is a plus. :-)

Closes #9713

* github.com:scylladb/scylla:
  alternator::executor: Use streamed result for scan etc if large result
  alternator::streams: Use streamed result in get_records if large result
  executor/server: Add routine to make stream object return
  rjson: Add print to stream of rjson::value
  query_idl: Make qr_partition::rows/query_result::partitions chunked
2022-01-12 15:53:31 +02:00
Calle Wilund
e2d7225df8 rjson: Add print to stream of rjson::value
Allows direct stream of object to seastar::stream. While not 100%
efficient, it has the advantage of avoiding large allocations
(long string) for huge result messages.
2022-01-12 13:34:49 +00:00
Avi Kivity
4118f2d8be treewide: replace deprecated seastar::later() with seastar::yield()
seastar::later() was recently deprecated and replaced with two
alternatives: a cheap seastar::yield() and an expensive (but more
powerful) seastar::check_for_io_immediately(), that corresponds to
the original later().

This patch replaces all later() calls with the weaker yield(). In
all cases except one, it's unambiguously correct. In one case
(test/perf scheduling_latency_measurer::stop()) it's not so ambiguous,
since check_for_io_immediately() will additionally force a poll and
so will cause more work to be done (but no additional tasks to be
executed). However, I think that any measurement that relies on
the measuring the work on the last tick to be inaccurate (you need
thousands of ticks to get any amount of confidence in the
measurement) that in the end it doesn't matter what we pick.

Tests: unit (dev)

Closes #9904
2022-01-12 12:19:19 +01:00
Pavel Solodovnikov
adf7138b3b utils: atomic_vector: introduce future-returning for_each function
Introduce a variant of `for_each` function not requiring
`seastar::async` context.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-01-11 09:29:12 +03:00
Pavel Solodovnikov
b958e85c54 utils: atomic_vector: rename for_each to thread_for_each
To emphasize that the function requires `seastar::thread`
context to function properly.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-01-11 09:29:12 +03:00
Nadav Har'El
3fbbad7d60 build performance: speed up inclusion of <gm/inet_address.hh>
The header file <gm/inet_address.hh> is included, directly or
indirectly, from 291 source files in Scylla. It is hard to reduce this
number because Scylla relies heavily on IP addresses as keys to
different things. So it is important that this header file be fast to
include. Unfortunately it wasn't... ClangBuildAnalyzer measurements
showed that each inclusion of this header file added a whopping 2 seconds
(in dev build mode) to the build. A total of 600 CPU seconds - 10 CPU
minutes - were spent just on this header file. It was actually worse
because the build also spent additional time on template instantiation
(more on this below).

So in this patch we:

1. Remove some unnecessary stuff from gms/inet_address.hh, and avoid
   including it in one place that doesn't need it. This is just
   cosmetic, and doesn't significantly speed up the build.

2. Move the to_sstring() implementation for the .hh to .cc. This saves
   a lot of time on template instantiations - previously every source
   file instantiated this to_sstring(), which was slow (that "format"
   thing is slow).

3. Do not include <seastar/net/ip.hh> which is a huge file including
   half the world. All we need from it is the type "ipv4_address",
   so instead include just the new <seastar/net/ipv4_address.hh>.
   This change brings most of the performance improvement.
   So source files forgot to include various Seastar header files
   because the includes-everything ip.hh did it - so we need to add
   these missing includes in this patch.

After this patch, ClangBuildAnalyzer's reports that the cost of
inclusion of <gms/inet_address.hh> is down from 2 seconds to 0.326
seconds. Additionally the format<inet_address> template instantiation
291 times - about half a second each - is also gone.

All in all, this patch should reduce around 10 CPU minutes from the build.

Refs #1

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-01-04 21:07:23 +02:00
Tomasz Grabiec
7038dc7003 lsa: Fix segment leak on memory reclamation during alloc_buf
alloc_buf() calls new_buf_active() when there is no active segment to
allocate a new active segment. new_buf_active() allocates memory
(e.g. a new segment) so may cause memory reclamation, which may cause
segment compaction, which may call alloc_buf() and re-enter
new_buf_active(). The first call to new_buf_active() would then
override _buf_active and cause the segment allocated during segment
compaction to be leaked.

This then causes abort when objects from the leaked segment are freed
because the segment is expected to be present in _closed_segments, but
isn't. boost::intrusive::list::erase() will fail on assertion that the
object being erased is linked.

Introduced in b5ca0eb2a2.

Fixes #9821
Fixes #9192
Fixes #9825
Fixes #9544
Fixes #9508
Refs #9573

Message-Id: <20211229201443.119812-1-tgrabiec@scylladb.com>
2021-12-30 11:02:08 +02:00
Avi Kivity
9e74556413 Merge 'Support reverse reads in the row cache natively' from Tomasz Grabiec
This change makes row cache support reverse reads natively so that reversing wrappers are not needed when reading from cache and thus the read can be executed efficiently, with similar cost as the forward-order read.

The database is serving reverse reads from cache by default after this. Before, it was bypassing cache by default after 703aed3277.

Refs: #1413

Tests:

  - unit [dev]
  - manual query with build/dev/scylla and cache tracing on

Closes #9454

* github.com:scylladb/scylla:
  tests: row_cache: Extend test_concurrent_reads_and_eviction to run reverse queries
  row_cache: partition_snapshot_row_cursor: Print more details about the current version vector
  row_cache: Improve trace-level logging
  config: Use cache for reversed reads by default
  config: Adjust reversed_reads_auto_bypass_cache description
  row_cache: Support reverse reads natively
  mvcc: partition_snapshot: Support slicing range tombstones in reverse
  test: flat_mutation_reader_assertions: Consume expected range tombstones before end_of_partition
  row_cache: Log produced range tombstones
  test: Make produces_range_tombstone() report ck_ranges
  tests: lib: random_mutation_generator: Extract make_random_range_tombstone()
  partition_snapshot_row_cursor: Support reverse iteration
  utils: immutable-collection: Make movable
  intrusive_btree: Make default-initialized iterator cast to false
2021-12-29 16:53:25 +02:00
Avi Kivity
d40722d598 loading_cache: fix mixup of std::chrono::milliseconds and lowres_clock::duration
lowres_clock uses the two types interchangably, although they are not
defined to be the same. Fix by using only lowres_clock::duration.
2021-12-28 21:19:08 +02:00