Currently, for users who have permissions_cache configs set to very high
values (and thus can't wait for the configured times to pass) having to restart
the service every time they make a change related to permissions or
prepared_statements cache (e.g. Adding a user and changing their permissions)
can become pretty annoying.
This patch series make permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries live updateable so that restarting the
service is not necessary anymore for these cases.
It also adds an API for flushing the cache to make it easier for users who
don't want to modify their permissions_cache config.
branch: https://github.com/igorribeiroduarte/scylla/tree/make_permissions_cache_live_updateable
CI: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1005/
dtests: https://github.com/igorribeiroduarte/scylla-dtest/tree/test_permissions_cache
* https://github.com/igorribeiroduarte/scylla/make_permissions_cache_live_updateable:
loading_cache_test: Test loading_cache::reset and loading_cache::update_config
api: Add API for resetting authorization cache
authorization_cache: Make permissions cache and authorized prepared statements cache live updateable
auth_prep_statements_cache: Make aut_prep_statements_cache accept a config struct
utils/loading_cache.hh: Add update_config method
utils/loading_cache.hh: Rename permissions_cache_config to loading_cache_config and move it to loading_cache.hh
utils/loading_cache.hh: Add reset method
Currently, for users who have permissions_cache configs set to very high
values (and thus can't wait for the configured times to pass) having to restart
the service every time they make a change related to permissions or
prepared_statements cache(e.g.: Adding a user) can become pretty annoying.
This patch make permissions_validity_in_ms, permissions_update_interval_in_ms
and permissions_cache_max_entries live updateable so that restarting the
service is not necessary anymore for these cases.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
This patch adds an update_config method in order to allow live updating the
config for permissions_cache. This method is going to be used in the next
patches after making permissions_cache config live updateable.
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
This patch renames the permissions_cache_config struct to loading_cache_config
and moves it to utils/loading_cache.hh. This will make it easier to handle
config updates to the authorization caches on the next patches
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
- Use `sstables::generation_type` in more places
- Enforce conceptual separation of `sstables::generation_type` and `int64_t`
- Fix `extremum_tracker` so that `sstables::generation_type` can be non-default-constructible
Fixes#10796.
Closes#10844
* github.com:scylladb/scylla:
sstables: make generation_type an actual separate type
sstables: use generation_type more soundly
extremum_tracker: do not require default-constructible value types
chunked_vector was headed by short comment which didn't really explain
why it exists and how and why it really differs from std::dequeue.
Moreover, it made the vague claim that it "limits" contiguous
allocations, which it really doesn't (at least not in the asymptotic
sense).
In this patch I wrote a much longer comment, which I hope will clearly
explain exactly what chunked_vector is, how it really differs in its
contiguous allocations from std::deque, and what it guarantees and
doesn't guarantee.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#10857
This patch adds a reset method which is going to be used in the next patches
for updating the loadind_cache config and also to be able to flush the cache
without having to update scylla config
Signed-off-by: Igor Ribeiro Barbosa Duarte <igor.duarte@scylladb.com>
If the len2 argument to crc32_combine() is zero, then the crc2
argument must also be zero.
fast_crc32_combine() explicitly checks for len2==0, in which case it
ignores crc2 (which is the same as if it were zero).
zlib's crc32_combine() used to have that check prior to version
1.2.12, but then lost it, making its necessary for callers to be more
careful.
Also add the len2==0 check to the dummy fast_crc32_combine()
implementation, because it delegates to zlib's.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Closes#10731
The CPU cost of iterating over the relevant ELF structures is probably
negligible (despite the amount of code involved), but there is no need
to keep the containing page mapped in RAM when it doesn't have to be.
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
Current code already assumes (correctly), that shrink() does not
throw, otherwise we risk leaking memory allocated in get_ptr():
```
ts_value_lru_entry* new_lru_entry = Alloc().template allocate_object<ts_value_lru_entry>();
// Remove the least recently used items if map is too big.
shrink();
```
Let's be explicit and mark shrink() and a few helper methods
that it uses as noexcept. Ultimately they are all noexcept anyway,
because polymorphic allocator's deallocation routines don't throw,
and neither do boost intrusive list iterators.
Closes#10565
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.
Closes#10562
primitive_consumer::read_bytes() destroys and creates a vector for every value it reads.
This happens for every cell.
We can save a bit of work by reusing the vector.
Closes#10512
* github.com:scylladb/scylla:
sstables: consumer: reuse the fragmented_temporary_buffer in read_bytes()
utils: fragmented_temporary_buffer: add release()
This series enforces a minimum size of the unprivileged section when
performing `shrink()` operation.
When the cache is shrunk, we still drop entries first from unprivileged
section (as before this commit), however, if this section is already small
(smaller than `max_size / 2`), we will drop entries from the privileged
section.
This is necessary, as before this change the unprivileged section could
be starved. For example if the cache could store at most 50 entries and
there are 49 entries in privileged section, after adding 5 entries (that would
go to unprivileged section) 4 of them would get evicted and only the 5th one
would stay. This caused problems with BATCH statements where all
prepared statements in the batch have to stay in cache at the same time
for the batch to correctly execute.
To correctly check if the unprivileged section might get too small after
dropping an entry, `_current_size` variable, which tracked the overall size
of cache, is changed to two variables: `_unprivileged_section_size` and
`_privileged_section_size`, tracking section sizes separately.
New tests are added to check this new behavior and bookkeeping of the section
sizes. A test is added, that sets up a CQL environment with a very small
prepared statement cache, reproduces issue in #10440 and stresses the cache.
Fixes#10440.
Closes#10456
* github.com:scylladb/scylla:
loading_cache_test: test prepared stmts cache
loading_cache: force minimum size of unprivileged
loading_cache: extract dropping entries to lambdas
loading_cache: separately track size of sections
loading_cache: fix typo in 'privileged'
This patch enforces a minimum size of unprivileged section when
performing shrink() operation.
When the cache is shrank, we still drop entries first from unprivileged
section (as before this commit), however if this section is already small
(smaller than max_size / 2), we will drop entries from the privileged
section.
For example if the cache could store at most 50 entries and there are 49
entries in privileged section, after adding 5 entries (that would go to
unprivileged section) 4 of them would get evicted and only the 5th one
would stay. This caused problems with BATCH statements where all
prepared statements in the batch have to stay in cache at the same time
for the batch to correctly execute.
New tests are added to check this behavior and bookkeeping of section
sizes.
Fixes#10440.
This patch splits _current_size variable, which tracked the overall size
of cache, to two variables: _unprivileged_section_size and
_privileged_section_size.
Their sum is equal to the old _current_size, but now you can get the
size of each section separately.
lru_entry's cache_size() is replaced with owning_section_size() which
references in which counter the size of lru_entry is currently stored.
Seastar is an external library from Scylla's point of view so
we should use the angle bracket #include style. Most of the source
follows this, this patch fixes a few stragglers.
Also fix cases of #include which reached out to seastar's directory
tree directly, via #include "seastar/include/sesatar/..." to
just refer to <seastar/...>.
Closes#10433
Checking a concept in a requires-expression requires an additional
requires keyword. Moreover, the constraint is incorrect (at least
all callers pass a T, not a result<T>), so remove it.
Found by gcc 12.
Fixes the case of make_room() invoked with last_chunk_capacity_deficit
but _size not in the last reserved chunk.
Found during code review, no user impact.
Fixes#10364.
Message-Id: <20220411224741.644113-1-tgrabiec@scylladb.com>
Fixes the case of make_room() invoked with last_chunk_capacity_deficit
but _size not in the last reserved chunk.
Found during code review, no known user impact.
Fixes#10363.
Message-Id: <20220411222605.641614-1-tgrabiec@scylladb.com>
"
The database::shutdown() and ::drain() methods are called inside the
invoke_on_all()s synchronizing with each other via the cross-shard
_stop_barrier.
If either shard throws in between all others may get stuck waiting for
the barrier to collect all arrivals. To fix it the throwing shard
should wake up others, resolving the wait somehow.
The fix is actually patch #4, the first and the second are the abort()
method for the barrier itself.
Fixes: #10304
tests: unit(dev), manual
"
* 'br-barrier-exception-2' of https://github.com/xemul/scylla:
database: Abort barriers on exception
database: Coroutinize close_tables
test: Add test for cross_shard_barrier::abort()
cross-shard-barrier: Add .abort() method
If reserve() allocates more than one chunk, push_back() should not
work with the last chunk. This can result in items being pushed to the
wrong chunk, breaking internal invariants.
Also, pop_back() should not work with the last chunk. This breaks when
there is more than one chunk.
Currently, the container is only used in the sstable partition index
cache.
Manifests by crashes in sstable reader which touch sstables which have
partition index pages with more than 1638 partition entries.
Introduced in 78e5b9fd85 (4.6.0)
Fixes#10290
Message-Id: <20220407174023.527059-1-tgrabiec@scylladb.com>
The method makes all the .arrive_and_wait()s in the current phase
to resolve with barrier_aborted_exception() exceptional future.
The barrier turns into a broken state and is not supposed to serve
any subsequence arrivals anyhow reasonably.
The .abort() method is re-entrable in two senses. The first is that
more than one shard can abort a barrier, which is pretty natural.
The second is that the exception-safety fuses like that imply that
if the arrive_and_wait() resolves into exception the caller will try
to abort() the barrier as well, even though the phase would be over.
This case is also "supported".
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
This patchset adds two new operations to scylla-sstable:
* validate-checksums - helps identifying whether an sstable is intact or
not, but checking the digest and the per-chunk checksums against the
data on disk.
* decompress - helps when one wants to manually examine the content of a
compressed sstable.
Refs: #497
Tests: unit(dev)
"
* 'scylla-sstable-validate-checksums-decompress/v3' of https://github.com/denesb/scylla:
tools/scylla-sstable: consume_sstables(): s/no_skips/use_crawling_reader/
tools/scylla-sstable: add decompress operation
tools/scylla-sstables: add validate-checksums operation
sstables/sstable: add validate_checksums()
sstables/sstable: add raw_stream option to data_stream()
sstables/sstable: make data_stream() and data_read() public
utils/exceptions: add maybe_rethrow_exception()
Looks like internal boost.outcome headers don't include some
of needed dependencies, so do that manually in our headers.
For some reason it worked before, but started to fail when
building on Fedora 36 setup.
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
The `result_try` and `result_futurize_try` are supposed to handle both
failed results and exceptions in a way similar to a try..catch block.
In order to catch exceptions, the metaprogramming machinery invokes the
fallible code inside a stack of try..catch blocks, each one of them
handling one exception. This is done instead of creating a single
try..catch block, as to my knowledge it is not possible to create
a try..catch block with the number of "catch" clauses depending on a
variadic template parameter pack.
Unfortunately, a "try" with multiple "catches" is not functionally
equivalent to a "try block stack". Consider the following code:
try {
try {
return execute_try_block();
} catch (const derived_exception&) {
// 1
}
} catch (const base_exception&) {
// 2
}
If `execute_try_block` throws `derived_exception` and the (1) catch
handler rethrows this exception, it will also be handled in (2), which
is not the same behavior as if the try..catch stack was "flat".
This causes wrong behavior in `result_try` and `result_futurize_try`.
The following snippet has the same, wrong behavior as the previous one:
return utils::result_try([&] {
return execute_try_block();
}, utils::result_catch<derived_exception>([&] (const auto&& ex) {
// 1
}), utils::result_catch<base_exception>([&] (const auto&& ex) {
// 2
});
This commit fixes the problem by adding a boolean flag which is set just
before a catch handler is executed. If another catch handler is
accidentally matched due to exception rethrow, the catch handler is
skipped and exception is automatically rethrown.
Tests: unit(dev, debug)
Fixes: #10211Closes#10216
There are two issues with current implementation of remove/remove_if:
1) If it happens concurrently with get_ptr(), the latter may still
populate the cache using value obtained from before remove() was
called. remove() is used to invalidate caches, e.g. the prepared
statements cache, and the expected semantic is that values
calculated from before remove() should not be present in the cache
after invalidation.
2) As long as there is any active pointer to the cached value
(obtained by get_ptr()), the old value from before remove() will be
still accessible and returned by get_ptr(). This can make remove()
have no effect indefinitely if there is persistent use of the cache.
One of the user-perceived effects of this bug is that some prepared
statements may not get invalidated after a schema change and still use
the old schema (until next invalidation). If the schema change was
modifying UDT, this can cause statement execution failures. CQL
coordinator will try to interpret bound values using old set of
fields. If the driver uses the new schema, the coordinaotr will fail
to process the value with the following exception:
User Defined Type value contained too many fields (expected 5, got 6)
The patch fixes the problem by making remove()/remove_if() erase old
entries from _loading_values immediately.
The predicate-based remove_if() variant has to also invalidate values
which are concurrently loading to be safe. The predicate cannot be
avaluated on values which are not ready. This may invalidate some
values unnecessarily, but I think it's fine.
Fixes#10117
Message-Id: <20220309135902.261734-1-tgrabiec@scylladb.com>
When compiling utils/rjson.cc on GCC, the compilation triggers the
following warning (which becomes a compilation error):
utils/rjson.cc: In function ‘seastar::future<> rjson::print(const value&, seastar::output_stream<char>&, size_t)’:
utils/rjson.cc:239:15: error: typedef ‘using Ch = char’ locally defined but not used [-Werror=unused-local-typedefs]
239 | using Ch = char;
| ^~
This warning is a false positive. 'using Ch' is actually used internally
by rapidjson::Writer. This is a known GCC bug
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61596), which has not been
fixed since 2014.
I disabled this warning only locally as other code is not affected by
this warning and no other code already disables this warning.
Note that there are some GCC compilation problems still left apart
from this one.
Closes#10158
Recently, coordinator_result was introduced as an alternative for
exceptions. It was placed in the main "exceptions/exceptions.hh" header,
which virtually every single source file in Scylla includes.
But unfortunately, it brings in some heavy header files and templates,
leading to a lot of wasted build time - ClangBuildAnalyzer measured that
we include exceptions.hh in 323 source files, taking almost two seconds
each on average.
In this patch, we split the coordinator_result feature into a separate
header file, "exceptions/coordinator_result", and only the few places
which need it include the header file. Unfortunately, some of these
few places are themselves header, so the new header file ends up being
included in 100 source files - but 100 is still much less than 323 and
perhaps we can reduce this number 100 later.
After this patch, the total Scylla object-file size is reduced by 6.5%
(the object size is a proxy for build time, which I didn't directly
measure). ClangBuildAnalyzer reports that now each of the 323 includes
of exceptions.hh only takes 80ms, coordinator_result.hh is only included
100 times, and virtually all the cost to include it comes from Boost's
result.hh (400ms per inclusion).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20220228204323.1427012-1-nyh@scylladb.com>
This series contains:
- lister: move to utils
- tidy up the clutter in the root dir
Based on Avi's feedback to `[PATCH 1/1] utils: directory_lister: close: always abort queue` that was sent to the mailing list:
- directory_lister: drop abort method
- lister: do not require get after close to fail
- test: lister_test: test_directory_lister_close simplify indentation
- cosmetic cleanup
Closes#10142
* github.com:scylladb/scylla:
test: lister_test: test_directory_lister_close simplify indentation
lister: do not require get after close to fail
directory_lister: drop abort method
lister: move to utils
"
Some templates put constraints onto the involved types with the help of
static assertions. Having them in form of concepts is much better.
tests: unit(dev)
"
* 'br-static-assert-to-concept' of https://github.com/xemul/scylla:
sstables: Remove excessive type-match assertions
mutation_reader: Sanitize invocable asserion and concept
code: Convert is_future result_of assertions into invoke_result concept
code: Convert is_same+result_of assertions into invocable concepts
code: Convert nothrow construction assertions into concepts
code: Convert is_integral assertions to concepts
Currently, the lister test expected get() to
always fail after close(), but it unexpectedly
succeeded if get() was never called before close,
as seen in https://jenkins.scylladb.com/view/master/job/scylla-master/job/next/4587/artifact/testlog/x86_64_debug/lister_test.test_directory_lister_close.4001.log
```
random-seed=1475104835
Generated 719 dir entries
Getting 565 dir entries
Closing directory_lister
Getting 0 dir entries
Closing directory_lister
test/boost/lister_test.cc(190): fatal error: in "test_directory_lister_close": exception std::exception expected but not raised
```
This change relaxes this requirement to keep
close() simple, based on Avi's feedback:
> The user should call close(), and not do it while get() is running, and
> that's it.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Based on Avi's feedback:
> We generally have a public abort() only if we depend on an external
> event (like data from a tcp socket) that we don't control. But here
> there are no such external events. So why have a public abort() at all?
If needed in the future, we can consider adding
get(abort_source&) to allow aborting get() via
an external event.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
There's nothing specific to scylla in the lister
classes, they could (and maybe should) be part of
the seastar library.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
allocated through the region
It indicates alloc-dealloc mismatch, and can cause other problems in
the systems like unable to reclaim memory. We want to catch this at
the deallocation site to be able to quickly indentify the offender.
Misbehavior of this sort can cause fake OOMs due to underflow of
_non_lsa_memory_in_use. When it underflows enough,
shard_segment_pool.total_memory() will become 0 and memory reclamation
will stop doing anything.
Refs #10056
on_evicted() is invoked in the LSA allocator context, set in the
reclaimer callback instaled by the cache_tracker. However,
cached_pages are allocated in the standard allocator context (note:
page content is allocated inside LSA via lsa_buffer). The LSA region
will happilly deallocate these, thinking that they these are large
objects which were delegated to the standard allocator. But the
_non_lsa_memory_in_use metric will underflow. When it underflows
enough, shard_segment_pool.total_memory() will become 0 and memory
reclamation will stop doing anything, leading to apparent OOM.
The fix is to switch to the standard allocator context inside
cached_page::on_evicted(). evict_range() was also given the same
treatment as a precaution, it currently is only invoked in the
standard allocator context.
Fixes#10056
This PR propagates the read coordinator logic so that read timeout and read failure exceptions are propagated without throwing on the coordinator side.
This PR is only concerned with exceptions which were originally thrown by the coordinator (in read resolvers). Exceptions propagated through RPC and RPC timeouts will still throw, although those exceptions will be caught and converted into exceptions-as-values by read resolvers.
This is a continuation of work started in #10014.
Results of `perf_simple_query --smp 1 --operations-per-shard 1000000` (read workload), compared with merge base (10880fb0a7):
```
BEFORE:
125085.13 tps ( 80.2 allocs/op, 12.2 tasks/op, 49010 insns/op)
125645.88 tps ( 80.2 allocs/op, 12.2 tasks/op, 49008 insns/op)
126148.85 tps ( 80.2 allocs/op, 12.2 tasks/op, 49005 insns/op)
126044.40 tps ( 80.2 allocs/op, 12.2 tasks/op, 49005 insns/op)
125799.75 tps ( 80.2 allocs/op, 12.2 tasks/op, 49003 insns/op)
AFTER:
127557.21 tps ( 80.2 allocs/op, 12.2 tasks/op, 49197 insns/op)
127835.98 tps ( 80.2 allocs/op, 12.2 tasks/op, 49198 insns/op)
127749.81 tps ( 80.2 allocs/op, 12.2 tasks/op, 49202 insns/op)
128941.17 tps ( 80.2 allocs/op, 12.2 tasks/op, 49192 insns/op)
129276.15 tps ( 80.2 allocs/op, 12.2 tasks/op, 49182 insns/op)
```
The PR does not introduce additional allocations on the read happy-path. The number of instructions used grows by about 200 insns/op. The increase in TPS is probably just a measurement error.
Closes#10092
* github.com:scylladb/scylla:
indexed_table_select_statement: return some exceptions as exception messages
result_combinators: add result_wrap_unpack
select_statement: return exceptions as errors in execute_without_checking_exception_message
select_statement: return exceptions without throwing in do_execute
select_statement: implement execute_without_checking_exception_message
select_statement: introduce helpers for working with failed results
query_pager: resultify relevant methods
storage_proxy: resultify (do_)query
storage_proxy: resultify query_singular
storage_proxy: propagate failed results through query_partition_key_range
storage_proxy: resultify query_partition_key_range_concurrent
storage_proxy: modify handle_read_error to also handle exception containers
abstract_read_executor: return result from execute()
abstract_read_executor: return and handle result from has_cl()
storage_proxy: resultify handling errors from read-repair
abstract_read_executor::reconcile: resultise handling of data_resolver->done()
abstract_read_executor::execute: resultify handling of data_resolver->done()
result_combinators: add result_discard_value
abstract_read_executor: resultify _result_promise
abstract_read_executor: return result from done()
abstract_read_resolver: fail promises by passing exception as value
abstract_read_resolver: resultify promises
exceptions: make it possible to return read_{timeout,failure}_exception as value
result_try: add as_inner/clone_inner to handle types
result_try: relax ConvertWithTo constraint
exception_container: switch impl to std::shared_ptr and make copyable
result_loop: add result_repeat
result_loop: add result_do_until
result_loop: add result_map_reduce
utils/result: add utilities for checking/creating rebindable results
Adds a helper combinator utils::result_wrap_unpack which, in contrast to
utils::result_wrap, uses futurize_apply instead of futurize_invoke to
call the wrapped callable.
In short, if utils::result_wrap is used to adapt code like this:
f.then([] {})
->
f_result.then(utils::result_wrap([] {}))
Then utils::result_wrap_unpack works for the following case:
f.then_unpack([] (arg1, arg2) {})
->
f_result.then(utils::result_wrap_unpack([] (arg1, arg2) {}))