This patch enforces a minimum size for the unprivileged section when
performing a shrink() operation.
When the cache is shrunk, we still drop entries from the unprivileged
section first (as before this commit). However, if this section is
already small (smaller than max_size / 2), we drop entries from the
privileged section instead.
Before this patch, for example, if the cache could store at most 50
entries and there were 49 entries in the privileged section, then after
adding 5 entries (which go to the unprivileged section) 4 of them would
get evicted and only the 5th one would stay. This caused problems with
BATCH statements, where all prepared statements in the batch have to
stay in the cache at the same time for the batch to execute correctly.
New tests are added to check this behavior and bookkeeping of section
sizes.
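
A minimal sketch of the shrink rule described above, assuming a simplified
model in which every entry has size 1 and is identified by an int. The names
sectioned_lru, shrink(incoming) and add() are hypothetical and are not taken
from the patch; only the max_size / 2 rule comes from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical model: two LRU lists, front = most recently used.
struct sectioned_lru {
    std::size_t max_size;
    std::list<int> unprivileged;
    std::list<int> privileged;

    std::size_t size() const { return unprivileged.size() + privileged.size(); }

    // Evict until `incoming` more entries fit into max_size.
    void shrink(std::size_t incoming) {
        while (size() + incoming > max_size) {
            // The unprivileged section is protected while it is "small".
            const bool protect_unprivileged = unprivileged.size() < max_size / 2;
            if (!privileged.empty() && (protect_unprivileged || unprivileged.empty())) {
                privileged.pop_back();
            } else if (!unprivileged.empty()) {
                unprivileged.pop_back();
            } else {
                break; // nothing left to evict
            }
        }
    }

    // New entries always start in the unprivileged section.
    void add(int id) {
        assert(max_size > 0);
        shrink(1);
        unprivileged.push_front(id);
    }
};
```

With max_size = 50 and 49 privileged entries, adding 5 new entries in this
model leaves all 5 in the unprivileged section and trims the privileged
section to 45.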
Fixes #10440.
This patch splits the _current_size variable, which tracked the overall
size of the cache, into two variables: _unprivileged_section_size and
_privileged_section_size.
Their sum is equal to the old _current_size, but now you can get the
size of each section separately.
lru_entry's cache_size() is replaced with owning_section_size(), which
refers to the counter in which the size of the lru_entry is currently
stored.
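
A rough illustration of the bookkeeping, assuming a standalone model rather
than the real lru_entry. The section_sizes struct, the entry class and
promote() are hypothetical; only the idea of owning_section_size() pointing at
one of two counters comes from the description.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two counters kept by the cache.
struct section_sizes {
    std::size_t unprivileged = 0;
    std::size_t privileged = 0;
    std::size_t total() const { return unprivileged + privileged; }
};

class entry {
    section_sizes& _sizes;
    std::size_t _size;
    bool _privileged = false;
public:
    entry(section_sizes& sizes, std::size_t size) : _sizes(sizes), _size(size) {
        owning_section_size() += _size;
    }
    ~entry() {
        assert(owning_section_size() >= _size);
        owning_section_size() -= _size;
    }
    entry(const entry&) = delete;
    entry& operator=(const entry&) = delete;

    // The counter that currently accounts for this entry.
    std::size_t& owning_section_size() {
        return _privileged ? _sizes.privileged : _sizes.unprivileged;
    }

    // Move the entry's size from one counter to the other.
    void promote() {
        if (_privileged) {
            return;
        }
        owning_section_size() -= _size;
        _privileged = true;
        owning_section_size() += _size;
    }
};
```

The sum of the two counters behaves like the old single counter, while each
section can be inspected on its own.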
There are two issues with the current implementation of remove/remove_if:
1) If it happens concurrently with get_ptr(), the latter may still
populate the cache using a value obtained before remove() was
called. remove() is used to invalidate caches, e.g. the prepared
statements cache, and the expected semantics are that values
calculated before remove() should not be present in the cache
after invalidation.
2) As long as there is any active pointer to the cached value
(obtained by get_ptr()), the old value from before remove() will
still be accessible and returned by get_ptr(). This can make remove()
have no effect indefinitely if there is persistent use of the cache.
One of the user-perceived effects of this bug is that some prepared
statements may not get invalidated after a schema change and still use
the old schema (until the next invalidation). If the schema change
modified a UDT, this can cause statement execution failures. The CQL
coordinator will try to interpret bound values using the old set of
fields. If the driver uses the new schema, the coordinator will fail
to process the value with the following exception:
User Defined Type value contained too many fields (expected 5, got 6)
The patch fixes the problem by making remove()/remove_if() erase old
entries from _loading_values immediately.
The predicate-based remove_if() variant also has to invalidate values
which are concurrently loading to be safe. The predicate cannot be
evaluated on values which are not ready. This may invalidate some
values unnecessarily, but I think it's fine.
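
A simplified, synchronous model of issue 2 and its fix, assuming plain C++
with no futures. The class shared_values_model and the use of std::weak_ptr
are assumptions made for illustration; only the idea of erasing the entry
from _loading_values in remove() comes from the description.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

class shared_values_model {
    // An entry stays reachable while any caller still holds a pointer to it.
    std::unordered_map<std::string, std::weak_ptr<int>> _loading_values;
public:
    std::shared_ptr<int> get_ptr(const std::string& key, const std::function<int()>& load) {
        assert(static_cast<bool>(load));
        auto it = _loading_values.find(key);
        if (it != _loading_values.end()) {
            if (auto live = it->second.lock()) {
                return live; // share the value another caller keeps alive
            }
        }
        auto fresh = std::make_shared<int>(load());
        _loading_values[key] = fresh;
        return fresh;
    }

    // Forget the entry right away, even if old pointers are still held.
    void remove(const std::string& key) {
        _loading_values.erase(key);
    }
};
```

In this model a holder of the old pointer keeps reading the old value, but
any get_ptr() issued after remove() loads a fresh one.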
Fixes #10117
Message-Id: <20220309135902.261734-1-tgrabiec@scylladb.com>
timestamped_val (and two other type aliases) are nested inside loading_cache,
but indented as if they were top-level names. Adjust the indent to
avoid confusion.
Closes #10118
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except to
licenses/README.md.
Closes #9937
Add a template parameter that supplies a static callbacks object used to
increment a counter of evictions from the unprivileged section.
Entries being evicted from the cache while still in the unprivileged
section indicate inefficient usage of the cache and should be
investigated.
This patch instruments the authorized_prepared_statements_cache and
prepared_statements_cache objects to provide non-empty callbacks.
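
A small sketch of the static-callbacks idea, assuming a toy bounded list in
place of the real cache. The names null_stats, counting_stats,
inc_unprivileged_evictions() and bounded_section are hypothetical and chosen
for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>

// Default callbacks: do nothing.
struct null_stats {
    static void inc_unprivileged_evictions() noexcept {}
};

// Hypothetical user-supplied callbacks that feed a counter.
struct counting_stats {
    static inline std::uint64_t evictions = 0;
    static void inc_unprivileged_evictions() noexcept { ++evictions; }
};

template <typename Stats = null_stats>
class bounded_section {
    std::size_t _max;
    std::list<int> _lru; // front = most recently used
public:
    explicit bounded_section(std::size_t max) : _max(max) {
        assert(max > 0);
    }

    void add(int id) {
        while (_lru.size() >= _max) {
            _lru.pop_back();
            Stats::inc_unprivileged_evictions(); // report every eviction
        }
        _lru.push_front(id);
    }

    std::size_t size() const { return _lru.size(); }
};
```

Because the callbacks are static members of a template argument, a cache
that uses the default no-op type pays nothing for the instrumentation.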
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
This patch implements a simple variation of LFRU eviction policy:
* We define 2 dynamic cache sections whose total size should not exceed the maximum cache size.
* A new cache entry is always added to the "unprivileged" section.
* After a cache entry is read more than SectionHitThreshold times it moves to the second ("privileged") cache section.
* Both sections' entries obey expiration and reload rules in the same way as before this patch.
* When cache entries need to be evicted due to a size restriction, the "unprivileged" section's
least recently used entries are evicted first.
Note:
With a 2-section cache it's not enough for a new entry to have the latest timestamp
in order not to be evicted right after insertion: e.g. if all other entries
are from the privileged section.
And obviously we want to allow new cache entries to be added to a cache.
Therefore we can no longer first add a new entry and then shrink the cache.
Switching the order of these two operations resolves the problem.
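
The policy above can be sketched roughly as follows, assuming int keys,
entries of size 1 and no expiration or reloading. The class lfru_model and
its member names are hypothetical; SectionHitThreshold and the
shrink-before-insert order come from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

template <std::size_t SectionHitThreshold>
class lfru_model {
    struct entry { int key; std::size_t hits; bool privileged; };
    using lru_list = std::list<entry>;

    std::size_t _max_size;
    lru_list _unprivileged; // front = most recently used
    lru_list _privileged;   // front = most recently used
    std::unordered_map<int, typename lru_list::iterator> _index;

    // Make room for one new entry, unprivileged victims first.
    void shrink_for_one_more() {
        while (_index.size() + 1 > _max_size) {
            lru_list& victims = _unprivileged.empty() ? _privileged : _unprivileged;
            if (victims.empty()) {
                return;
            }
            _index.erase(victims.back().key);
            victims.pop_back();
        }
    }
public:
    explicit lfru_model(std::size_t max_size) : _max_size(max_size) {
        assert(max_size > 0);
    }

    void add(int key) {
        if (_index.count(key) != 0) {
            return;
        }
        shrink_for_one_more(); // shrink first, then insert
        _unprivileged.push_front(entry{key, 0, false});
        _index[key] = _unprivileged.begin();
    }

    bool read(int key) {
        auto found = _index.find(key);
        if (found == _index.end()) {
            return false;
        }
        auto it = found->second;
        lru_list& from = it->privileged ? _privileged : _unprivileged;
        if (!it->privileged && ++it->hits > SectionHitThreshold) {
            it->privileged = true; // read often enough: promote
        }
        lru_list& to = it->privileged ? _privileged : _unprivileged;
        to.splice(to.begin(), from, it); // O(1); the iterator stays valid
        return true;
    }

    bool contains(int key) const { return _index.count(key) != 0; }
    std::size_t privileged_size() const { return _privileged.size(); }
    std::size_t unprivileged_size() const { return _unprivileged.size(); }
};
```

In this model, adding to a full cache whose entries are all privileged
evicts a privileged entry before the insertion, so the new entry survives.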
Fixes #8674
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
* Store a reference to a parent (loading_cache) object instead of holding
references to separate fields.
* Access loading_cache fields via accessors.
* Move the LRU "touch" logic to the loading_cache.
* Keep only a plain "list entry" logic in the lru_entry class.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Hide internal classes inside the loading_cache class:
* Simpler calls - no need for a tricky back-referencing to access loading_cache fields.
* Cleaner interface.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Use std::pmr::polymorphic_allocator instead of
std::allocator - the former does not require the
allocated type to be defined when the template is specialized.
As a result we won't have to have lru_entry defined
before loading_cache, which in turn would allow us
to rearrange classes, making all classes internal to
loading_cache and hence simplifying the interface.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
The iterator interface of loading_shared_values/loading_cache is dangerous
and fragile because an iterator doesn't "lock" the entry it points to. If
there is a preemption point between acquiring a non-end() iterator and
dereferencing it, the corresponding cache entry may have already been
evicted (for whatever reason, e.g. cache size constraints or expiration).
Dereferencing may then end up in a use-after-free, and we don't have any
protection against it in the value_extractor_fn today.
And this is in addition to #8920.
So, instead of trying to fix the iterator interface, this patch kills two
birds with one stone: we are ditching the iterator interface
completely and returning value_ptr from find(...) instead - the same one we
are returning from the loading_cache::get_ptr(...) asynchronous APIs.
A similar rework is done to the loading_shared_values that loading_cache is
based on: we drop the iterator interface and return
loading_shared_values::entry_ptr from find(...) instead.
loading_cache::value_ptr already takes care of "lock"ing the returned value so that it
remains readable even if it's evicted from the cache by the time
one tries to read it. And of course it also takes care of updating the
last read time stamp and moving the corresponding item to the top of the
MRU list.
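
A minimal sketch of why a pointer-returning find() is safer than an
iterator, assuming std::shared_ptr as a stand-in for value_ptr and a plain
std::unordered_map as the cache. The names pointer_cache, put(), evict() and
find_example() are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

class pointer_cache {
public:
    using value_ptr = std::shared_ptr<const std::string>;

    void put(int key, std::string value) {
        _entries[key] = std::make_shared<const std::string>(std::move(value));
    }

    // Returns an empty pointer when the key is not cached.
    value_ptr find(int key) const {
        auto it = _entries.find(key);
        return it == _entries.end() ? value_ptr() : it->second;
    }

    void evict(int key) { _entries.erase(key); }

private:
    std::unordered_map<int, value_ptr> _entries;
};

// Usage: the pointer stays readable after the entry is evicted.
inline void find_example() {
    pointer_cache cache;
    cache.put(1, "one");
    auto p = cache.find(1);
    cache.evict(1); // e.g. happens at a preemption point
    assert(p && *p == "one");
    assert(!cache.find(1));
}
```

An iterator obtained before the eviction would dangle at this point, whereas
the pointer keeps the value alive for as long as the caller holds it.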
Fixes #8920
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <20210817222404.3097708-1-vladz@scylladb.com>
util/loading_cache.hh includes adjusted.
* seastar 02ad74fa7d...eb452a22a0 (17):
> core: add missing include for std::allocator_traits
> exceptions: move timed_out_error and factory into its own header file
> future: parallel_for_each: add disable_failure_guard for parallel_for_each_state
> Merge "Improve file API noexcept correctness" from Rafael
> util: Add a with_allocation_failures helper
> future: Fix indentation
> future: Refactor duplicated try/catch
> future: Make set_to_current_exception public
> future: Add noexcept to continuation related functions
> core: mark timer cancellation functions as noexcept
> future: Simplify future::schedule
> test: add a case for overwriting exact routes
> http: throw on duplicated routes to prevent memory leaks
> metrics: Remove the type label
> fstream: turn file_data_source_impl's memory corruption bugs into aborts
> doc: update tutorial splitting script
> reactor_backend: let the reactor know again if any work was done by aio backend
This removes the need to include reactor.hh, a source of compile
time bloat.
In some places, the call is qualified with seastar:: in order
to resolve ambiguities with a local name.
Includes are adjusted to make everything compile. We end up
having 14 translation units including reactor.hh, primarily for
deprecated things like reactor::at_exit().
Ref #1
This patch silences those future discard warnings where it is clear that
discarding the future was actually the intent of the original author,
*and* they took the necessary precautions (handling errors). The patch
also adds some trivial error handling (logging the error) in some
places that were lacking it but otherwise look ok. No functional
changes.
Reloading may hold value in the underlying loading_shared_values while
the corresponding cache values have already been deleted.
This may create weird situations like this:
    <populate cache with 10 entries>
    cache.remove(key1);
    for (auto& e : cache) {
        std::cout << e << std::endl;
    }
    <all 10 entries are printed, including the one for "key1">
In order to avoid such situations we are going to make loading_cache::iterator
a transform_iterator over lru_list::iterator instead of loading_shared_values::iterator,
because lru_list contains entries only for cached items.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
The reloading flow may hold items in the underlying loading_shared_values
after they have been removed (e.g. via the remove(key) API). Therefore loading_shared_values.size()
doesn't represent the correct value for the loading_cache. lru_list.size(), on the other hand, does.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
This removes the requirement to hold the key value inside the
_load callback if its value is needed in the asynchronous continuation
inside the callback in the context of a reload.
This also resolves the use-after-free issue when a _load() callback removes
the item for a given key.
See a9b72db34d.1528794135.git.bdenes%40scylladb.com
for a discussion about this.
In addition this patch makes the loading_cache more robust for any existing
and potential situations when cached entries are being removed from inside the
callback. This is achieved by extending the idea implemented by Duarte in the
"utils/loading_cache: Avoid using invalidated iterators" by capturing timestamped_val_ptr
(which is essentially a lw_shared_ptr to an intrusive set entry which holds both the key
and the cached value) instead of a naked pointer.
Tests {debug, release}:
- Unit tests:
- loading_cache_test
- view_build_test
- auth_test
- auth_resource_test
- dtest:
- auth_test.py
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
The list of elements that need to be reloaded may be rather large.
Use chunked_vector in order to make the allocator's life easier.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
When periodically reloading the values in the loading_cache, we would
iterate over the list of entries and call the load() function for
those which need to be reloaded.
For some concrete caches, load() can remove the entry from the LRU set,
and can be executed inline from the parallel_for_each(). This means we
could potentially keep iterating using an invalidated iterator.
Fix this by using a temporary container to hold those entries to be
reloaded.
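
A rough sketch of the fix, assuming a std::list of plain structs instead of
the real LRU set and a synchronous load callback. The names reloadable,
reload_all() and reload_example() are hypothetical, and the sketch collects
keys rather than whatever the real patch stores in its temporary container.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

struct reloadable {
    int key;
    bool needs_reload;
};

// Calls load(key) for every flagged entry. load() may erase entries from
// `lru`, so the keys are collected first and `lru` is never iterated while
// load() runs.
inline std::size_t reload_all(std::list<reloadable>& lru,
                              const std::function<void(int)>& load) {
    std::vector<int> to_reload;
    for (const reloadable& e : lru) {
        if (e.needs_reload) {
            to_reload.push_back(e.key);
        }
    }
    for (int key : to_reload) {
        load(key);
    }
    return to_reload.size();
}

// Usage: a load() that removes the entry it was asked to reload.
inline void reload_example() {
    std::list<reloadable> lru{{1, true}, {2, false}, {3, true}};
    std::size_t reloaded = reload_all(lru, [&lru](int key) {
        lru.remove_if([key](const reloadable& e) { return e.key == key; });
    });
    assert(reloaded == 2);
    assert(lru.size() == 1 && lru.front().key == 2);
}
```

Iterating `lru` directly while calling such a load() would advance an
iterator that points at an erased node.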
Spotted when reading the code.
Also use if constexpr and fix the comment in the function containing
the changes.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180712124143.13638-1-duarte@scylladb.com>
The continuation attached to _load() needs the key of the loaded entry
to check whether it was disposed of during the load. However, if _load()
invalidates the entry, the continuation's capture list will access
invalid memory while trying to obtain the key.
To avoid this save a copy of the key before calling _load() and pass it
to both _load() and the continuation.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <b571b73076ca863690f907fbd3fb4ff54e597b28.1531393608.git.bdenes@scylladb.com>
This overload allows searching for elements by an arbitrary key, as long as it hashes
to the same values as the default key and there is a comparator for
this new key.
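
A sketch of such an overload on a hand-rolled bucket table, assuming stored
keys are std::string and the alternative key is std::string_view. The class
string_table and its members are hypothetical. The sketch relies on the
standard guarantee that std::hash<std::string_view> yields the same value as
std::hash<std::string> for equal contents.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

class string_table {
    std::vector<std::list<std::string>> _buckets;
public:
    explicit string_table(std::size_t bucket_count = 16) : _buckets(bucket_count) {
        assert(bucket_count > 0);
    }

    void insert(std::string key) {
        auto& bucket = _buckets[std::hash<std::string>()(key) % _buckets.size()];
        bucket.push_back(std::move(key));
    }

    // Lookup by any key type, given a hasher that agrees with the hash of
    // the stored key and a comparator between the two key types.
    template <typename KeyType, typename KeyHasher, typename KeyEqual>
    const std::string* find(const KeyType& key, KeyHasher hasher, KeyEqual equal) const {
        const auto& bucket = _buckets[hasher(key) % _buckets.size()];
        for (const std::string& stored : bucket) {
            if (equal(key, stored)) {
                return &stored;
            }
        }
        return nullptr;
    }
};
```

The caller can then probe with a std::string_view without building a
temporary std::string just for the lookup.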
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
remove(key): removes the entry with the given key if it exists, otherwise does nothing.
remove(iterator): removes the entry pointed to by the given iterator (returned from loading_cache::find()).
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Ensure that the size of the cache is never bigger than "max_size".
Before this patch the size of the cache could grow arbitrarily beyond
the requested value during the refresh time period, which is clearly
undesirable behaviour.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Sometimes we don't want the cached values to be periodically reloaded.
This patch adds the ability to control this using a ReloadEnabled template parameter.
In case the reloading is not needed the "loading" function is not given to the constructor
but rather to the get_ptr(key, loader) method (currently it's the only method that is used, we may add
the corresponding get(key, loader) method in the future when needed).
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
The current get(...) interface restricts the cache to work only with copy-constructible
values (it returns future<Tp>).
To make it work with non-copyable values we need to introduce an interface that would
return something like a reference to the cached value (like regular containers do).
We can't return future<Tp&> since the caller would have to ensure somehow that the underlying
value is still alive. A much safer and easier-to-use way is to return a shared_ptr-like
pointer to that value.
Luckily for us, the value we actually store in the cache is already wrapped in an lw_shared_ptr,
so we may simply return an object that presents itself as a smart_pointer<Tp> value while
it keeps a "reference" to the object stored in the cache.
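
A minimal sketch of such a pointer-like wrapper, assuming std::shared_ptr in
place of lw_shared_ptr and a std::pair as the stored entry. The class name
value_ptr_model and its entry alias are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

template <typename Key, typename Tp>
class value_ptr_model {
public:
    using entry = std::pair<const Key, Tp>;

    value_ptr_model() = default;
    explicit value_ptr_model(std::shared_ptr<entry> e) : _e(std::move(e)) {}

    explicit operator bool() const noexcept { return static_cast<bool>(_e); }

    // Behaves like a pointer to Tp, although it owns the whole entry.
    Tp& operator*() const noexcept {
        assert(_e);
        return _e->second;
    }
    Tp* operator->() const noexcept {
        assert(_e);
        return &_e->second;
    }

private:
    std::shared_ptr<entry> _e; // keeps the cached entry alive
};
```

Since only the wrapper is copied, Tp itself may be a move-only type such as
std::unique_ptr.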
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Allow a variable entry size parameter.
Provide an EntrySize functor that would return a size for a
specific entry.
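
One way to picture the EntrySize functor, assuming a toy container that only
tracks the accumulated size. The names unit_entry_size, string_bytes,
sized_store and entry_size_example() are hypothetical; only the idea of an
EntrySize template parameter comes from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Default policy: every entry counts as 1.
template <typename Tp>
struct unit_entry_size {
    std::size_t operator()(const Tp&) const noexcept { return 1; }
};

// Hypothetical policy: charge strings by their length.
struct string_bytes {
    std::size_t operator()(const std::string& s) const noexcept { return s.size(); }
};

template <typename Tp, typename EntrySize = unit_entry_size<Tp>>
class sized_store {
    std::vector<Tp> _values;
    std::size_t _current_size = 0;
public:
    void add(Tp v) {
        _current_size += EntrySize()(v); // ask the policy for this entry's size
        _values.push_back(std::move(v));
    }
    std::size_t current_size() const { return _current_size; }
};

// Usage: the same values measured by count and by bytes.
inline void entry_size_example() {
    sized_store<std::string> by_count;
    by_count.add("abc");
    by_count.add("de");
    assert(by_count.current_size() == 2);

    sized_store<std::string, string_bytes> by_bytes;
    by_bytes.add("abc");
    by_bytes.add("de");
    assert(by_bytes.current_size() == 5);
}
```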
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Get rid of the "proprietary" solution for asynchronous values on-demand loading.
Use utils::loading_shared_values instead.
We would still need to maintain intrusive set and list for efficient shrink and invalidate
operations but their entry is not going to contain the actual key and value anymore
but rather a loading_shared_values::entry_ptr which is essentially a shared pointer to a key-value
pair value.
In general, we added another level of dereferencing in order to get the key value but since
we use the bi::store_hash<true> in the hook and the bi::compare_hash<true> in the bi::unordered_set
this should not translate into an additional set lookup latency.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
The timer is armed inside the section guarded by the _timer_reads_gate
therefore it has to be canceled after the gate is closed.
Otherwise we may end up with the armed timer after stop() method has
returned a ready future.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1501603059-32515-1-git-send-email-vladz@scylladb.com>
loading_cache invokes a timer that may issue asynchronous operations
(queries) that would end with writing into the internal fields.
We have to ensure that these operations are over before we can destroy
the loading_cache object.
Fixes #2624
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Message-Id: <1501096256-10949-1-git-send-email-vladz@scylladb.com>
Arm the timer with a period that is not greater than either the permissions_validity_in_ms
or the permissions_update_interval_in_ms in order to ensure that we are not stuck with
the values older than permissions_validity_in_ms.
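
The period selection amounts to taking the smaller of the two intervals. A
tiny sketch, assuming both are already available as
std::chrono::milliseconds (the function name timer_period is hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>

// The timer must fire at least as often as the shorter of the two intervals.
inline std::chrono::milliseconds timer_period(std::chrono::milliseconds validity,
                                              std::chrono::milliseconds update_interval) {
    assert(validity.count() > 0 && update_interval.count() > 0);
    return std::min(validity, update_interval);
}
```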
Fixes #2590
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Fix the shrink() O(n log n) complexity issue by constantly pushing the corresponding intrusive
list entry to the head of the list every time the values are read.
This will keep the list ordered by the last read time from the most recently read
to the least recently read entry.
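
The "touch on read" idea can be sketched with std::list::splice, assuming a
std::list<int> in place of the intrusive list. The helper names touch() and
shrink_to() are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Move the entry to the head of the list: O(1), no allocation, and
// iterators to the moved element stay valid.
inline void touch(std::list<int>& lru, std::list<int>::iterator it) {
    assert(!lru.empty());
    lru.splice(lru.begin(), lru, it);
}

// With the list kept in read order, shrinking only pops from the tail
// and needs no sorting.
inline void shrink_to(std::list<int>& lru, std::size_t max_size) {
    while (lru.size() > max_size) {
        lru.pop_back();
    }
}
```

Each eviction is then a constant-time pop of the least recently read entry.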
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
- Start the cache with 256 buckets - the minimum number of buckets.
- Limit the maximum number of buckets to 1M.
- Keep the load factor between 0.25 and 1.0 as long as the number of buckets is
between the minimum and the maximum values mentioned above.
- Grow and shrink the hash every "refresh" period if needed.
- Enable bi::power_2_buckets and bi::compare_hash bi::unordered_set options.
- Enable bi::unordered_set_base_hook's bi::store_hash option.
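
A possible reading of the sizing rules above, assuming the bucket count
doubles or halves so that it stays a power of two. The function name
target_bucket_count and the doubling/halving steps are assumptions; the
limits (256, 1M) and the load factor range come from the list.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t min_buckets = 256;
constexpr std::size_t max_buckets = std::size_t(1) << 20; // 1M

// Picks a bucket count that keeps entries / buckets within [0.25, 1.0]
// wherever the minimum and maximum limits allow it.
inline std::size_t target_bucket_count(std::size_t buckets, std::size_t entries) {
    assert(buckets >= min_buckets && buckets <= max_buckets);
    while (entries > buckets && buckets < max_buckets) {
        buckets *= 2; // load factor above 1.0: grow
    }
    while (entries * 4 < buckets && buckets > min_buckets) {
        buckets /= 2; // load factor below 0.25: shrink
    }
    return buckets;
}
```

Calling such a function once per "refresh" period would match the periodic
grow/shrink step in the list.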
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
Make the underlying map a boost::intrusive::unordered_set<timestamped_val>
instead of std::unordered_set<Key, timestamped_val>.
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>