scylladb

Author	SHA1	Message	Date
Piotr Grabowski	3f2224a47f	loading_cache: force minimum size of unprivileged This patch enforces a minimum size of the unprivileged section when performing the shrink() operation. When the cache is shrunk, we still drop entries from the unprivileged section first (as before this commit); however, if this section is already small (smaller than max_size / 2), we drop entries from the privileged section instead. For example, before this patch, if the cache could store at most 50 entries and there were 49 entries in the privileged section, then after adding 5 entries (which would go to the unprivileged section) 4 of them would get evicted and only the 5th one would stay. This caused problems with BATCH statements, where all prepared statements in the batch have to stay in the cache at the same time for the batch to execute correctly. New tests are added to check this behavior and the bookkeeping of section sizes. Fixes #10440.	2022-04-29 19:19:04 +02:00
Piotr Grabowski	06612ddf1c	loading_cache: extract dropping entries to lambdas Extract the logic of dropping an entry from the privileged/unprivileged sections into separate named local lambdas.	2022-04-29 19:19:03 +02:00
Piotr Grabowski	bebc4c8147	loading_cache: separately track size of sections This patch splits _current_size variable, which tracked the overall size of cache, to two variables: _unprivileged_section_size and _privileged_section_size. Their sum is equal to the old _current_size, but now you can get the size of each section separately. lru_entry's cache_size() is replaced with owning_section_size() which references in which counter the size of lru_entry is currently stored.	2022-04-29 19:19:03 +02:00
Piotr Grabowski	fe9b62bc99	loading_cache: fix typo in 'privileged' Fix typo from 'priviledged' to 'privileged'.	2022-04-28 17:51:26 +02:00
Tomasz Grabiec	8fa704972f	loading_cache: Make invalidation take immediate effect There are two issues with current implementation of remove/remove_if: 1) If it happens concurrently with get_ptr(), the latter may still populate the cache using value obtained from before remove() was called. remove() is used to invalidate caches, e.g. the prepared statements cache, and the expected semantic is that values calculated from before remove() should not be present in the cache after invalidation. 2) As long as there is any active pointer to the cached value (obtained by get_ptr()), the old value from before remove() will be still accessible and returned by get_ptr(). This can make remove() have no effect indefinitely if there is persistent use of the cache. One of the user-perceived effects of this bug is that some prepared statements may not get invalidated after a schema change and still use the old schema (until next invalidation). If the schema change was modifying UDT, this can cause statement execution failures. CQL coordinator will try to interpret bound values using old set of fields. If the driver uses the new schema, the coordinator will fail to process the value with the following exception: User Defined Type value contained too many fields (expected 5, got 6) The patch fixes the problem by making remove()/remove_if() erase old entries from _loading_values immediately. The predicate-based remove_if() variant has to also invalidate values which are concurrently loading to be safe. The predicate cannot be evaluated on values which are not ready. This may invalidate some values unnecessarily, but I think it's fine. Fixes #10117 Message-Id: <20220309135902.261734-1-tgrabiec@scylladb.com>	2022-03-09 16:13:07 +02:00
Pavel Emelyanov	645896335d	code: Convert is_same+result_of assertions into invocable concepts Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-02-24 19:46:10 +03:00
Avi Kivity	d1a394fd97	loading_cache: fix indentation of timestamped_val and two nested type aliases timestamped_val (and two other type aliases) are nested inside loading_cache, but indented as if they were top-level names. Adjust the indent to avoid confusion. Closes #10118	2022-02-22 12:20:36 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes we applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	d40722d598	loading_cache: fix mixup of std::chrono::milliseconds and lowres_clock::duration lowres_clock uses the two types interchangeably, although they are not defined to be the same. Fix by using only lowres_clock::duration.	2021-12-28 21:19:08 +02:00
Vlad Zolotarov	4cb245fe3c	loading_cache: account unprivileged section evictions Provide a template parameter that supplies a static callbacks object used to increment a counter of evictions from the unprivileged section. Entries being evicted from the cache while still in the unprivileged section indicates inefficient usage of the cache and should be investigated. This patch instruments the authorized_prepared_statements_cache and prepared_statements_cache objects to provide non-empty callbacks. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2021-11-29 21:45:53 -05:00
Vlad Zolotarov	1a9c6d9fd3	loading_cache: implement a variation of least frequent recently used (LFRU) eviction policy This patch implements a simple variation of LFRU eviction policy: * We define 2 dynamic cache sections whose total size should not exceed the maximum cache size. * New cache entry is always added to the "unprivileged" section. * After a cache entry is read more than SectionHitThreshold times it moves to the second cache section. * Both sections' entries obey expiration and reload rules in the same way as before this patch. * When cache entries need to be evicted due to a size restriction "unprivileged" section's least recently used entries are evicted first. Note: With a 2 sections cache it's not enough for a new entry to have the latest timestamp in order not to be evicted right after insertion: e.g. if all other entries are from the privileged section. And obviously we want to allow new cache entries to be added to a cache. Therefore we can no longer first add a new entry and then shrink the cache. Switching the order of these two operations resolves the problem. Fixes #8674 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2021-11-29 21:45:21 -05:00
Vlad Zolotarov	cbabde9622	loading_cache::timestamped::lru_entry: refactoring * Store a reference to a parent (loading_cache) object instead of holding references to separate fields. * Access loading_cache fields via accessors. * Move the LRU "touch" logic to the loading_cache. * Keep only a plain "list entry" logic in the lru_entry class. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2021-11-29 14:24:56 -05:00
Vlad Zolotarov	9125b4545e	loading_cache.hh: rearrange the code (no functional change) Hide internal classes inside the loading_cache class: * Simpler calls - no need for a tricky back-referencing to access loading_cache fields. * Cleaner interface. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2021-11-29 14:24:56 -05:00
Vlad Zolotarov	fd92718f48	loading_cache: use std::pmr::polymorphic_allocator Use std::pmr::polymorphic_allocator instead of std::allocator - the former does not require the allocated object to be defined during the template specification. As a result we won't have to have lru_entry defined before loading_cache, which in turn would allow us to rearrange classes making all classes internal to loading_cache and hence simplifying the interface. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2021-11-29 14:24:56 -05:00
Vlad Zolotarov	7bd1bcd779	loading_shared_values/loading_cache: get rid of iterators interface and return value_ptr from find(...) instead loading_shared_values/loading_cache's iterators interface is dangerous/fragile because an iterator doesn't "lock" the entry it points to, and if there is a preemption point between acquiring a non-end() iterator and dereferencing it, the corresponding cache entry may have already been evicted (for whatever reason, e.g. cache size constraints or expiration) and then dereferencing may end up in a use-after-free, and we don't have any protection against it in the value_extractor_fn today. And this is in addition to #8920. So, instead of trying to fix the iterator interface this patch kills two birds in a single shot: we are ditching the iterators interface completely and return value_ptr from find(...) instead - the same one we are returning from loading_cache::get_ptr(...) asynchronous APIs. A similar rework is done to loading_shared_values, which loading_cache is based on: we drop the iterators interface and return loading_shared_values::entry_ptr from find(...) instead. loading_cache::value_ptr already takes care of "lock"ing the returned value so that it would remain readable even if it's evicted from the cache by the time one tries to read it. And of course it also takes care of updating the last read time stamp and moving the corresponding item to the top of the MRU list. Fixes #8920 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <20210817222404.3097708-1-vladz@scylladb.com>	2021-08-22 16:49:40 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Avi Kivity	c97924b8ad	Update seastar submodule util/loading_cache.hh includes adjusted. * seastar 02ad74fa7d...eb452a22a0 (17): > core: add missing include for std::allocator_traits > exceptions: move timed_out_error and factory into its own header file > future: parallel_for_each: add disable_failure_guard for parallel_for_each_state > Merge "Improve file API noexcept correctness" from Rafael > util: Add a with_allocation_failures helper > future: Fix indentation > future: Refactor duplicated try/catch > future: Make set_to_current_exception public > future: Add noexcept to continuation related functions > core: mark timer cancellation functions as noexcept > future: Simplify future::schedule > test: add a case for overwriting exact routes > http: throw on duplicated routes to prevent memory leaks > metrics: Remove the type label > fstream: turn file_data_source_impl's memory corruption bugs into aborts > doc: update tutorial splitting script > reactor_backend: let the reactor know again if any work was done by aio backend	2020-08-04 17:54:45 +03:00
Avi Kivity	88ade3110f	treewide: replace calls to engine().some_api() with some_api() This removes the need to include reactor.hh, a source of compile time bloat. In some places, the call is qualified with seastar:: in order to resolve ambiguities with a local name. Includes are adjusted to make everything compile. We end up having 14 translation units including reactor.hh, primarily for deprecated things like reactor::at_exit(). Ref #1	2020-04-05 12:46:04 +03:00
Botond Dénes	fddd9a88dd	treewide: silence discarded future warnings for legit discards This patch silences those future discard warnings where it is clear that discarding the future was actually the intent of the original author, and they did the necessary precautions (handling errors). The patch also adds some trivial error handling (logging the error) in some places, which were lacking this, but otherwise look ok. No functional changes.	2019-08-26 18:54:44 +03:00
Vlad Zolotarov	945d26e4ee	loading_cache: make iterator work on top of lru_list iterators instead of loading_shared_values' Reloading may hold value in the underlying loading_shared_values while the corresponding cache values have already been deleted. This may create weird situations like this: <populate cache with 10 entries> cache.remove(key1); for (auto& e : cache) { std::cout << e << std::endl; } <all 10 entries are printed, including the one for "key1"> In order to avoid such situations we are going to make the loading_cache::iterator to be a transform_iterator of lru_list::iterator instead of loading_shared_values::iterator because lru_list contains entries only for cached items. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-08-30 20:56:44 -04:00
Vlad Zolotarov	1e56c7dd58	loading_cache: make size() return the size of lru_list instead of loading_shared_values reloading flow may hold the items in the underlying loading_shared_values after they have been removed (e.g. via remove(key) API) thereby loading_shared_values.size() doesn't represent the correct value for the loading_cache. lru_list.size() on the other hand - does. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-08-30 15:55:30 -04:00
Vlad Zolotarov	235520292e	utils::loading_cache: hold a shared_value_ptr to the value when we reload This allows to remove the requirement to hold the key value inside the _load callback if its value is needed in the asynchronous continuation inside the callback in the context of a reload. This also resolves the use-after-free issue when a _load() callback removes the item for a given key. See `a9b72db34d`.1528794135.git.bdenes%40scylladb.com for a discussion about this. In addition this patch makes the loading_cache more robust for any existing and potential situations when cached entries are being removed from inside the callback. This is achieved by extending the idea implemented by Duarte in the "utils/loading_cache: Avoid using invalidated iterators" by capturing timestamped_val_ptr (which is essentially a lw_shared_ptr to an intrusive set entry which holds both the key and the cached value) instead of a naked pointer. Tests {debug, release}: - Unit tests: - loading_cache_test - view_build_test - auth_test - auth_resource_test - dtest: - auth_test.py Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-07-13 11:27:58 -04:00
Vlad Zolotarov	b44ad5677a	utils::loading_cache::on_timer(): remove not needed capture of "this" Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-07-13 11:27:43 -04:00
Vlad Zolotarov	4aa0e5914b	utils::loading_cache::on_timer(): use chunked_vector for storing elements we want to reload The list of elements that needs to be reloaded may be rather large. Use chunked_vector in order to make the allocator's life easier. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-07-13 09:53:59 -04:00
Duarte Nunes	63b63b0461	utils/loading_cache: Avoid using invalidated iterators When periodically reloading the values in the loading_cache, we would iterate over the list of entries and call the load() function for those which need to be reloaded. For some concrete caches, load() can remove the entry from the LRU set, and can be executed inline from the parallel_for_each(). This means we could potentially keep iterating using an invalidated iterator. Fix this by using a temporary container to hold those entries to be reloaded. Spotted when reading the code. Also use if constexpr and fix the comment in the function containing the changes. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180712124143.13638-1-duarte@scylladb.com>	2018-07-12 13:59:09 +01:00
Botond Dénes	2e7bf9c6f9	loading_cache::reload(): obtain key before calling _load() The continuation attached to _load() needs the key of the loaded entry to check whether it was disposed during the load. However if _load() invalidates the entry the continuation's capture line will access invalid memory while trying to obtain the key. To avoid this save a copy of the key before calling _load() and pass it to both _load() and the continuation. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <b571b73076ca863690f907fbd3fb4ff54e597b28.1531393608.git.bdenes@scylladb.com>	2018-07-12 13:42:42 +01:00
Duarte Nunes	1fb3b924f4	utils/loading_cache: Remove superfluous continuation Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20180712122031.13424-1-duarte@scylladb.com>	2018-07-12 15:22:35 +03:00
Vlad Zolotarov	3114cef42c	loading_shared_values: introduce the templated find() overload This overload allows searching the elements by an arbitrary key as long as it is "hashable" to the same values as the default key and if there is a comparator for this new key. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-22 20:15:00 -04:00
Vlad Zolotarov	34620deee4	utils::loading_cache: add remove(key)/remove(iterator) methods remove(key): removes the entry with the given key if exists, otherwise does nothing. remove(iterator): removes an entry by a given iterator (returned from loading_cache::find()). Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2018-05-22 20:05:00 -04:00
Jesse Haber-Kucharsky	6f4241574c	utils/loading_cache: Include necessary dependency	2017-11-15 23:17:05 -05:00
Tomasz Grabiec	68fe1a5bee	utils/loading_cache: Fix compilation on older compilers Message-Id: <1507728312-10585-1-git-send-email-tgrabiec@scylladb.com>	2017-10-12 14:55:34 +03:00
Botond Dénes	d2b294dc06	loading_cache: prepend this-> to method calls on captured this To make gcc 6.3 happy. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <849402e20a1ffa6f603eff4fe295981a94b9ca79.1507282527.git.bdenes@scylladb.com>	2017-10-06 12:09:34 +02:00
Vlad Zolotarov	1394e781be	utils + cql3: use a functor class instead of std::function Define value_extractor_fn as a functor class instead of std::function. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1507137724-2408-2-git-send-email-vladz@scylladb.com>	2017-10-05 15:29:51 +01:00
Vlad Zolotarov	9a43398d6a	utils::loading_cache: make the size limitation more strict Ensure that the size of the cache is never bigger than the "max_size". Before this patch the size of the cache could have been indefinitely bigger than the requested value during the refresh time period which is clearly an undesirable behaviour. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	4e72a56310	utils::loading_cache: added static_asserts for checking the callbacks signatures Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	a13362e74b	utils::loading_cache: add a bunch of standard synchronous methods Add a few standard synchronous methods to the cache, e.g. find(), remove_if(), etc. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	fa2f8162a5	utils::loading_cache: add the ability to create a cache that would not reload the values Sometimes we don't want the cached values to be periodically reloaded. This patch adds the ability to control this using a ReloadEnabled template parameter. In case the reloading is not needed the "loading" function is not given to the constructor but rather to the get_ptr(key, loader) method (currently it's the only method that is used, we may add the corresponding get(key, loader) method in the future when needed). Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	a60a77dfc8	utils::loading_cache: add the ability to work with not-copy-constructible values Current get(...) interface restricts the cache to work only with copy-constructible values (it returns future<Tp>). To make it able to work with non-copyable values we need to introduce an interface that would return something like a reference to the cached value (like regular containers do). We can't return future<Tp&> since the caller would have to ensure somehow that the underlying value is still alive. The much safer and easier-to-use way would be to return a shared_ptr-like pointer to that value. "Luckily" for us, the value we actually store in the cache is already wrapped into the lw_shared_ptr and we may simply return an object that impersonates a smart_pointer<Tp> value while it keeps a "reference" to an object stored in the cache. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	c24d85f632	utils::loading_cache: add EntrySize template parameter Allow a variable entry size parameter. Provide an EntrySize functor that would return a size for a specific entry. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	6024014f92	utils::loading_cache: rework on top of utils::loading_shared_values Get rid of the "proprietary" solution for asynchronous values on-demand loading. Use utils::loading_shared_values instead. We would still need to maintain intrusive set and list for efficient shrink and invalidate operations but their entry is not going to contain the actual key and value anymore but rather a loading_shared_values::entry_ptr which is essentially a shared pointer to a key-value pair value. In general, we added another level of dereferencing in order to get the key value but since we use the bi::store_hash<true> in the hook and the bi::compare_hash<true> in the bi::unordered_set this should not translate into an additional set lookup latency. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-09-15 20:53:11 -04:00
Vlad Zolotarov	4b28ea216d	utils::loading_cache: cancel the timer after closing the gate The timer is armed inside the section guarded by the _timer_reads_gate therefore it has to be canceled after the gate is closed. Otherwise we may end up with the armed timer after stop() method has returned a ready future. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1501603059-32515-1-git-send-email-vladz@scylladb.com>	2017-08-01 17:21:44 +01:00
Vlad Zolotarov	9adabd1bc4	utils::loading_cache: add stop() method loading_cache invokes a timer that may issue asynchronous operations (queries) that would end with writing into the internal fields. We have to ensure that these operations are over before we can destroy the loading_cache object. Fixes #2624 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1501096256-10949-1-git-send-email-vladz@scylladb.com>	2017-07-26 21:28:49 +02:00
Vlad Zolotarov	76ea74f3fd	utils::loading_cache: arm the timer with a period equal to min(_expire, _update) Arm the timer with a period that is not greater than either the permissions_validity_in_ms or the permissions_update_interval_in_ms in order to ensure that we are not stuck with the values older than permissions_validity_in_ms. Fixes #2590 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-07-13 10:48:59 -04:00
Vlad Zolotarov	121e3c7b8f	utils::loading_cache: make a timer use a loading_cache_clock_type clock as a source Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-07-13 10:42:12 -04:00
Vlad Zolotarov	1ae40ee91a	utils::timestamped_val: fix the touch() comment The current comment has been written when the function has not been a timestamped_val member. Let's adjust it to the current code. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com> Message-Id: <1495555659-10881-1-git-send-email-vladz@scylladb.com>	2017-05-26 19:26:56 +03:00
Vlad Zolotarov	2d4d198fb9	utils::loading_cache: cleanup - Remove "_" at the beginning of the type names. - s/Pred/EqualPred/ Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-22 23:02:18 -04:00
Vlad Zolotarov	fd59a548c0	utils/loading_cache.hh: use intrusive list to store the lru entry Fix the shrink() O(n log n) complexity issue by constantly pushing the corresponding intrusive list entry to the head of the list every time the values are read. This will keep the list ordered by the last read time from the most recently read to the least recently read entry. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-22 23:00:18 -04:00
Vlad Zolotarov	0c4e9efce7	utils::loading_cache: implement automatic rehashing - Start the cache with 256 buckets - the minimum number of buckets. - Limit the maximal number of buckets by 1M buckets. - Keep the load factor between 0.25 and 1.0 as long as the number of buckets is between the minimum and the maximum values mentioned above. - Grow and shrink the hash every "refresh" period if needed. - Enable bi::power_2_buckets and bi::compare_hash bi::unordered_set options. - Enable bi::unordered_set_base_hook's bi::store_hash option. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-22 22:57:44 -04:00
Vlad Zolotarov	2be3596a4f	utils::loading_cache: make the underlying map to be an intrusive unordered_set Make the underlying map to be a boost::intrusive::unordered_set<timestamped_val> instead of std::unordered_set<Key, timestamped_val>. Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-22 18:45:13 -04:00
Vlad Zolotarov	6a63c87a9f	utils::loading_cache: avoid the reads storm when the key is not in the cache Use a mutex to serialize producers when the key is not present in the cache. Fixes #2262 Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>	2017-05-18 07:55:48 -04:00
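
The two-section eviction policy described in commits 1a9c6d9fd3 and 3f2224a47f can be illustrated with a small standalone sketch. This is not the code from util/loading_cache.hh: the class name two_section_cache, its methods, the std::string/int key and value types, and the assumption that every entry has size 1 are all hypothetical and chosen only for illustration. It shows new entries landing in the unprivileged section, promotion after more than hit_threshold reads, shrinking before insertion, and the max_size / 2 floor on the unprivileged section.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical sketch of a two-section LRU cache; every entry counts as size 1.
class two_section_cache {
    struct entry {
        std::string key;
        int value;
        std::size_t hits = 0;
        bool privileged = false;
    };
    using lru_list = std::list<entry>;

    std::size_t _max_size;
    std::size_t _hit_threshold;
    lru_list _unprivileged; // front = most recently used
    lru_list _privileged;   // front = most recently used
    std::unordered_map<std::string, lru_list::iterator> _index;

    // Drop the least recently used entry of the given section.
    void drop_lru(lru_list& section) {
        _index.erase(section.back().key);
        section.pop_back();
    }

    // Make room for one new entry. Unprivileged entries go first, but only
    // while that section is larger than max_size / 2; after that, privileged
    // entries are dropped instead.
    void shrink() {
        const std::size_t min_unprivileged = _max_size / 2;
        while (size() >= _max_size && _unprivileged.size() > min_unprivileged) {
            drop_lru(_unprivileged);
        }
        while (size() >= _max_size && !_privileged.empty()) {
            drop_lru(_privileged);
        }
    }

public:
    two_section_cache(std::size_t max_size, std::size_t hit_threshold)
        : _max_size(max_size), _hit_threshold(hit_threshold) {
        assert(max_size > 0);
    }

    std::size_t size() const { return _unprivileged.size() + _privileged.size(); }
    std::size_t unprivileged_size() const { return _unprivileged.size(); }
    std::size_t privileged_size() const { return _privileged.size(); }
    bool contains(const std::string& key) const { return _index.count(key) != 0; }

    // Shrink first, then insert, so a new entry is never evicted by its own insertion.
    void put(const std::string& key, int value) {
        if (auto found = _index.find(key); found != _index.end()) {
            found->second->value = value;
            return;
        }
        shrink();
        _unprivileged.push_front(entry{key, value});
        _index.emplace(key, _unprivileged.begin());
    }

    // A read bumps the entry to the front of its section, or promotes it once
    // it has been read more than hit_threshold times.
    std::optional<int> get(const std::string& key) {
        auto found = _index.find(key);
        if (found == _index.end()) {
            return std::nullopt;
        }
        auto it = found->second;
        ++it->hits;
        if (!it->privileged && it->hits > _hit_threshold) {
            it->privileged = true;
            _privileged.splice(_privileged.begin(), _unprivileged, it);
        } else {
            lru_list& section = it->privileged ? _privileged : _unprivileged;
            section.splice(section.begin(), section, it);
        }
        return it->value;
    }
};
```

With a cache of this shape almost full of privileged entries, a short burst of new keys stays resident together, because the shrink step takes room from the privileged section once the unprivileged one is at or below half of the maximum size. That is the property the BATCH scenario in 3f2224a47f depends on.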

54 Commits
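
The rehashing bounds listed in commit 0c4e9efce7 (256 buckets minimum, 1M buckets maximum, load factor kept between 0.25 and 1.0, power-of-two bucket counts) can be expressed as a small sizing function. This is only a sketch under assumptions: the name next_bucket_count is hypothetical, "1M" is taken to mean 2^20, and doubling/halving is assumed as the resize step because the commit enables bi::power_2_buckets. The actual resizing code in loading_cache may differ.

```cpp
#include <cassert>
#include <cstddef>

// Bounds taken from the commit message; 1M is assumed to mean 2^20.
constexpr std::size_t min_buckets = 256;
constexpr std::size_t max_buckets = std::size_t(1) << 20;

// Hypothetical helper: returns the bucket count to use for `entries` elements,
// starting from the current power-of-two `buckets`.
inline std::size_t next_bucket_count(std::size_t buckets, std::size_t entries) {
    assert(buckets >= min_buckets && buckets <= max_buckets);
    // Grow while the load factor would exceed 1.0.
    while (buckets < max_buckets && entries > buckets) {
        buckets *= 2;
    }
    // Shrink while the load factor would fall below 0.25.
    while (buckets > min_buckets && entries * 4 < buckets) {
        buckets /= 2;
    }
    return buckets;
}
```

In a periodic refresh step, such a function would be called with the current bucket count and entry count, and a rehash would be triggered only when the returned value differs from the current one.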