scylladb

Author	SHA1	Message	Date
Botond Dénes	11109f4c45	mutation_reader: move mutation source into readers/	2022-03-30 15:42:51 +03:00
Botond Dénes	0b5217052d	querier: switch to v2 compactor output The change is mostly mechanical: update all compactor instances to the _v2 variant and update all call-sites, of which there are not that many. As a consequence of this patch, queries -- both single-partition and range-scans -- now do the v2->v1 conversion in the consumers, instead of in the compactor.	2022-03-11 09:24:05 +02:00
Botond Dénes	f2e2b84038	multishard_mutation_query: migrate to v2 Mostly mechanical transformation. The main difference is in the detached compaction state, from which we now get the range tombstone change, instead of the range tombstone list. The code around this is a bit awkward, will become simpler when compactor drops v1 support.	2022-02-21 12:29:24 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	85c42a5d76	querier: convert querier_cache and {data,mutation}_querier to v2 The shard_mutation_querier is left using a v1 reader in its API as the multishard query code is not ready yet. When saving this reader it is upgraded to v2 and on lookup it is downgraded to v1. This should cancel out thanks to upgrade/downgrade unwrapping.	2022-01-07 13:52:26 +02:00
Botond Dénes	aa3c943f4c	mutation_reader: remove unnecessary stable_flattened_mutations_consumer Said wrapper was conceived to make the unmovable `compact_mutation` movable, because readers wanted movable consumers. But `compact_mutation` has been movable for years now, as all its unmovable bits were moved into an `lw_shared_ptr<>` member. So drop this unnecessary wrapper and its unnecessary usages.	2022-01-07 13:52:07 +02:00
Botond Dénes	e8a918b25c	compact_mutation: make start_new_page() independent of mutation_fragment version By using partition_region instead of mutation_fragment::kind. This will make incremental migration of users to v2 easier.	2022-01-07 13:47:39 +02:00
Botond Dénes	953603199e	multishard_combining_reader: reader_lifecycle_policy: allow saving read range on fast-forward The reader_lifecycle_policy API was created around the idea of shard readers (optionally) being saved and reused on the next page. To do this, the lifecycle policy has to also be able to control the lifecycle of by-reference parameters of readers: the slice and the range. This was possible from day 1, as the readers are created through the lifecycle policy, which can intercept and replace the said parameters with copies that are created in stable storage. There was one hole in the design though: fast-forwarding, which can change the range of the read, without the lifecycle policy knowing about this. In practice this results in fast-forwarded readers being saved together with the wrong range, their range reference becoming stale. The only lifecycle implementation prone to this is the one in `multishard_mutation_query.cc`, as it is the only one actually saving readers. It will fast-forward its reader when the query happens over multiple ranges. There were no problems related to this so far because no one passes more than one range to said functions, but this is incidental. This patch solves this by adding an `update_read_range()` method to the lifecycle policy, allowing the shard reader to update the read range when being fast forwarded. To allow the shard reader to also have control over the lifecycle of this range, a shared pointer is used. This control is required because when an `evictable_reader` is the top-level reader on the shard, it can invoke `create_reader()` with an edited range after `update_read_range()`, replacing the fast-forwarded-to range with a new one, yanking it out from under the feet of the evictable reader itself. By using a shared pointer here, we can ensure the range stays alive while it is the current one.	2021-12-03 10:27:44 +02:00
Botond Dénes	42b677ef6f	querier: consume_page(): remove now unused max_size parameter	2021-09-29 12:15:48 +03:00
Botond Dénes	41facb3270	treewide: move reversing to the mutation sources Push down reversing to the mutation-sources proper, instead of doing it on the querier level. This will allow us to test reverse reads on the mutation source level. The `max_size` parameter of `consume_page()` is now unused but is not removed in this patch, it will be removed in a follow-up to reduce churn.	2021-09-29 12:15:45 +03:00
Botond Dénes	eba46e353d	querier: add tracepoint with compaction stats Add the content of the compaction stats introduced in the previous patch to the tracing data. This will help diagnose query performance related problems caused by tombstones.	2021-09-22 14:00:05 +03:00
Botond Dénes	350440b418	flat_mutation_reader: make_reversing_reader(): take ownership of the reader Makes for much simpler client code.	2021-09-09 15:42:15 +03:00
Botond Dénes	502a45ad58	treewide: switch to native reversed format for reverse reads We define the native reverse format as a reversed mutation fragment stream that is identical to one that would be emitted by a table with the same schema but with reversed clustering order. The main difference to the current format is how range tombstones are handled: instead of looking at their start or end bound depending on the order, we always use them as-usual and the reversing reader swaps their bounds to facilitate this. This allows us to treat reversed streams completely transparently: just pass a reversed schema along with them and all the reader, compacting and result building code is happily ignorant about the fact that it is a reversed stream.	2021-09-09 15:42:15 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	9b0b13c450	reader_concurrency_semaphore: adjust reactivated reader timeout Update the reader's timeout where needed after unregistering inactive_read. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Botond Dénes	f37e26c73d	querier: remove now unused cache_context	2021-07-14 16:48:43 +03:00
Botond Dénes	6efb278ea3	querier_cache: insert(): close refused queriers The querier cache refuses to cache queriers that read in reverse. These queriers are also not closed, with the caller having no way to determine whether the querier it just moved into `insert()` needs a close afterwards or not, requiring a `close()` on the moved-from querier just to be sure. Avoid this by consistently closing all passed-in queriers, including those the cache refuses to save. For this, the internal `insert_querier()` method has to be made a member to be able to use the closing gate.	2021-07-14 16:48:43 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Benny Halevy	7c7569f0ad	querier_cache: implement stop Close the _closing_gate to wait on background close of dropped queries, and close all remaining queriers. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	07f34b4a32	querier_cache: lookup_querier: close the querier before dropping it Make sure to close the dropped querier before it's destroyed. The operation is moved to the background so as not to penalize the common path. A following patch will add a querier_cache::close() method that will close _closing_gate to wait on the querier close (among other things it needs to wait on :)). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	4a0abc7b9c	querier_cache: lookup_querier: define as a private method In preparation to closing the querier in the background before dropping it. With that, stats need not be passed as a parameter, but rather the _stats member can be used directly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	8b8c721431	querier: add close method Depending on the contents of the _reader variant, either close the reader or unregister the inactive reader and close it. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	2f9cf01aa7	querier_cache: futurize evict api Prepare for futurizing the lower-level inactive reads api. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	978501c336	flat_mutation_reader: partition_reversing_mutation_reader: implement no-op close We don't own _source therefore do not close it. That said, we still need to make sure that the reversing reader itself is closed to calm down the check when it's destroyed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:16:10 +03:00
Avi Kivity	913d970c64	Merge "Unify inactive readers" from Botond " Currently inactive readers are stored in two different places: * reader concurrency semaphore * querier cache With the latter registering its inactive readers with the former. This is an unnecessarily complex (and possibly surprising) setup that we want to move away from. This series solves this by moving the responsibility for storing inactive reads solely to the reader concurrency semaphore, including all supported eviction policies. The querier cache is now only responsible for indexing queriers and maintaining relevant stats. This makes the ownership of the inactive readers much more clear, hopefully making Benny's work on introducing close() and abort() a little bit easier. Tests: unit(release, debug:v1) " * 'unify-inactive-readers/v2' of https://github.com/denesb/scylla: reader_concurrency_semaphore: store inactive readers directly querier_cache: store readers in the reader concurrency semaphore directly querier_cache: retire memory based cache eviction querier_cache: delegate expiry to the reader_concurrency_semaphore reader_concurrency_semaphore: introduce ttl for inactive reads querier_cache: use new eviction notify mechanism to maintain stats reader_concurrency_semaphore: add eviction notification facility reader_concurrency_semaphore: extract evict code into method evict()	2021-02-03 10:59:04 +02:00
Botond Dénes	cd8d10873f	querier_cache: use the reader permit for memory accounting The querier cache has a memory limit it enforces on cached queriers. For determining how much memory each querier uses, it currently uses `flat_mutation_reader::buffer_size()`. However, we now have a much more complete accounting of the memory each read consumes, in the form of the reader permit, which also happens to be handy in the queriers. So use it instead of the not very well maintained `buffer_size()`.	2020-10-06 08:22:56 +03:00
Wojciech Mitros	45215746fe	increase the maximum size of query results to 2^64 Currently, we cannot select more than 2^32 rows from a table because we are limited by types of variables containing the numbers of rows. This patch changes these types and sets new limits. The new limits take effect while selecting all rows from a table - custom limits of rows in a result stay the same (2^32-1). In classes which are being serialized and used in messaging, in order to be able to process queries originating from older nodes, the top 32 bits of new integers are optional and stay at the end of the class - if they're absent we assume they equal 0. The backward compatibility was tested by querying an older node for a paged selection, using the received paging_state with the same select statement on an upgraded node, and comparing the returned rows with the result generated for the same query by the older node, additionally checking if the paging_state returned by the upgraded node contained new fields with correct values. Also verified that the older node simply ignores the top 32 bits of the remaining rows number when handling a query with a paging_state originating from an upgraded node by generating and sending such a query to an older node and checking the paging_state in the reply (using python driver). Fixes #5101.	2020-08-03 17:32:49 +02:00
Botond Dénes	92ce39f014	query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field We want to switch from using a single limit to a dual soft/hard limit. As a first step we switch the limit field of `query_class_config` to use the recently introduced type for this. As this field has a single user at the moment -- reverse queries (and not a lot of propagation) -- we update it in this same patch to use the soft/hard limit: warn on reaching the soft limit and abort on the hard limit (the previous behaviour).	2020-07-28 18:00:29 +03:00
Botond Dénes	72b8a2d147	querier: move common stuff into querier_base The querier cache expects all querier objects it stores to have certain methods. To avoid accessing these via `std::visit()` (the querier object is stored in an `std::variant`), we move all the stuff that is common to all querier types into a base class. The querier cache now accesses the members via a reference to this common base. Additionally the variant is eliminated completely and the cache entry stores an `std::unique_ptr<querier_base>` instead. Tests: unit(dev) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200603152544.83704-1-bdenes@scylladb.com>	2020-06-03 18:45:33 +03:00
Avi Kivity	a4c44cab88	treewide: update concepts language from the Concepts TS to C++20 Seastar recently lost support for the experimental Concepts Technical Specification (TS) and gained support for C++20 concepts. Re-enable concepts in Scylla by updating our use of concepts to the C++20 standard. This change: - peels off uses of the GCC6_CONCEPT macro - removes inclusions of <seastar/gcc6-concepts.hh> - replaces function-style concepts (no longer supported) with equation-style concepts - semicolons added and removed as needed - deprecated std::is_pod replaced by recommended replacement - updates return type constraints to use concepts instead of type names (either std::same_as or std::convertible_to, with std::same_as chosen when possible) No attempt is made to improve the concepts; this is a specification update only. Message-Id: <20200531110254.2555854-1-avi@scylladb.com>	2020-06-02 09:12:21 +03:00
Botond Dénes	e678f06a5e	querier_cache: get semaphore from querier Currently the `querier_cache` is passed a semaphore during its construction and it uses this semaphore to do all the inactive reader registering/unregistering. This is inaccurate as in theory cached reads could belong to different semaphores (although currently this is not yet the case). As all queriers store a valid permit now, use this permit to obtain the semaphore the querier is associated with, and register the inactive read with this semaphore.	2020-05-28 11:34:35 +03:00
Botond Dénes	d5ebd763ff	multishard_mutation_query: pass a valid permit to shard mutation sources In preparation of a valid permit being required to be passed to all mutation sources, create a permit before creating the shard readers and pass it to the mutation source when doing so. The permit is also persisted in the `shard_mutation_querier` object when saving the reader, which is another forward looking change, to allow the querier-cache to use it to obtain the semaphore the read is actually registered with.	2020-05-28 11:34:35 +03:00
Botond Dénes	bad53c4245	querier: add reader_permit parameter and forward it to the mutation_source In preparation of a valid permit being required to be passed to all mutation sources, also add a permit to the querier object, which is then passed to the source when it is used to create a reader.	2020-05-28 11:34:35 +03:00
Botond Dénes	e778b072b1	read_command: use bool_class for is_first_page parameter The constructor of `read_command` is used both by IDL and clients in the code. However, this constructor has a parameter that is not used by IDL: `read_timestamp`. This requires that this parameter is the very last in the list and that new parameters that are used by IDL are added before it. One such new parameter was `bool is_first_page`. Adding this parameter right before the read timestamp one created a situation where the last parameter (read_timestamp) implicitly converts to the one before it (is_first_page). This means that some call sites passing `read_timestamp` were now silently converting this to `is_first_page`, effectively dropping the timestamp. This patch aims to rectify this, while also avoiding similar accidents in the future, by making `is_first_page` a `bool_class` which doesn't have any implicit conversions defined. This change does not break the ABI as `bool_class` is also sent as a `bool` on the wire. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Tests: unit(dev) Message-Id: <20200422073657.87241-1-bdenes@scylladb.com>	2020-04-22 11:01:22 +03:00
Botond Dénes	0418a74fa9	querier: consume_page(): resolve FIXME related to non-movable consumer Now that #3158 is fixed, we can move the consumer to its place after the `compaction_mutation_state::start_new_page()` call. No need to keep it as `std::unique_ptr<>`. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20200310185147.207665-1-bdenes@scylladb.com>	2020-03-24 15:28:42 +02:00
Botond Dénes	7bdeec4b00	flat_mutation_reader: make_reversing_reader(): add memory limit If the reversing requires more memory than the limit, the read is aborted. All users are updated to get a meaningful limit, from the respective table object, with the exception of tests of course.	2020-02-27 18:11:54 +02:00
Botond Dénes	091d80e8c3	flat_mutation_reader: expose reverse reader as a standalone reader Currently reverse reads just pass a flag to `flat_mutation_reader::consume()` to make the read happen in reverse. This is deceptively simple and streamlined -- while in fact behind the scenes a reversing reader is created to wrap the reader in question to reverse partitions, one-by-one. This patch makes this apparent by exposing the reversing reader via `make_reversing_reader()`. This now makes how reversing works more apparent. It also allows for more configuration to be passed to the reversing reader (in the next patches). This change is forward compatible, as in time we plan to add reversing support to the sstable layer, in which case the reversing reader will go.	2020-02-27 18:11:54 +02:00
Botond Dénes	dfc8b2fc45	treewide: replace reader_resource_tracker with reader_permit The former was never really more than a reader_permit with one additional method. Currently using it doesn't even save one from any includes. Now that readers will be using reader_permit we would have to pass down both to mutation_source. Instead get rid of reader_resource_tracker and just use reader_permit. Instead of making it a last and optional parameter that is easy to ignore, make it a first class parameter, right after schema, to signify that permits are now a prominent part of the reader API. This -- mostly mechanical -- patch essentially refactors mutation_source to ask for the reader_permit instead of reader_resource_tracker and updates all usage sites.	2020-01-28 08:13:16 +02:00
Botond Dénes	d57ab83bc8	querier_cache: add `inserted` stat Recently we have seen a case where the population stat of the cache was corrupt, either due to misaccounting or some more serious corruption. When debugging something like that it would have been useful to know how many items have been inserted to the cache. I also believe that such a counter could be useful generally as well. Refs: #4918 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20190924083429.43038-1-bdenes@scylladb.com>	2019-09-24 10:52:49 +02:00
Botond Dénes	a41e8f0bcf	query::consume_page: move away from variadic future Require the `consumer` to return 0 or 1 value in its future. Update all downstream code. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20190731140440.57295-1-bdenes@scylladb.com>	2019-07-31 18:49:47 +03:00
Botond Dénes	ab5d717052	reader_concurrency_semaphore::inactive_read_handle: fix handle semantics That is: * make it move only; * make moved-from handles null handles; * add (public) default constructor, which constructs a null handle;	2019-02-12 16:20:51 +02:00
Botond Dénes	37f0117747	reader_concurrency_semaphore: refactor eviction mechanism As we are about to add multiple sources of evictable readers, we need a more scalable solution than a single functor being passed that opaquely evicts a reader when called. Add a generic way to register and unregister evictable (inactive) readers to the semaphore. The readers are expected to be registered when they become evictable and are expected to be unregistered when they cease to be evictable. The semaphore might evict any reader that is registered to it, when it sees fit. This also solves the problem of notifying the semaphore when new readers become evictable. Previously there was no such mechanism, and the semaphore would only evict any such new readers when a new permit was requested from it.	2018-12-04 08:51:00 +02:00
Botond Dénes	eb357a385d	flat_mutation_reader: make timeout opt-out rather than opt-in Currently timeout is opt-in, that is, all methods that even have it default it to `db::no_timeout`. This means that ensuring timeout is used where it should be is completely up to the author and the reviewers of the code. As humans are notoriously prone to mistakes this has resulted in a very inconsistent usage of timeout, many clients of `flat_mutation_reader` passing the timeout only to some members and only on certain call sites. This is small wonder considering that some core operations like `operator()()` only recently received a timeout parameter and others like `peek()` didn't even have one until this patch. Both of these methods call `fill_buffer()` which potentially talks to the lower layers and is supposed to propagate the timeout. All this makes the `flat_mutation_reader`'s timeout effectively useless. To bring order to this chaos, make the timeout parameter a mandatory one on all `flat_mutation_reader` methods that need it. This ensures that humans now get a reminder from the compiler when they forget to pass the timeout. Clients can still opt-out from passing a timeout by passing `db::no_timeout` (the previous default value) but this will be now explicit and developers should think before typing it. There were surprisingly few core call sites to fix up. Where a timeout was available nearby I propagated it to be able to pass it to the reader, where I couldn't I passed `db::no_timeout`. Authors of the latter kind of code (view, streaming and repair are some of the notable examples) should maybe consider propagating down a timeout if needed. In the test code (the vast majority of the changes) I just used `db::no_timeout` everywhere. Tests: unit(release, debug) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>	2018-09-20 11:31:24 +02:00
Botond Dénes	ecb1e79bcc	querier: add shard_mutation_querier The querier to be used for saving shard readers belonging to a multishard range scan. This querier doesn't provide a `consume_page` method as it doesn't support reading from it directly. It is more of a storage to allow caching the reader and any objects it depends on.	2018-09-03 10:31:44 +03:00
Botond Dénes	07cdf766c5	querier: prepare for multi-ranges In the next patch a querier will be added that reads multiple ranges as opposed to a single range that data and mutation queriers read. To keep `querier_cache` code seamless regarding this difference change all range-matching logic to work in terms of `dht::partition_ranges_view`. This allows for a cheap and seamless way of having a single code-base for the insert/lookup logic. Code actually matching ranges is updated to be able to handle both singular and multi-ranges while maintaining backward compatibility.	2018-09-03 10:31:44 +03:00
Botond Dénes	c12008b8cb	querier: split querier into separate data and mutation querier types Instead of hiding what compaction method the querier uses (and only expose it via rejecting `can_be_used_for_page()`) make it very explicit that these are really two different queriers. This allows using different indexes for the two queriers in `querier_cache` and eliminating the possibility of picking up a querier with the wrong compaction method (read kind). This also makes it possible to add new querier type(s) that suit the multishard-query's needs without making a confusing mess of `querier` by making it a union of all querying logic. Splitting the queriers this way changes what happens when a lookup finds a querier of the wrong kind (e.g. emit_only_live::yes for an emit_only_live::no command). As opposed to dropping the found (but wrong) querier the querier will now simply not be found by the lookup. This is a result of using separate search indexes for the different mutation kinds. This change should have no practical implications. Splitting is done by making querier templated on `emit_only_live_rows`. It doesn't make sense to duplicate the entire querier as the two share 99% of the code.	2018-09-03 10:31:44 +03:00
Botond Dénes	e46251ebf6	querier: move consume_page logic into a free function In preparation of the now single querier being split into multiple more specialized ones. Make it possible for the multiple queriers to share the same implementation. Also, the code can now be reused by outside code as well, not just queriers.	2018-09-03 10:31:44 +03:00
Botond Dénes	c53f17ddb8	querier: move all matching related logic into free functions So that they can be used for multiple querier classes easily, without inheritance. The functions are not visible from the header. Also update the comments on `querier` w.r.t. the disappeared checking functions. Change the language to be more general. In practice these checks are never done by client code, instead they are done by the `querier_cache`.	2018-09-03 10:31:44 +03:00
Botond Dénes	43f464c52d	querier: inline querier::current_position() and make it public	2018-09-03 10:31:44 +03:00
Botond Dénes	86a61ded7d	querier: s/position/position_view/ Also treat it as a view, that is take it by value in functions, instead of reference.	2018-09-03 10:31:44 +03:00

69 Commits