scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-22 01:20:39 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	c3c23dd1e5	multishard_mutation_query: make multi_range_reader::fill_buffer() work even after EOS if fill_buffer() is called after EOS, underlying reader will be fast forwarded to a range pointed to by an invalid iterator, so producing incorrect results. fill_buffer() is changed to return early if EOS was found, meaning that underlying reader already fast forwarded to all ranges managed by multi_range_reader. Usually, consume facilities check for EOS, before calling fill_buffer() but most reader impl check for EOS to avoid correctness issues. Let's do the same here. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211208131423.31612-1-raphaelsc@scylladb.com>	2021-12-08 15:39:11 +02:00
Botond Dénes	5380cb0102	multishard_mutation_query: don't drop data during stateful multi-range reads When multiple ranges are passed to `multishard_{mutation,data}_query()`, it wraps the multishard reader with a multi-range one. This interferes with the disassembly of the multishard reader's buffer at the end of the page, because the multi-range reader becomes the top-level reader, denying direct access to the multishard reader itself, whose buffer is then dropped. This confuses the reading logic, causing data corruption on the next page(s). A further complication is that the multi-range reader can include data from more than one range in its buffer when filling it. To solve this, a special-purpose multi-range is introduced and used instead of the generic one, which solves both these problems by guaranteeing that: * Upon calling fill_buffer(), the entire content of the underlying multishard reader is moved to that of the top-level multi-range reader. So calling `detach_buffer()` guarantees to remove all unconsumed fragments from the top-level readers. * fill_buffer() will never mix data from more than one range. It will always stop on range boundaries and will only cross if the last range was consumed entirely. With this, multi-range reads finally work with reader-saving.	2021-12-03 10:45:06 +02:00
Botond Dénes	953603199e	multishard_combining_reader: reader_lifecycle_policy: allow saving read range on fast-forward The reader_lifecycle_policy API was created around the idea of shard readers (optionally) being saved and reused on the next page. To do this, the lifecycle policy has to also be able to control the lifecycle of by-reference parameters of readers: the slice and the range. This was possible from day 1, as the readers are created through the lifecycle policy, which can intercept and replace the said parameters with copies that are created in stable storage. There was one hole in the design though: fast-forwarding, which can change the range of the read, without the lifecycle policy knowing about this. In practice this results in fast-forwarded readers being saved together with the wrong range, their range reference becoming stale. The only lifecycle implementation prone to this is the one in `multishard_mutation_query.cc`, as it is the only one actually saving readers. It will fast-forward its reader when the query happens over multiple ranges. There were no problems related to this so far because no one passes more than one range to said functions, but this is incidental. This patch solves this by adding an `update_read_range()` method to the lifecycle policy, allowing the shard reader to update the read range when being fast forwarded. To allow the shard reader to also have control over the lifecycle of this range, a shared pointer is used. This control is required because when an `evictable_reader` is the top-level reader on the shard, it can invoke `create_reader()` with an edited range after `update_read_range()`, replacing the fast-forwarded-to range with a new one, yanking it out from under the feet of the evictable reader itself. By using a shared pointer here, we can ensure the range stays alive while it is the current one.	2021-12-03 10:27:44 +02:00
Botond Dénes	3210dee4a6	multishard_mutation_query: fix reverse scans The read itself has to be done with the reversed schema (query schema) but the result building has to be done with the table schema. For data queries this doesn't matter, but replicate the distinction for consistency (and because this might change).	2021-11-23 14:22:01 +02:00
Tomasz Grabiec	cc56a971e8	database, treewide: Introduce partition_slice::is_reversed() Cleanup, reduces noise. Message-Id: <20211014093001.81479-1-tgrabiec@scylladb.com>	2021-10-14 12:39:16 +03:00
Botond Dénes	42b677ef6f	querier: consume_page(): remove now unused max_size parameter	2021-09-29 12:15:48 +03:00
Botond Dénes	41facb3270	treewide: move reversing to the mutation sources Push down reversing to the mutation-sources proper, instead of doing it on the querier level. This will allow us to test reverse reads on the mutation source level. The `max_size` parameter of `consume_page()` is now unused but is not removed in this patch, it will be removed in a follow-up to reduce churn.	2021-09-29 12:15:45 +03:00
Botond Dénes	22e216563a	multishard_mutation_query: set max result size on used permits `08042c1688` added the query max result size to the permit but only set it for single partition queries. This patch does the same for range-scans in preparation of `query::consume_page()` not propagating max size soon.	2021-09-28 17:03:57 +03:00
Botond Dénes	922295dd8e	multishard_mutation_query: add tracepoint with compaction stats Add the content of the compaction stats introduced in the previous patch to the tracing data. This will help diagnose query performance related problems caused by tombstones.	2021-09-22 14:00:24 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	9b0b13c450	reader_concurrency_semaphore: adjust reactivated reader timeout Update the reader's timeout where needed after unregistering inactive_read. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	605a1e6943	multishard_mutation_query: create_reader: validate saved reader permit Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Benny Halevy	fe479aca1d	reader_permit: add timeout member To replace the timeout parameter passed to flat_mutation_reader methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 14:29:44 +03:00
Piotr Sarna	776ab4bcb1	multishard_mutation_query: pass exceptions without throwing In order to avoid needless throwing, exceptions are passed directly wherever possible. Two mechanisms which help with that are: 1. make_exception_future<> for futures 2. co_return coroutine::exception(...) for coroutines which return future<T> (the mechanism does not work for future<> without parameters, unfortunately)	2021-07-26 17:05:14 +02:00
Botond Dénes	7bfa40a2f1	treewide: use make_tracking_only_permit() For all those reads that don't (won't or can't) pass through admission currently.	2021-07-14 17:19:02 +03:00
Botond Dénes	426b46c4ed	mutation_reader: reader_lifecycle_policy: add obtain_reader_permit() This method is both a convenience method to obtain the permit, as well as an abstraction to allow different implementations to get creative. For example, the main implementation, the one in multishard mutation query, returns the permit of the saved reader if one was successfully looked up. This ensures that on a multi-paged read the same permit is used across as many pages as possible. Much more importantly it ensures the evictable reader wrapping the actual reader both use the same permit.	2021-07-14 16:48:43 +03:00
Botond Dénes	7fcf4a63c5	multishard_mutation_query: use the passed-in permit to create new reader Ensure that when the reader has to be created anew the passed-in permit is used to create it, instead of the one left over in remote-parts, which is that of the already evicted reader. This lays the groundwork to ensure the same permit is used across all pages of a read, by a future patch which creates the wrapping reader with the existing permit.	2021-07-14 16:48:43 +03:00
Botond Dénes	28c2b54875	mutation_reader: reader_lifecycle_policy: remove convenience methods These convenience methods are not used as much anymore and they are not even really necessary as the register/unregister inactive read API got streamlined a lot to the point where all of these "convenience methods" are just one-liners, which we can just inline into their few callers without losing readability.	2021-06-16 11:29:37 +03:00
Botond Dénes	8c7447effd	mutation_reader: reader_lifecycle_policy::destroy_reader(): require to be called on native shard Currently shard_reader::close() (its caller) goes to the remote shard, copies back all fragments left there to the local shard, then calls `destroy_reader()`, which in the case of the multishard mutation query copies it all back to the native shard. This was required before because `shard_reader::stop()` (`close()`'s predecessor) couldn't wait on `smp::submit_to()`. But close can, so we can get rid of all this back-and-forth and just call `destroy_reader()` on the shard the reader lives on, just like we do with `create_reader()`.	2021-06-16 11:29:35 +03:00
Botond Dénes	4ecf061c90	reader_lifecycle_policy implementations: fix indentation Left broken from the previous patch.	2021-06-16 11:21:38 +03:00
Botond Dénes	a7e59d3e2c	mutation_reader: reader_lifecycle_policy::destroy_reader(): de-futurize reader parameter The shard reader is now able to wait on the stopped reader and pass the already stopped reader to `destroy_reader()`, so we can de-futurize the reader parameter of said method. The shard reader was already patched to pass a ready future so adjusting the call-site is trivial. The most prominent implementation, the multishard mutation query, can now also drop its `_dismantling_gate` which was put in place so it can wait on the background stopping of readers. A consequence of this move is that errors that might happen during the stopping of the reader are now handled in the shard reader, not in all lifecycle policy implementations.	2021-06-16 11:21:38 +03:00
Botond Dénes	ab8d2a04a5	multishard_mutation_query: destroy remote parts in the foreground Currently the foreign fields of the reader meta are destroyed in the background via the foreign pointer's destructor (with one exception). This makes the already complicated life-cycle of these parts and their dependencies even harder to reason about, especially in tests, where even things like semaphores live only within the test. This patch makes sure to destroy all these remote fields in the foreground in either `save_reader()` or `stop()`, ensuring that once `stop()` returns, everything is cleaned up.	2021-06-16 11:21:38 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Botond Dénes	ae366868fb	multishard_mutation_query: save_reader(): avoid round-trip for destroying rparts Force its destruction when saving the reader. Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210514140844.119362-1-bdenes@scylladb.com>	2021-05-18 10:07:13 +03:00
Benny Halevy	cd0991f28d	multishard_mutation_query: read_context::stop: properly close unregistered inactive_reads Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	93d6dcdbcf	multishard_mutation_query: read_context: stop: wait on unregistering inactive reads Currently unregister_inactive_read for other shards is moved to the background with nothing keeping the respective reader_concurrency_semaphore around. This change runs the loop in parallel_for_each so that we don't have to serially wait on all of them but rather they can run in parallel on all shards, but all are waited on via the returned future<>. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	8421e1f61e	multishard_mutation_query: read_context: close: unregister all inactive reads Currently only if the reader_meta is in the saved state we unregister its inactive_read, yet it is possible that it will hold an inactive_read also in the lookup state. To cover all cases, rather than testing the reader_state, unregister if the inactive_read_handle is engaged. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	53889ef9b0	multishard_mutation_query: read_page: close reader when done Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	afa2fe0b76	multishard_mutation_query: read_page: make compaction_state first To simplify error handling for always closing the reader in this function. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	e2a767bef7	multishard_mutation_query: page_consume_result: mark constructor noexcept As it can't throw. This is needed to simplify the following patch that will always close the reader in read_page. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Benny Halevy	2c1edb1a94	mutation_reader: reader_lifecycle_policy: return future from destroy_reader So we can wait on it from to-be-introduced shard_reader::close(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-04-25 11:35:07 +03:00
Botond Dénes	45471419d0	multishard_mutation_query: re-enable reverse queries `034cb81323` and `0f0c3be` disallowed reverse partition-range scans based on the observation that the CQL frontend disallows them, assuming that other client APIs also disallow them. As it turns out this is not true and there is at least one client API (Thrift) which does allow reverse range scans. So re-enable them. Fixes: #8211 Tests: unit(release), dtest(thrift_tests.py) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210304142249.164247-1-bdenes@scylladb.com>	2021-03-04 17:06:16 +02:00
Botond Dénes	0f0c3be63e	multishard_mutation_query: query_mutations_on_all_shards(): refuse reverse queries Refuse reverse queries just like in the new `query_data_on_all_shards()`. The reason is the same, reverse range scans are not supported on the client API level and hence they are underspecified and more importantly: not tested.	2021-03-02 07:53:53 +02:00
Botond Dénes	034cb81323	multishard_mutation_query: add query_data_on_all_shards() A data query variant of the existing `query_mutations_on_all_shards()`. This variant builds a `query::result`, instead of `reconcilable_result`. This is actually the result format coordinators want when executing range scans, the reason for using the reconcilable result for these queries is historic, and it just introduces an unnecessary intermediate format. This new method allows the storage proxy to skip this intermediate format and the associated conversion to `query::result`, just like we do for single partition queries. Reverse queries are refused because they are not supported on the client API (CQL) level anyway and hence it is unspecified how they should work and more importantly: they are not tested.	2021-03-02 07:53:53 +02:00
Botond Dénes	f19ab5cff1	multishard_mutation_query: generalize query code w.r.t. the result builder used We want to add support to building `query::result` directly and reuse the code path we use to build reconcilable result currently for it. So templatize said code path on the result builder used. Since the different result builders don't have a source level compatible interface an adaptor class is used.	2021-03-02 07:53:53 +02:00
Botond Dénes	bddb0d35d6	multishard_mutation_query: query_mutations_on_all_shards(): extract logic into new method In the next patches we are going to generalize the query logic w.r.t. the result builder used, so query_mutations_on_all_shards() will be just a facade parametrizing the actual query code with the right result builder.	2021-03-02 07:53:53 +02:00
Botond Dénes	b0b620b501	multishard_mutation_query: query_mutations_on_all_shards(): convert to coroutine In preparation to generalizing it w.r.t. the result builder used. This change will be much simpler with the coroutine code.	2021-03-02 07:53:53 +02:00
Botond Dénes	5d85615698	multishard_mutation_query: do_query_mutations(): convert to coroutine In preparation to generalizing it w.r.t. the result builder used. This change will be much simpler with the coroutine code.	2021-03-02 07:53:53 +02:00
Botond Dénes	8138bdb434	multishard_mutation_query: read_page(): convert to coroutine In preparation to generalizing it w.r.t. the result builder used. This change will be much simpler with the coroutine code.	2021-03-02 07:53:53 +02:00
Botond Dénes	29195f67f1	multishard_mutation_query: extract page reading logic into separate method The block of code moved also coincides with the scope in which the reader has to be alive, making the code more clear.	2021-03-02 07:53:53 +02:00
Benny Halevy	d565e3fb57	reader_lifecycle_policy: retire low level try_resume method The caller can now just call sem.unregister_inactive_read(irh) directly. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-02-08 20:32:40 +02:00
Botond Dénes	226088d12e	mutation_reader: reader_lifecycle_policy::stopped_reader: drop pending_next_partition flag It's not used anymore.	2021-01-22 16:18:59 +02:00
Benny Halevy	29002e3b48	flat_mutation_reader: return future from next_partition To allow it to asynchronously close underlying readers on next_partition(). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-01-13 17:35:07 +02:00
Benny Halevy	ff931c2ecc	multishard_mutation_query: read_context: save_reader: destroy reader_meta from the calling shard The reader_meta in _readers[shard] is created on shard 0 and must be destroyed on it as well. A following patch changes next_partition() to return a future<> thus it introduces a continuation that requires access to `rm`. We cannot move it down to the continuation safely, since it will be wrongly destroyed in the invoked shard, so use do_with to hold it in the scope of the calling shard until the invoked function completes. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-01-13 17:35:07 +02:00
Tomasz Grabiec	ba42e7fcc5	multishard_mutation_query: Propagate mutation_reader::forwarding flag Otherwise all readers will be created with the default forwarding::yes. This inhibits some optimizations (e.g. results in more sstable read-ahead). It will also be problematic when we introduce mutation sources which don't support forwarding::yes in the future. Message-Id: <1604065206-3034-1-git-send-email-tgrabiec@scylladb.com>	2020-11-02 15:24:36 +02:00
Botond Dénes	ff623e70b3	reader_concurrency_semaphore: name permits Require a schema and an operation name to be given to each permit when created. The schema is that of the table the read is executed against, and the operation name is some name identifying the operation the permit is part of. Ideally this should be different for each site the permit is created at, to be able to discern not only different kinds of reads, but different code paths the read took. As not all reads can be associated with one schema, the schema is allowed to be null. The name will be used for debugging purposes, both for coredump debugging and runtime logging of permit-related diagnostics.	2020-10-13 12:32:13 +03:00
Botond Dénes	307cdf1e0d	multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader() Allow the evictable reader managing the underlying reader to pass its own permit to it when creating it, making sure they share the same permit. Note that the two parts can still end up using different permits, when the underlying reader is kept alive between two pages of a paged read and thus keeps using the permit received on the previous page. Also adjust the `reader_context` in multishard_mutation_query.cc to use the passed-in permit instead of creating a new one when creating a new reader.	2020-10-12 15:56:56 +03:00
Botond Dénes	e09ab09fff	multishard_combining_reader: add permit parameter Don't create an own permit, take one as a parameter, like all other readers do, so the permit can be provided by the higher layer, making sure all parts of the logical read use the same permit.	2020-10-12 15:56:56 +03:00
Botond Dénes	256140a033	mutation_fragment: memory_usage(): remove unused schema parameter The memory usage is now maintained and updated on each change to the mutation fragment, so it needs not be recalculated on a call to `memory_usage()`, hence the schema parameter is unused and can be removed.	2020-09-28 11:27:47 +03:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00

105 Commits