scylladb

Author	SHA1	Message	Date
Tomasz Grabiec	cf034c1891	schema_mutations: Make it a monoid by defining appropriate += operator	2022-08-26 16:48:15 +02:00
Benny Halevy	add612bc52	mutation: consume_clustering_fragments: get rid of reversed_range_tombstones; Reversing the whole range_tombstone_list into reversed_range_tombstones is inefficient and can lead to reactor stalls with a large number of range tombstones. Instead, iterate over the range_tombstone_list in reverse direction and reverse each range_tombstone as we go, keeping the result in the optional cookie.reversed_rt member. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-22 19:42:52 +03:00
Benny Halevy	8f0376bba1	mutation: consume_clustering_fragments: reindent Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 16:45:20 +03:00
Benny Halevy	749371c2b0	mutation: consume_clustering_fragments: shuffle emit_rt logic around To prepare for a following patch that will get rid of the cookie.reversed_range_tombstones list. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 16:44:23 +03:00
Benny Halevy	0e21073c38	mutation: consume, consume_gently: simplify partition_start logic Concentrate the logic in a single (!cookie.partition_start_consumed) block Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:49:12 +03:00
Benny Halevy	d661b84d51	mutation: consume_clustering_fragments: pass iterators to mutation_consume_cookie ctor and set crs and rts only in the block where they are used, so we can get rid of reversed_range_tombstones. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:30:36 +03:00
Benny Halevy	f1b7a1a6f1	mutation: consume_clustering_fragments: keep the reversed schema in cookie Rather than reversing the schema on every call just keep the potentially reversed schema in cookie. Otherwise, cookie.schema was write only. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:30:36 +03:00
Benny Halevy	a230ea0019	mutation: clustering_iterators: get rid of current_rt It is currently write-only. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-17 15:30:16 +03:00
Benny Halevy	257d74bb34	schema, everywhere: define and use table_id as a strong type Define table_id as a distinct utils::tagged_uuid modeled after raft tagged_id, so it can be differentiated from other uuid-class types, in particular from table_schema_version. Fixes #11207 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 08:09:41 +03:00
Mikołaj Sielużycki	09da47d87e	mutation: Ignore dummy rows when consuming clustering fragments consume_clustering_fragments already ignores dummy rows, but does it in the wrong place. Currently they're ignored after comparing them with range tombstones. This change skips them before any useful work is done with them. Consider a simplified mutation reversal scenario (ckp is clustering key prefix, -1, 0, 1 are bound_weights): schema_ptr s = schema_builder{"ks", "cf"} .with_column("pk", bytes_type, column_kind::partition_key) .with_column("ck1", bytes_type, column_kind::clustering_key) .build(); Range tombstones: range_tombstone rt1{ckp{}, bound_kind::incl_start, ckp{1}, bound_kind::incl_end, tombstone{ts + 0, tp}}; range_tombstone rt2{ckp{1}, bound_kind::excl_start, ckp{}, bound_kind::incl_end, tombstone{ts + 1, tp}}; Input range tombstone positions: {clustered, ckp{}, before} {clustered, ckp{1}, after} Clustering rows: {clustered, ckp{2}, equal} {clustered, ckp{}, after} // dummy row During reversal, clustering rows are read backwards, and reversed range tombstone positions are read forwards (because the range tombstones are reversed and applied backwards). Position of rows is not reversed, as regular rows always have equal positions (which does not hold for dummy rows, which causes the problem in this case). The read order in the example above is: Reversed range tombstone positions: 1: {clustered, ckp{}, before} 2: {clustered, ckp{1}, before} Clustering rows read backwards: 3: {clustered, ckp{}, after} // dummy row 4: {clustered, ckp{2}, equal} Then we effectively do the merge part of merge sort, trying to put all fragments in order according to their positions from the two lists above. However, the dummy row is used in the comparison, and it compares to be gt each of the reversed range tombstone positions. Then we try to emit the clustering row, but only at that point we notice it's dummy and should be skipped. 
Subsequent row with ckp{2} is compared to the last used range tombstone position and the fragments are out of order (in reversed schema, ckp{2} should come before ckp{1}). The solution is to move the logic skipping the dummy clustering rows to the beginning of the loop, so they can be ignored before they're used.	2022-07-27 09:32:56 +02:00
David Garcia	5adb5875f1	Add redirections	2022-06-28 09:39:14 +01:00
Tomasz Grabiec	02c92d5ea2	test: mutation: Compare against compacted mutations Memtables and cache will compact eagerly, so tests should not expect readers to produce exact mutations written, only those which are equivalent after applying compaction.	2022-06-15 11:30:01 +02:00
Avi Kivity	5129280f45	Revert "Merge 'memtable, cache: Eagerly compact data with tombstones' from Tomasz Grabiec" This reverts commit `e0670f0bb5`, reversing changes made to `605ee74c39`. It causes failures in debug mode in database_test.test_database_with_data_in_sstables_is_a_mutation_source_plain, though with low probability. Fixes #10780 Reopens #652.	2022-06-14 18:06:22 +03:00
Tomasz Grabiec	374234cf76	test: mutation: Compare against compacted mutations Memtables and cache will compact eagerly, so tests should not expect readers to produce exact mutations written, only those which are equivalent after applying compaction.	2022-06-06 19:25:40 +02:00
Michael Livshin	029508b77c	flat_mutation_reader is dead Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-05-31 23:42:34 +03:00
Benny Halevy	ca1b616092	mutation: add consume_gently Allow yielding when consuming a mutation, and use in to_data_query_result. Fixes #10038 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-05-05 13:32:25 +03:00
Benny Halevy	e88871f4ec	replica: database: move shard_of implementation to mutation layer We don't need the database to determine the shard of the mutation, only its schema. So move the implementation to the respective definitions of mutation and frozen_mutation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10430	2022-04-27 14:40:24 +03:00
Botond Dénes	fcda35d08e	mutation: migrate consume() to v2 The underlying mutation format is still v1, so consume() ends up doing an online conversion. This allows converting all downstream code to v2, leaving the conversion close to the code that is yet to be migrated to v2 native: the mutation itself.	2022-02-21 12:27:55 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	e8ca07abed	mutation: consume(): make it pausable/resumable To avoid stalls or overconsumption for consumers which have a limit on how much they want to consume in one go, the mutation::consume() is made pausable/resumable. This happens via a cookie which is now returned as part of the returned result, and which can be passed to a later consume call to resume the previous one.	2022-01-05 09:06:16 +02:00
Botond Dénes	f1391d5c27	mutation: consume(): restructure clustering iterator initialization Instead of having a branch per each value of `consume_in_reverse`, have just two ifs with two branches each for clustering rows and range tombstones respectively, to facilitate further patching.	2022-01-05 07:29:36 +02:00
Botond Dénes	1d6896c14f	mutation: introduce reverse() Which reverses the mutation as if it was created with a schema with reversed clustering order.	2021-09-09 15:42:15 +03:00
Botond Dénes	16b9d19e50	mutation: make copy constructor compatible with mutation_opt Currently `_data` is assumed to be engaged by the copy constructor which is not necessarily the case with `mutation_opt` objects (which is an `optimized_optional<mutation>`). Fix this by only copying `_data` if non-null.	2021-09-09 15:42:15 +03:00
Botond Dénes	0af5a8add0	mutation: consume(): add native reverse order The existing consume_in_reverse::yes is renamed to consume_in_reverse::legacy_half_reverse and consume_in_reverse::yes now means native reverse order. This is because we expect the legacy order to die out at one point and when that happens we can just remove that ugly third option and will be left with yes and no as before.	2021-09-09 14:18:32 +03:00
Botond Dénes	38ef80d4d2	mutation: consume(): don't include dummy rows	2021-09-09 14:18:32 +03:00
Pavel Emelyanov	d6af441eaa	range_tombstone: Move linkage into range_tombstone_entry Now it's time to remove the boost set's hook from the range_tombstone and keep it wrapped into another class if the r._tombstone's location is the range_tombstone_list. Also the added previously .tombstone() getters and the _entry alias can be removed -- all the code can work with the new class. Two places in the code that made use of without_link{} move-constructor are patched to get the range_tombstone part from the respective _entry with the same result. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Pavel Emelyanov	87ce46d1c6	mutation: Use range_tombstone_list's iterators The consume_clustering_fragments declares several auxiliary symbols to work with rows' and range-tombstones' iterators. For the range tombstones it relies on what container is declared inside the range tombstone itself. Soon the container declaration will move from range_tombstone class into a new entity and this place should be prepared for that. The better place to get iterator types from is the range-tombstones container itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 12:56:13 +03:00
Avi Kivity	acf8da2bce	Merge "flat_mutation_reader: keep timeout in permit" from Benny " This series moves the timeout parameter, that is passed to most f_m_r methods, into the reader_permit. This eliminates the need to pass the timeout around, as it's taken from the permit when needed. The permit timeout is updated in certain cases when the permit/reader is paused and retrieved later on for reuse. Following are perf_simple_query results showing ~1% reduction in insns/op and corresponding increase in tps. $ build/release/test/perf/perf_simple_query -c 1 --operations-per-shard 1000000 --task-quota-ms 10 Before: 102500.38 tps ( 75.1 allocs/op, 12.1 tasks/op, 45620 insns/op) After: 103957.53 tps ( 75.1 allocs/op, 12.1 tasks/op, 45372 insns/op) Test: unit(dev) DTest: repair_additional_test.py:RepairAdditionalTest.repair_abort_test (release) materialized_views_test.py:TestMaterializedViews.remove_node_during_mv_insert_3_nodes_test (release) materialized_views_test.py:InterruptBuildProcess.interrupt_build_process_with_resharding_half_to_max_test (release) migration_test.py:TTLWithMigrate.big_table_with_ttls_test (release) " * tag 'reader_permit-timeout-v6' of github.com:bhalevy/scylla: flat_mutation_reader: get rid of timeout parameter reader_concurrency_semaphore: use permit timeout for admission reader_concurrency_semaphore: adjust reactivated reader timeout multishard_mutation_query: create_reader: validate saved reader permit repair: row_level: read_mutation_fragment: set reader timeout flat_mutation_reader: maybe_timed_out: use permit timeout test: sstable_datafile_test: add sstable_reader_with_timeout reader_permit: add timeout member	2021-08-25 17:51:10 +03:00
Pavel Emelyanov	b012040a76	mutation: Keep range tombstone in tree when consuming Current code std::move()-s the range tombstone into consumer thus moving the tombstone's linkage to the containing list as well. As the result the original range tombstone itself leaks as it leaves the tree and cannot be reached on .clear(). Another danger is that the iterator pointing to the tombstone becomes invalid while it's then ++-ed to advance to the next entry. The immediate fix is to keep the tombstone linked to the list while moving. fixes: #9207 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210825100834.3216-1-xemul@scylladb.com>	2021-08-25 13:25:18 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Botond Dénes	46b795b5fd	mutation: consume(): add reverse mode `mutation::consume()` is used by range scans to convert the immediate `reconcilable_result` to the final `query::result` format. When the range scan is in reverse, `mutation::consume()` has to feed the clustering fragments to the consumer in reverse order, but currently `mutation::consume()` always uses the natural order, breaking reverse range scans. This patch fixes this by adding a `consume_in_reverse` parameter to `mutation::consume()`, and consequently support for consuming clustering fragments in reverse order. Fixes: #8000 Tests: unit(release, debug), dtest(thrift_tests.py:TestMutations.test_get_range_slice) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210203081659.622424-1-bdenes@scylladb.com>	2021-02-03 11:00:47 +02:00
Botond Dénes	9c96d74b72	mutation: remove now unused query() and query_compacted()	2021-01-22 15:36:37 +02:00
Botond Dénes	d0c5f550a9	mutation: add consume() This consume method accepts a `FlattenedConsumer`, the same one that the name-sake `flat_mutation_reader::consume()` does. Indeed the main purpose of this method is to allow using the standard query result building stack with a mutation, the same way said stack is used with mutation readers currently. This will allow us to replace the parallel query result building code that currently exists in the `mutation::query()` and friends, with the standard one.	2021-01-22 15:27:48 +02:00
Wojciech Mitros	45215746fe	increase the maximum size of query results to 2^64 Currently, we cannot select more than 2^32 rows from a table because we are limited by types of variables containing the numbers of rows. This patch changes these types and sets new limits. The new limits take effect while selecting all rows from a table - custom limits of rows in a result stay the same (2^32-1). In classes which are being serialized and used in messaging, in order to be able to process queries originating from older nodes, the top 32 bits of new integers are optional and stay at the end of the class - if they're absent we assume they equal 0. The backward compatibility was tested by querying an older node for a paged selection, using the received paging_state with the same select statement on an upgraded node, and comparing the returned rows with the result generated for the same query by the older node, additionally checking if the paging_state returned by the upgraded node contained new fields with correct values. Also verified if the older node simply ignores the top 32 bits of the remaining rows number when handling a query with a paging_state originating from an upgraded node by generating and sending such a query to an older node and checking the paging_state in the reply(using python driver). Fixes #5101.	2020-08-03 17:32:49 +02:00
Botond Dénes	6660a5df51	result_memory_accounter: remove default constructor If somebody wants to bypass proper memory accounting they should at the very least be forced to consider if that is indeed wise and think a second about the limit they want to apply.	2020-07-28 18:00:29 +03:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace schema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Konstantin Osipov	e8c13efb41	lwt: move mutation hashers to mutation.hh Prepare mutation hashers for reuse in CAS implementation. Message-Id: <20190930202409.40561-2-kostja@scylladb.com>	2019-10-01 19:49:31 +02:00
Botond Dénes	eb357a385d	flat_mutation_reader: make timeout opt-out rather than opt-in Currently timeout is opt-in, that is, all methods that even have it default it to `db::no_timeout`. This means that ensuring timeout is used where it should be is completely up to the author and the reviewers of the code. As humans are notoriously prone to mistakes this has resulted in a very inconsistent usage of timeout, many clients of `flat_mutation_reader` passing the timeout only to some members and only on certain call sites. This is small wonder considering that some core operations like `operator()()` only recently received a timeout parameter and others like `peek()` didn't even have one until this patch. Both of these methods call `fill_buffer()` which potentially talks to the lower layers and is supposed to propagate the timeout. All this makes the `flat_mutation_reader`'s timeout effectively useless. To make order in this chaos make the timeout parameter a mandatory one on all `flat_mutation_reader` methods that need it. This ensures that humans now get a reminder from the compiler when they forget to pass the timeout. Clients can still opt-out from passing a timeout by passing `db::no_timeout` (the previous default value) but this will be now explicit and developers should think before typing it. There were surprisingly few core call sites to fix up. Where a timeout was available nearby I propagated it to be able to pass it to the reader, where I couldn't I passed `db::no_timeout`. Authors of the latter kind of code (view, streaming and repair are some of the notable examples) should maybe consider propagating down a timeout if needed. In the test code (the vast majority of the changes) I just used `db::no_timeout` everywhere. Tests: unit(release, debug) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>	2018-09-20 11:31:24 +02:00
Duarte Nunes	12507fb9ce	keys: Replace feed_hash() member function with appending_hash Replace the feed_hash() member function of partition_key and clustering_key_prefix with the specialization of appending_hash, so that we can use the general feed_hash() function. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 00:22:50 +00:00
Duarte Nunes	6b4b429883	query-result: Introduce class result_options Introduce class result_options to carry result options through the request pipeline, which at this point mean the result type and the digest algorithm. This class allows us to encapsulate the concrete digest algorithm to use. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 00:22:50 +00:00
Piotr Jastrzebski	96c97ad1db	Rename streamed_mutation* files to mutation_fragment* Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
Piotr Jastrzebski	d9cbb9fedc	Delete unused mutation_from_streamed_mutation(streamed_mutation_opt) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
Piotr Jastrzebski	759271f866	Delete unused mutation_from_streamed_mutation(streamed_mutation&) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
José Guilherme Vanz	380bc0aa0d	Swap arguments order of mutation constructor Swap arguments in the mutation constructor keeping the same standard from the constructor variants. Refs #3084 Signed-off-by: José Guilherme Vanz <guilherme.sft@gmail.com> Message-Id: <20180120000154.3823-1-guilherme.sft@gmail.com>	2018-01-21 12:58:42 +02:00
Duarte Nunes	1374f898b9	Merge seastar upstream Class optimized_optional was moved into seastar, and its usage simplified so move_and_disengage() is replaced in favour of std::exchange(_, { }). * seastar adaca37...b0f5591 (9): > Merge "core: Introduce cancellation mechanism" from Duarte > Fix Seastar build that no longer builds with --enable-dpdk after the recent commit fd87ea2 > noncopyable_function: support function objects whose move constructors throw > Adding new hardware options to new config format, using new config format for dpdk device > Fix check for Boost version during pre-build configuration. > variant_utils: add variant_visitor constructor for C++17 mode > Merge "Allows json object to be stream to an" from Amnon > Merge 'Default to C++17' from Avi > Add const version of subscript operator to circular_buffer Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20171228112126.18142-1-duarte@scylladb.com>	2017-12-28 13:24:18 +02:00
Piotr Jastrzebski	570703a169	read_mutation_from_flat_mutation_reader: don't take schema_ptr Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-21 11:47:07 +01:00
Piotr Jastrzebski	4b58a05053	Introduce read_mutation_from_flat_mutation_reader This helper method reads a single mutation from a flat_mutation_reader. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-11-08 14:26:10 +01:00
Tomasz Grabiec	749f5770df	mutation: Introduce apply(mutation_fragment)	2017-11-02 12:16:17 +01:00
Tomasz Grabiec	1d5d5e26a2	mutation: Introduce sliced()	2017-06-24 18:06:11 +02:00

96 Commits
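
Note on commit e8ca07abed (pausable/resumable consume()): the cookie idea described in that message can be illustrated with a minimal, self-contained sketch. This is not the ScyllaDB implementation. The names `consume_cookie`, `consume_result`, `consume` and `consume_in_batches` below are hypothetical, and plain integers stand in for mutation fragments. The only assumption carried over from the commit message is that a consume call returns a cookie which a later call accepts in order to resume where the previous one stopped.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the cookie: remembers how far a previous call got.
struct consume_cookie {
    std::size_t next = 0;   // index of the first fragment not yet consumed
};

struct consume_result {
    bool finished;          // true once every fragment has been delivered
    consume_cookie cookie;  // pass to a later call to resume
};

// Feeds fragments to `consumer`, starting at cookie.next. The consumer
// returns true to continue, or false to pause after the fragment it just got.
inline consume_result consume(const std::vector<int>& fragments,
                              const std::function<bool(int)>& consumer,
                              consume_cookie cookie = {}) {
    while (cookie.next < fragments.size()) {
        const int fragment = fragments[cookie.next++];
        if (!consumer(fragment)) {
            // Paused: report whether the pause happened on the last fragment.
            return {cookie.next == fragments.size(), cookie};
        }
    }
    return {true, cookie};
}

// Example driver: takes at most `limit` fragments per call (at least one),
// resuming with the returned cookie. Returns the number of calls needed.
inline std::size_t consume_in_batches(const std::vector<int>& fragments,
                                      std::size_t limit,
                                      std::vector<int>& out) {
    std::size_t calls = 0;
    consume_cookie cookie;
    while (true) {
        std::size_t taken = 0;
        const consume_result res = consume(fragments, [&](int f) {
            out.push_back(f);
            return ++taken < limit;  // pause once the batch is full
        }, cookie);
        ++calls;
        if (res.finished) {
            return calls;
        }
        cookie = res.cookie;
    }
}
```

In this sketch, a caller that bounds its work per call just keeps the cookie between calls. Between two calls it is free to yield or do other work, which is the stall-avoidance motivation given in the commit message.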