scylladb

Author	SHA1	Message	Date
Benny Halevy	e88871f4ec	replica: database: move shard_of implementation to mutation layer We don't need the database to determine the shard of the mutation, only its schema. So move the implementation to the respective definitions of mutation and frozen_mutation. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10430	2022-04-27 14:40:24 +03:00
Botond Dénes	fcda35d08e	mutation: migrate consume() to v2 The underlying mutation format is still v1, so consume() ends up doing an online conversion. This allows converting all downstream code to v2, leaving the conversion close to the code that is yet to be migrated to v2 native: the mutation itself.	2022-02-21 12:27:55 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	e8ca07abed	mutation: consume(): make it pausable/resumable To avoid stalls or overconsumption for consumers which have a limit on how much they want to consume in one go, the mutation::consume() is made pausable/resumable. This happens via a cookie which is now returned as part of the returned result, and which can be passed to a later consume call to resume the previous one.	2022-01-05 09:06:16 +02:00
Botond Dénes	f1391d5c27	mutation: consume(): restructure clustering iterator initialization Instead of having a branch per each value of `consume_in_reverse`, have just two ifs with two branches each for clustering rows and range tombstones respectively, to facilitate further patching.	2022-01-05 07:29:36 +02:00
Botond Dénes	1d6896c14f	mutation: introduce reverse() Which reverses the mutation as if it was created with a schema with reversed clustering order.	2021-09-09 15:42:15 +03:00
Botond Dénes	16b9d19e50	mutation: make copy constructor compatible with mutation_opt Currently `_data` is assumed to be engaged by the copy constructor which is not necessarily the case with `mutation_opt` objects (which is an `optimized_optional<mutation>`). Fix this by only copying `_data` if non-null.	2021-09-09 15:42:15 +03:00
Botond Dénes	0af5a8add0	mutation: consume(): add native reverse order The existing consume_in_reverse::yes is renamed to consume_in_reverse::legacy_half_reverse and consume_in_reverse::yes now means native reverse order. This is because we expect the legacy order to die out at one point and when that happens we can just remove that ugly third option and will be left with yes and no as before.	2021-09-09 14:18:32 +03:00
Botond Dénes	38ef80d4d2	mutation: consume(): don't include dummy rows	2021-09-09 14:18:32 +03:00
Pavel Emelyanov	d6af441eaa	range_tombstone: Move linkage into range_tombstone_entry Now it's time to remove the boost set's hook from the range_tombstone and keep it wrapped into another class if the r._tombstone's location is the range_tombstone_list. Also the added previously .tombstone() getters and the _entry alias can be removed -- all the code can work with the new class. Two places in the code that made use of without_link{} move-constructor are patched to get the range_tombstone part from the respective _entry with the same result. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Pavel Emelyanov	87ce46d1c6	mutation: Use range_tombstone_list's iterators The consume_clustering_fragments declares several auxiliary symbols to work with rows' and range-tombstones' iterators. For the range tombstones it relies on what container is declared inside the range tombstone itself. Soon the container declaration will move from range_tombstone class into a new entity and this place should be prepared for that. The better place to get iterator types from is the range-tombstones container itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 12:56:13 +03:00
Avi Kivity	acf8da2bce	Merge "flat_mutation_reader: keep timeout in permit" from Benny " This series moves the timeout parameter, that is passed to most f_m_r methods, into the reader_permit. This eliminates the need to pass the timeout around, as it's taken from the permit when needed. The permit timeout is updated in certain cases when the permit/reader is paused and retrieved later on for reuse. Following are perf_simple_query results showing ~1% reduction in insns/op and corresponding increase in tps. $ build/release/test/perf/perf_simple_query -c 1 --operations-per-shard 1000000 --task-quota-ms 10 Before: 102500.38 tps ( 75.1 allocs/op, 12.1 tasks/op, 45620 insns/op) After: 103957.53 tps ( 75.1 allocs/op, 12.1 tasks/op, 45372 insns/op) Test: unit(dev) DTest: repair_additional_test.py:RepairAdditionalTest.repair_abort_test (release) materialized_views_test.py:TestMaterializedViews.remove_node_during_mv_insert_3_nodes_test (release) materialized_views_test.py:InterruptBuildProcess.interrupt_build_process_with_resharding_half_to_max_test (release) migration_test.py:TTLWithMigrate.big_table_with_ttls_test (release) " * tag 'reader_permit-timeout-v6' of github.com:bhalevy/scylla: flat_mutation_reader: get rid of timeout parameter reader_concurrency_semaphore: use permit timeout for admission reader_concurrency_semaphore: adjust reactivated reader timeout multishard_mutation_query: create_reader: validate saved reader permit repair: row_level: read_mutation_fragment: set reader timeout flat_mutation_reader: maybe_timed_out: use permit timeout test: sstable_datafile_test: add sstable_reader_with_timeout reader_permit: add timeout member	2021-08-25 17:51:10 +03:00
Pavel Emelyanov	b012040a76	mutation: Keep range tombstone in tree when consuming Current code std::move()-s the range tombstone into consumer thus moving the tombstone's linkage to the containing list as well. As a result the original range tombstone itself leaks as it leaves the tree and cannot be reached on .clear(). Another danger is that the iterator pointing to the tombstone becomes invalid while it's then ++-ed to advance to the next entry. The immediate fix is to keep the tombstone linked to the list while moving. fixes: #9207 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210825100834.3216-1-xemul@scylladb.com>	2021-08-25 13:25:18 +03:00
Benny Halevy	4476800493	flat_mutation_reader: get rid of timeout parameter Now that the timeout is taken from the reader_permit. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2021-08-24 16:30:51 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Botond Dénes	46b795b5fd	mutation: consume(): add reverse mode `mutation::consume()` is used by range scans to convert the intermediate `reconcilable_result` to the final `query::result` format. When the range scan is in reverse, `mutation::consume()` has to feed the clustering fragments to the consumer in reverse order, but currently `mutation::consume()` always uses the natural order, breaking reverse range scans. This patch fixes this by adding a `consume_in_reverse` parameter to `mutation::consume()`, and consequently support for consuming clustering fragments in reverse order. Fixes: #8000 Tests: unit(release, debug), dtest(thrift_tests.py:TestMutations.test_get_range_slice) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210203081659.622424-1-bdenes@scylladb.com>	2021-02-03 11:00:47 +02:00
Botond Dénes	9c96d74b72	mutation: remove now unused query() and query_compacted()	2021-01-22 15:36:37 +02:00
Botond Dénes	d0c5f550a9	mutation: add consume() This consume method accepts a `FlattenedConsumer`, the same one that the name-sake `flat_mutation_reader::consume()` does. Indeed the main purpose of this method is to allow using the standard query result building stack with a mutation, the same way said stack is used with mutation readers currently. This will allow us to replace the parallel query result building code that currently exists in the `mutation::query()` and friends, with the standard one.	2021-01-22 15:27:48 +02:00
Wojciech Mitros	45215746fe	increase the maximum size of query results to 2^64 Currently, we cannot select more than 2^32 rows from a table because we are limited by types of variables containing the numbers of rows. This patch changes these types and sets new limits. The new limits take effect while selecting all rows from a table - custom limits of rows in a result stay the same (2^32-1). In classes which are being serialized and used in messaging, in order to be able to process queries originating from older nodes, the top 32 bits of new integers are optional and stay at the end of the class - if they're absent we assume they equal 0. The backward compatibility was tested by querying an older node for a paged selection, using the received paging_state with the same select statement on an upgraded node, and comparing the returned rows with the result generated for the same query by the older node, additionally checking if the paging_state returned by the upgraded node contained new fields with correct values. Also verified if the older node simply ignores the top 32 bits of the remaining rows number when handling a query with a paging_state originating from an upgraded node by generating and sending such a query to an older node and checking the paging_state in the reply(using python driver). Fixes #5101.	2020-08-03 17:32:49 +02:00
Botond Dénes	6660a5df51	result_memory_accounter: remove default constructor If somebody wants to bypass proper memory accounting they should at the very least be forced to consider if that is indeed wise and think a second about the limit they want to apply.	2020-07-28 18:00:29 +03:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace schema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Konstantin Osipov	e8c13efb41	lwt: move mutation hashers to mutation.hh Prepare mutation hashers for reuse in CAS implementation. Message-Id: <20190930202409.40561-2-kostja@scylladb.com>	2019-10-01 19:49:31 +02:00
Botond Dénes	eb357a385d	flat_mutation_reader: make timeout opt-out rather than opt-in Currently timeout is opt-in, that is, all methods that even have it default it to `db::no_timeout`. This means that ensuring timeout is used where it should be is completely up to the author and the reviewers of the code. As humans are notoriously prone to mistakes this has resulted in a very inconsistent usage of timeout, many clients of `flat_mutation_reader` passing the timeout only to some members and only on certain call sites. This is small wonder considering that some core operations like `operator()()` only recently received a timeout parameter and others like `peek()` didn't even have one until this patch. Both of these methods call `fill_buffer()` which potentially talks to the lower layers and is supposed to propagate the timeout. All this makes the `flat_mutation_reader`'s timeout effectively useless. To make order in this chaos make the timeout parameter a mandatory one on all `flat_mutation_reader` methods that need it. This ensures that humans now get a reminder from the compiler when they forget to pass the timeout. Clients can still opt-out from passing a timeout by passing `db::no_timeout` (the previous default value) but this will be now explicit and developers should think before typing it. There were surprisingly few core call sites to fix up. Where a timeout was available nearby I propagated it to be able to pass it to the reader, where I couldn't I passed `db::no_timeout`. Authors of the latter kind of code (view, streaming and repair are some of the notable examples) should maybe consider propagating down a timeout if needed. In the test code (the vast majority of the changes) I just used `db::no_timeout` everywhere. Tests: unit(release, debug) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <1edc10802d5eb23de8af28c9f48b8d3be0f1a468.1536744563.git.bdenes@scylladb.com>	2018-09-20 11:31:24 +02:00
Duarte Nunes	12507fb9ce	keys: Replace feed_hash() member function with appending_hash Replace the feed_hash() member function of partition_key and clustering_key_prefix with the specialization of appending_hash, so that we can use the general feed_hash() function. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 00:22:50 +00:00
Duarte Nunes	6b4b429883	query-result: Introduce class result_options Introduce class result_options to carry result options through the request pipeline, which at this point mean the result type and the digest algorithm. This class allows us to encapsulate the concrete digest algorithm to use. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-02-01 00:22:50 +00:00
Piotr Jastrzebski	96c97ad1db	Rename streamed_mutation* files to mutation_fragment* Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
Piotr Jastrzebski	d9cbb9fedc	Delete unused mutation_from_streamed_mutation(streamed_mutation_opt) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
Piotr Jastrzebski	759271f866	Delete unused mutation_from_streamed_mutation(streamed_mutation&) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
José Guilherme Vanz	380bc0aa0d	Swap arguments order of mutation constructor Swap arguments in the mutation constructor keeping the same standard from the constructor variants. Refs #3084 Signed-off-by: José Guilherme Vanz <guilherme.sft@gmail.com> Message-Id: <20180120000154.3823-1-guilherme.sft@gmail.com>	2018-01-21 12:58:42 +02:00
Duarte Nunes	1374f898b9	Merge seastar upstream Class optimized_optional was moved into seastar, and its usage simplified so move_and_disengage() is replaced in favour of std::exchange(_, { }). * seastar adaca37...b0f5591 (9): > Merge "core: Introduce cancellation mechanism" from Duarte > Fix Seastar build that no longer builds with --enable-dpdk after the recent commit fd87ea2 > noncopyable_function: support function objects whose move constructors throw > Adding new hardware options to new config format, using new config format for dpdk device > Fix check for Boost version during pre-build configuration. > variant_utils: add variant_visitor constructor for C++17 mode > Merge "Allows json object to be stream to an" from Amnon > Merge 'Default to C++17' from Avi > Add const version of subscript operator to circular_buffer Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20171228112126.18142-1-duarte@scylladb.com>	2017-12-28 13:24:18 +02:00
Piotr Jastrzebski	570703a169	read_mutation_from_flat_mutation_reader: don't take schema_ptr Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-12-21 11:47:07 +01:00
Piotr Jastrzebski	4b58a05053	Introduce read_mutation_from_flat_mutation_reader This helper method reads a single mutation from a flat_mutation_reader. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-11-08 14:26:10 +01:00
Tomasz Grabiec	749f5770df	mutation: Introduce apply(mutation_fragment)	2017-11-02 12:16:17 +01:00
Tomasz Grabiec	1d5d5e26a2	mutation: Introduce sliced()	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	77f944880c	cache: Remove support for wide partitions This will be handled by row cache now. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Duarte Nunes	9e88b60ef5	mutation: Set cell using clustering_key_prefix Change the clustering key argument in mutation::set_cell from exploded_clustering_prefix to clustering_key_prefix, which allows for some overall code simplification and fewer copies. This mostly affects the cql3 layer. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-05-04 15:59:50 +02:00
Duarte Nunes	b27da688f9	mutation: Remove dead get_cell() function Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170316234843.23130-1-duarte@scylladb.com>	2017-03-17 11:18:23 +02:00
Tomasz Grabiec	cbf4601e31	streamed_mutation: Add non-owning variant of mutation_from_streamed_mutation()	2017-02-23 18:50:53 +01:00
Tomasz Grabiec	ddfee57c97	Replace iostream include with iosfwd in headers Message-Id: <1484656119-8386-4-git-send-email-tgrabiec@scylladb.com>	2017-01-17 14:52:44 +02:00
Asias He	e5485f3ea6	Get rid of query::partition_range Use dht::partition_range instead	2016-12-19 08:09:25 +08:00
Tomasz Grabiec	ecf85cbffb	mutation: Define + operation It's more convenient to write m1 + m2 in tests than to do more elaborate constructs with copy constructors and apply().	2016-10-18 11:16:08 +02:00
Piotr Jastrzebski	0d39bb1ad0	Implement mutation_from_streamed_mutation_with_limit If mutation is bigger than this limit it won't be read and mutation_from_streamed_mutation will return empty optional. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2016-07-21 09:35:23 +02:00
Paweł Dziepak	48e08fa997	mutation: add mutation_from_streamed_mutation() Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:49 +01:00
Paweł Dziepak	84713d2236	utils: extract optimized_optional<> from mutation_opt Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:49 +01:00
Avi Kivity	db03295c8a	Merge "Fix query digest mismatch" from Tomasz "Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165."	2016-04-08 12:13:29 +03:00
Pekka Enberg	38a54df863	Fix pre-ScyllaDB copyright statements People keep tripping over the old copyrights and copy-pasting them to new files. Search and replace "Cloudius Systems" with "ScyllaDB". Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>	2016-04-08 08:12:47 +03:00
Tomasz Grabiec	f15c380a4f	database: Compact mutations when executing data queries Currently data query digest includes cells and tombstones which may have expired or be covered by higher-level tombstones. This causes digest mismatch between replicas if some elements are compacted on one of the nodes and not on others. This mismatch triggers read-repair which doesn't resolve because mutations received by mutation queries are not differing, they are compacted already. The fix adds compacting step before writing and digesting query results by reusing the algorithm used by mutation query. This is not the most optimal way to fix this. The compaction step could be folded with the query writing, there is redundancy in both steps. However such change carries more risk, and thus was postponed. perf_simple_query test (cassandra-stress-like partitions) shows regression from 83k to 77k (7%) ops/s. Fixes #1165.	2016-04-07 19:56:58 +02:00
Tomasz Grabiec	87d7279267	mutation: Add copy assignment operator We already have a copy constructor, so can have copy assignment as well.	2016-03-21 18:41:27 +01:00
Paweł Dziepak	82d2a2dccb	specify whether query::result, result_digest or both are needed Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-03-11 18:27:13 +00:00
Tomasz Grabiec	036974e19b	Make mutation interfaces support multiple versions Schema is tracked in memtable and cache per-entry. Entries are upgraded lazily on access. Incoming mutations are upgraded to table's current schema on given shard. Mutating nodes need to keep schema_ptr alive in case schema version is requested by target node.	2016-01-11 10:34:51 +01:00

80 Commits