scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Glauber Costa	5140aaea00	add a timeout to fast forward to In the last patch, we enabled per-request timeouts, we enable timeouts in fill_buffer. There are many places, though, in which we fast_forward_to before we fill_buffer, so in order to make that effective we need to propagate the timeouts to fast_forward_to as well. In the same way as fill_buffer, we make the argument optional wherever possible in the high level callers, making them mandatory in the implementations. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-01-12 07:43:19 -05:00
Glauber Costa	d965af42b0	add a timeout to fill_buffer As part of the work to enable per-request timeouts, we enable timeouts in fill_buffer. The argument is made optional at the main classes, but mandatory in all the ::impl versions. This way we'll make sure we didn't forget anything. At this point we're still mostly passing that information around and don't have any entity that will act on those timeouts. In the next patch we will wire that up. Signed-off-by: Glauber Costa <glauber@scylladb.com>	2018-01-11 12:07:41 -05:00
Avi Kivity	8795238869	Merge "Fix handling of range tombstones starting at same position" from Tomasz "When we get two range tombstones with the same lower bound from different data sources (e.g. two sstable), which need to be combined into a single stream, they need to be de-overlapped, because each mutation fragment in the stream must have a different position. If we have range tombstones [1, 10) and [1, 20), the result of that de-overlapping will be [1, 10) and [10, 20]. The problem is that if the stream corresponds to a clustering slice with upper bound greater than 1, but lower than 10, the second range tombstone would appear as being out of the query range. This is currently violating assumptions made by some consumers, like cache populator. One effect of this may be that a reader will miss rows which are in the range (1, 10) (after the start of the first range tombstone, and before the start of the second range tombstone), if the second range tombstone happens to be the last fragment which was read for a discontinuous range in cache and we stopped reading at that point because of a full buffer and cache was evicted before we resumed reading, so we went to reading from the sstable reader again. There could be more cases in which this violation may resurface. There is also a related bug in mutation_fragment_merger. If the reader is in forwarding mode, and the current range is [1, 5], the reader would still emit range_tombstone([10, 20]). If that reader is later fast forwarded to another range, say [6, 8], it may produce fragments with smaller positions which were emitted before, violating monotonicity of fragment positions in the stream. A similar bug was also present in partition_snapshot_flat_reader. 
Possible solutions: 1) relax the assumption (in cache) that streams contain only relevant range tombstones, and only require that they contain at least all relevant tombstones 2) allow subsequent range tombstones in a stream to share the same starting position (position is weakly monotonic), then we don't need to de-overlap the tombstones in readers. 3) teach combining readers about query restrictions so that they can drop fragments which fall outside the range 4) force leaf readers to trim all range tombstones to query restrictions This patch implements solution no 2. It simplifies combining readers, which don't need to accumulate and trim range tombstones. I don't like solution 3, because it makes combining readers more complicated, slower, and harder to properly construct (currently combining readers don't need to know restrictions of the leaf streams). Solution 4 is confined to implementations of leaf readers, but also has disadvantage of making those more complicated and slower. There is only one consumer which needs the tombstones with monotonic positions, and that is the sstable writer. Fixes #3093." 
* tag 'tgrabiec/fix-out-of-range-tombstones-v1' of github.com:scylladb/seastar-dev: tests: row_cache: Introduce test for concurrent read, population and eviction tests: sstables: Add test for writing combined stream with range tombstones at same position tests: memtable: Test that combined mutation source is a mutation source tests: memtable: Test that memtable with many versions is a mutation source tests: mutation_source: Add test for stream invariants with overlapping tombstones tests: mutation_reader: Test fast forwarding of combined reader with overlapping range tombstones tests: mutation_reader: Test combined reader slicing on random mutations tests: mutation_source_test: Extract random_mutation_generator::make_partition_keys() mutation_fragment: Introduce range() clustering_interval_set: Introduce overlaps() clustering_interval_set: Extract private make_interval() mutation_reader: Allow range tombstones with same position in the fragment stream sstables: Handle consecutive range_tombstone fragments with same position tests: streamed_mutation_assertions: Merge range_tombstones with the same position in produces_range_tombstone() streamed_mutation: Introduce peek() mutation_fragment: Extract mergeable_with() mutation_reader: Move definition of combining mutation reader to source file mutation_reader: Use make_combined_reader() to create combined reader	2018-01-02 18:32:09 +02:00
Duarte Nunes	1374f898b9	Merge seastar upstream Class optimized_optional was moved into seastar, and its usage simplified so move_and_disengage() is replaced in favour of std::exchange(_, { }). * seastar adaca37...b0f5591 (9): > Merge "core: Introduce cancellation mechanism" from Duarte > Fix Seastar build that no longer builds with --enable-dpdk after the recent commit fd87ea2 > noncopyable_function: support function objects whose move constructors throw > Adding new hardware options to new config format, using new config format for dpdk device > Fix check for Boost version during pre-build configuration. > variant_utils: add variant_visitor constructor for C++17 mode > Merge "Allows json object to be stream to an" from Amnon > Merge 'Default to C++17' from Avi > Add const version of subscript operator to circular_buffer Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20171228112126.18142-1-duarte@scylladb.com>	2017-12-28 13:24:18 +02:00
Tomasz Grabiec	2be3cbbb81	mutation_fragment: Introduce range()	2017-12-22 11:06:33 +01:00
Tomasz Grabiec	41ede08a1d	mutation_reader: Allow range tombstones with same position in the fragment stream When we get two range tombstones with the same lower bound from different data sources (e.g. two sstable), which need to be combined into a single stream, they need to be de-overlapped, because each mutation fragment in the stream must have a different position. If we have range tombstones [1, 10) and [1, 20), the result of that de-overlapping will be [1, 10) and [10, 20]. The problem is that if the stream corresponds to a clustering slice with upper bound greater than 1, but lower than 10, the second range tombstone would appear as being out of the query range. This is currently violating assumptions made by some consumers, like cache populator. One effect of this may be that a reader will miss rows which are in the range (1, 10) (after the start of the first range tombstone, and before the start of the second range tombstone), if the second range tombstone happens to be the last fragment which was read for a discontinuous range in cache and we stopped reading at that point because of a full buffer and cache was evicted before we resumed reading, so we went to reading from the sstable reader again. There could be more cases in which this violation may resurface. There is also a related bug in mutation_fragment_merger. If the reader is in forwarding mode, and the current range is [1, 5], the reader would still emit range_tombstone([10, 20]). If that reader is later fast forwarded to another range, say [6, 8], it may produce fragments with smaller positions which were emitted before, violating monotonicity of fragment positions in the stream. A similar bug was also present in partition_snapshot_flat_reader. 
Possible solutions: 1) relax the assumption (in cache) that streams contain only relevant range tombstones, and only require that they contain at least all relevant tombstones 2) allow subsequent range tombstones in a stream to share the same starting position (position is weakly monotonic), then we don't need to de-overlap the tombstones in readers. 3) teach combining readers about query restrictions so that they can drop fragments which fall outside the range 4) force leaf readers to trim all range tombstones to query restrictions This patch implements solution no 2. It simplifies combining readers, which don't need to accumulate and trim range tombstones. I don't like solution 3, because it makes combining readers more complicated, slower, and harder to properly construct (currently combining readers don't need to know restrictions of the leaf streams). Solution 4 is confined to implementations of leaf readers, but also has disadvantage of making those more complicated and slower. Fixes #3093.	2017-12-22 11:06:20 +01:00
Tomasz Grabiec	815cd254e2	streamed_mutation: Introduce peek() Will be used in assertions to merge consecutive range tombstones.	2017-12-21 22:45:35 +01:00
Tomasz Grabiec	c5f82aa5bd	mutation_fragment: Extract mergeable_with()	2017-12-21 21:24:11 +01:00
Botond Dénes	e47791810b	Add non-const overload of partition_start::partition_tombstone() And make the const version return a const reference so that code mutating the returned value won't compile if the partition_start object is const.	2017-12-04 07:57:43 +02:00
Paweł Dziepak	9b39d3b023	streamed_mutation: drop streamed_mutation_returning()	2017-11-23 18:14:31 +00:00
Paweł Dziepak	8baf682216	streamed_mutation: drop reverse_streamed_mutation()	2017-11-21 11:37:04 +00:00
Avi Kivity	09e730f9f2	Merge "Fix bugs in cache related to handling of bad_alloc" from Tomasz "Fixes #2944." * tag 'tgrabiec/cache-exception-safety-fixes-v2' of github.com:scylladb/seastar-dev: tests: row_cache: Add test for exception safety of multi-partition scans tests: row_cache: Add test for exception safety of single-partition reads tests: mutation_source_tests: Always print the seed tests: Disable alloc failure injection in test assertions tests: Avoid needless copies row_cache: Fix exception safety of cache_entry::read() row_cache: scanning_and_populating_reader: Fix exception unsafety causing read to skip data row_cache: partition_range_cursor: Extract valid() and advance_to() from refresh() cache_streamed_mutation: Add trace-level logging to cache_streamed_mutation mvcc: Lift noexcept off partition_snapshot_row_weakref assignment/constructors cache_streamed_mutation: Make advancing to the next range exception-safe cache_streamed_mutation: Make add_clustering_row_to_buffer() exception-safe cache_streamed_mutation: Make drain_tombstones() exception-safe cache_streamed_mutation: Return void from start_reading_from_underlying() cache_streamed_mutation: Document invariants related to exception-safety streamed_mutation: Add reserve_one() lsa: Guarantee invalidated references on allocating section retry mvcc: partition_snapshot_row_cursor: Mark allocation points	2017-11-14 11:42:13 +02:00
Tomasz Grabiec	53f4452b47	streamed_mutation: Add reserve_one()	2017-11-13 20:55:13 +01:00
Paweł Dziepak	7866e5b4a9	streamed_mutation: drop mutation_hasher	2017-11-13 16:49:52 +00:00
Paweł Dziepak	af4fa6152b	partition_start: make partition_tombstone() const	2017-11-13 16:49:52 +00:00
Piotr Jastrzebski	1c9e4ba04c	Add flat_mutation_reader_assertions This will be useful in tests. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-11-08 12:58:31 +01:00
Piotr Jastrzebski	f325fef362	Extract FlattenedConsumer concept using GCC6_CONCEPT This concept will be used in flat_mutation_reader::consume Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-10-10 16:15:59 +02:00
Piotr Jastrzebski	46727f12e0	Introduce partition_end mutation_fragment This type of mutation_fragment will be used in new mutation_reader to signal the end of the current partition. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-10-10 16:15:59 +02:00
Piotr Jastrzebski	2516b42752	Introduce partition_start mutation_fragment This type of mutation_fragment will be used in new mutation_reader to signal the beginning of the next partition. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-10-10 16:15:59 +02:00
Piotr Jastrzebski	1f4fb6dd4a	Introduce FragmentConsumer This concept helps define StreamedMutationConsumer. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-10-10 16:05:44 +02:00
Piotr Jastrzebski	e1f7d1f25d	streamed_mutation: Extract concepts using GCC6_CONCEPT macro It makes it easier to actually use those concepts. Lambdas passed to mutation_fragment::visit have to declare return type otherwise compiler fails with: internal compiler error: Segmentation fault Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-09-20 11:34:03 +02:00
Tomasz Grabiec	cb16b038ef	streamed_mutation: Allow setting buffer capacity Needed in tests to limit amount of prefetching done by readers, so that it's easier to test interleaving of various events.	2017-09-13 17:47:03 +02:00
Piotr Jastrzebski	477068d2c3	Make streamed_mutation more exception safe Make sure that push_mutation_fragment leaves _buffer_size with a correct value if exception is thrown from emplace_back. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Message-Id: <83398412aa78332d88d91336b79140aecc988602.1503474403.git.piotr@scylladb.com>	2017-08-23 09:37:04 +01:00
Paweł Dziepak	f02bef7917	streamed_mutation: do not call fill_buffer() ahead of time consume_mutation_fragments_until() allows consuming mutation fragments until a specified condition happens. This patch reorganises its implementation so that we avoid situations when fill_buffer() is called with stop condition being true. Message-Id: <20170727122218.7703-1-pdziepak@scylladb.com>	2017-07-27 17:47:57 +02:00
Paweł Dziepak	c9ccd813ab	mutation_fragment: make destructor always_inline mutation_fragment destructor was already made inline-friendly by moving most of the logic to a separate function. However, the compiler still is quite reluctant to inline it in certain cases, so let's give it a stronger hint.	2017-07-26 14:38:27 +01:00
Paweł Dziepak	2066354de3	streamed_mutation: introduce consume_mutation_fragments_until() consume_mutation_fragments_until() is a consumer based interface that avoids indirect calls and continuation overhead present in the naive streamed_mutation::operator() approach.	2017-07-26 14:37:20 +01:00
Tomasz Grabiec	1f23130b07	mutation_fragment: Implement equality check	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	04aebaa2cb	streamed_mutation: Introduce transform()	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	a17fa5726f	Introduce streamed_mutation_from_forwarding_streamed_mutation This will allow conversion from streamed_mutation that supports fast forwarding to streamed_mutation that does not. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	11191b7aef	streamed_mutation: Introduce make_empty_streamed_mutation()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	1594ace4d3	range_tombstone_stream: Make printable	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	5a29c70f3e	mutation_fragment: make mutation_fragment copyable This will be needed by implementation of cache_streamed_mutation Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	94c957c2ff	Extract position_in_partition to separate header This will allow its usage in mutation_partition.hh Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Piotr Jastrzebski	0fd4dedc6a	position_in_partition: Add after_all_clustered_rows() to view This is a position that's always in the end after any other position. It will be used for dummy rows_entry. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2017-06-24 18:06:11 +02:00
Nadav Har'El	984da1d8d7	Make forwarding_tag local to streamed_mutation As Avi noticed, the "forwarding_tag" which was meant to be local in streamed_mutation, became global. If another class copied the same trick, it would share the same type instead of being distinct types as intended. The problem is that in: using forwarding = bool_class<class forwarding_tag>; Apparently, the "class forwarding_tag" forward-declares a global type - it does not create a local-scope type as intended, which the following apparently does (even though no actual definition is given for that class): class forwarding_tag; using forwarding = bool_class<forwarding_tag>; Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20170619153933.13116-1-nyh@scylladb.com>	2017-06-19 20:04:47 +03:00
Avi Kivity	ebaeefa02b	Merge seastar upstream (seastar namespace) - introduced "seastarx.hh" header, which does a "using namespace seastar"; - 'net' namespace conflicts with seastar::net, renamed to 'netw'. - 'transport' namespace conflicts with seastar::transport, renamed to cql_transport. - "logger" global variables now conflict with logger global type, renamed to xlogger. - other minor changes	2017-05-21 12:26:15 +03:00
Tomasz Grabiec	e56711a54d	sstables: mutation_reader: Avoid reading index when restrictions cover whole partition The check for is_static_row() used to be enough, but it no longer is after optimization made in commit `3e06065`, which avoids reading the static row. Message-Id: <1494241164-25810-1-git-send-email-tgrabiec@scylladb.com>	2017-05-09 11:03:18 +01:00
Duarte Nunes	4e693383f7	mutation_partion: Use row_tombstone This patch replaces the current row tombstone representation by a row_tombstone. The intent of the patch is thus to reify the idea of shadowable tombstones, that up until now we considered all materialized view row tombstones to be. We need to distinguish shadowable from non-shadowable row tombstones to support scenarios such as, when inserting to a table with a materialized view: 1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1 2. delete from base using timestamp 2 where p = 3 3. insert into base (p, v1) values (3, 1) using timestamp 3 These should yield a view row where v2 is definitely null, but with the current implementation, v2 will pop back with its value v2=3@TS=1, even though it's dead in the base row. This is because the row tombstone inserted at 2) is a shadowable one. This patch only addresses the memory representation of such row_tombstones. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-04-25 11:46:33 +02:00
Tomasz Grabiec	c85fe3183c	position_range: Allow stealing of bounds	2017-04-20 10:54:36 +02:00
Tomasz Grabiec	503c68de44	position_in_partition: Add more factory methods	2017-04-20 10:54:36 +02:00
Tomasz Grabiec	5b813898bc	position_range: Introduce all_clustered_rows() factory method	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	622713be60	position_in_partition: Introduce for_key()/after_key() factory methods	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	e27fa712f5	position_in_partition: Add factory methods for positions around all rows	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	b90275f8e3	position_in_partition: Introduce for_range_start()/for_range_end()	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	5c29f4dd04	position_in_partition: Fix friendship declaration	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	2ff6c1705b	position_in_partition: Make comparable with composites	2017-03-28 18:10:39 +02:00
Tomasz Grabiec	dc7b93a326	streamed_mutation: Add streamed_mutation_returning() helper	2017-03-10 14:42:22 +01:00
Tomasz Grabiec	a32cf6c4cc	position_in_partition: Add is_static_row() in the view	2017-03-10 14:42:21 +01:00
Tomasz Grabiec	e4db643730	range_tombstone_stream: Add reset()	2017-03-10 14:42:21 +01:00
Tomasz Grabiec	48ad2e2d64	range_tombstone_stream: Add get_next(position_in_partition_view)	2017-03-10 14:42:21 +01:00
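
The scoping issue described in commit 984da1d8d7 ("Make forwarding_tag local to streamed_mutation") can be reproduced in isolation. The sketch below is not taken from the ScyllaDB tree: bool_class_sketch is a hypothetical stand-in for seastar's bool_class, and reader_a through reader_d are made-up classes used only for illustration. It assumes a C++17 compiler (for std::is_same_v) and shows that naming the tag with an elaborated type specifier inside a template argument makes two unrelated classes share one tag type, while forward-declaring the tag as a class member first keeps the types distinct.

```cpp
#include <type_traits>

// Hypothetical stand-in for a tagged boolean wrapper; not the real bool_class.
template <typename Tag>
struct bool_class_sketch {
    bool value;
};

// Problematic form: "class forwarding_tag" used inside the template argument
// declares the name in the enclosing namespace, not inside reader_a.
struct reader_a {
    using forwarding = bool_class_sketch<class forwarding_tag>;
};

// A second class copying the same trick refers to the same namespace-level tag.
struct reader_b {
    using forwarding = bool_class_sketch<class forwarding_tag>;
};

// Fixed form: declaring the tag as a member first creates a nested class,
// so each enclosing class gets its own distinct tag type.
struct reader_c {
    class forwarding_tag;
    using forwarding = bool_class_sketch<forwarding_tag>;
};

struct reader_d {
    class forwarding_tag;
    using forwarding = bool_class_sketch<forwarding_tag>;
};

// The two "problematic" aliases name the same type...
static_assert(std::is_same_v<reader_a::forwarding, reader_b::forwarding>,
              "tags declared via elaborated type specifier are shared");

// ...while the two "fixed" aliases are distinct types.
static_assert(!std::is_same_v<reader_c::forwarding, reader_d::forwarding>,
              "nested tag declarations are distinct per class");
```

The checks are compile-time only, so building this as a translation unit is enough to confirm the behaviour the commit message describes.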


91 Commits