scylladb

Author	SHA1	Message	Date
Paweł Dziepak	27014a23d7	treewide: require type info for copying atomic_cell_or_collection	2018-05-31 15:51:11 +01:00
Duarte Nunes	67dac67c46	mutation_partition: Regular base column in view determines row liveness When views contain a primary key column that is not part of the base table primary key, that column determines whether the row is live or not. We need to ensure that when that cell is dead, and thus the derived row marker, either by normal deletion or by TTL, so is the rest of the row. This patch introduces the idea of a shadowing row marker. We map the status of the regular base column in the view's PK to the view row's marker. If this marker is dead, so is that cell in the base table, and so should the view row become. To enforce that, a view row's dead marker shadows the whole row if that view includes a base regular column in its PK. Fixes #3360 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-04-23 09:32:02 +01:00
Botond Dénes	7a5143a670	Add querier The querier encapsulates all objects needed to serve queries, except result builders. It is designed to be suspendable, savable and resumable. It contains all logic needed to suspend, resume and determine whether the querier can be resumed or not. It is the foundation upon which the "reader-reuse" mechanism is built.	2018-03-13 10:34:34 +02:00
Botond Dénes	84d872babf	Add are_limits_reached() compact_mutation_state are_limits_reached() allows querying whether the compactor reached the page's limits. This is needed to determine whether there will be more pages and thus whether the compact_mutation_state has to be kept around.	2018-03-13 10:34:34 +02:00
Botond Dénes	2c1081b0e9	Add start_new_page() to compact_mutation_state start_new_page() resets the limits to the current page's ones and sets the _empty_partition flag so that the partition header (if the last page finished inside a partition) will be reemitted.	2018-03-13 10:34:34 +02:00
Botond Dénes	3fca8aaefb	Save last key of the page and method to query it Make a copy of the current decorated-key in consume_end_of_stream() so that it persists while the compaction state is suspended. Also add current_partition() to allow client code to query the partition the compaction is positioned in. This is needed to determine whether the start position of the next page matches that of the compact_mutation_state.	2018-03-13 10:34:34 +02:00
Botond Dénes	2fcc99fe43	Make compact_mutation reusable Currently compact_mutation is used as a use-once-then-throw-away object. After it satisfies its consumer it's destroyed together with the consumer. This conflicts with the effort to save and reuse readers and associated infrastructure between pages of a query. To resolve this conflict compact_mutation is split into two classes: (1) compact_mutation_state (2) compact_mutation compact_mutation_state encapsulates all the compaction logic and state, while compact_mutation continues to provide the same API using compact_mutation_state behind the scenes. compact_mutation_state doesn't store the consumer, instead its consume_* methods are templated on the consumer and take it as an argument. This allows compact_mutation_state to be independent of the consumer's type. Additionally compact_mutation can now be constructed from a shared pointer to compact_mutation_state. This allows client code to pre-construct a compaction state and retain it after the compact_mutation object is destroyed. These changes allow the state of a compaction to be saved and restored later while code that is only interested in storing the saved state can stay independent of the consumer's type. This patch only contains the splitting of compact_mutation into compact_mutation and compact_mutation_state. The next patches will add the missing functionality that is needed to make compact_mutation_state truly reusable across pages.	2018-03-13 10:34:34 +02:00
Botond Dénes	7bd500049d	Add the CompactedFragmentsConsumer Undust the commented CompactMutationConsumer concept, make it usable and rename it to CompactedFragmentsConsumer (as we now have flat readers).	2018-03-13 10:34:34 +02:00
Piotr Jastrzebski	96c97ad1db	Rename streamed_mutation* files to mutation_fragment* Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>	2018-01-24 20:56:49 +01:00
Duarte Nunes	baeec0935f	Replace query::full_slice with schema::full_slice() query::full_slice doesn't select any regular or static columns, which is at odds with the expectations of its users. This patch replaces it with the schema::full_slice() version. Refs #2885 Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <1507732800-9448-2-git-send-email-duarte@scylladb.com>	2017-10-17 11:25:53 +02:00
Duarte Nunes	c7aa3ea069	mutation_partition: Remove obsolete short read detection When compacting a partition for querying we would read an extra row, to include any tombstones between that one and the previous row. This is no longer needed since we have a general mechanism to detect short reads in the storage_proxy. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20170811103031.22866-1-duarte@scylladb.com>	2017-08-15 12:01:55 +01:00
Duarte Nunes	4e693383f7	mutation_partion: Use row_tombstone This patch replaces the current row tombstone representation by a row_tombstone. The intent of the patch is thus to reify the idea of shadowable tombstones, that up until now we considered all materialized view row tombstones to be. We need to distinguish shadowable from non-shadowable row tombstones to support scenarios such as, when inserting to a table with a materialized view: 1. insert into base (p, v1, v2) values (3, 1, 3) using timestamp 1 2. delete from base using timestamp 2 where p = 3 3. insert into base (p, v1) values (3, 1) using timestamp 3 These should yield a view row where v2 is definitely null, but with the current implementation, v2 will pop back with its value v2=3@TS=1, even though it's dead in the base row. This is because the row tombstone inserted at 2) is a shadowable one. This patch only addresses the memory representation of such row_tombstones. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-04-25 11:46:33 +02:00
Tomasz Grabiec	4b6e77e97e	db: Fix overflow of gc_clock time point If query_time is time_point::min(), which is used by to_data_query_result(), the result of subtraction of gc_grace_seconds() from query_time will overflow. I don't think this bug would currently have user-perceivable effects. This affects which tombstones are dropped, but in case of to_data_query_result() uses, tombstones are not present in the final data query result, and mutation_partition::do_compact() takes tombstones into consideration while compacting before expiring them. Fixes the following UBSAN report: /usr/include/c++/5.3.1/chrono:399:55: runtime error: signed integer overflow: -2147483648 - 604800 cannot be represented in type 'int' Message-Id: <1488385429-14276-1-git-send-email-tgrabiec@scylladb.com>	2017-03-01 18:49:56 +02:00
Paweł Dziepak	34f9eb4cbd	mutation_compactor: honour stop_iteration from consumers Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-12-14 14:10:02 +00:00
Duarte Nunes	167e400ca8	compact_mutation: Don't count dead partitions With this patch we stop counting dead partitions (i.e., partitions containing only tombstones) towards the partition limit, which should apply only to partitions with live rows. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-08-02 21:17:06 +00:00
Paweł Dziepak	93cc4454a6	streamed_mutation: emit range_tombstones directly Originally, streamed_mutations guaranteed that emitted tombstones are disjoint. In order to achieve that two separate objects were produced for each range tombstone: range_tombstone_begin and range_tombstone_end. Unfortunately, this forced sstable writer to accumulate all clustering rows between range_tombstone_begin and range_tombstone_end. However, since there is no need to write disjoint tombstones to sstables (see #1153 "Write range tombstones to sstables like Cassandra does") it is also not necessary for streamed_mutations to produce disjoint range tombstones. This patch changes that by making streamed_mutation produce range_tombstone objects directly. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-07-13 09:51:18 +01:00
Tomasz Grabiec	8c4b5e4283	db: Avoiding checking bloom filters during compaction Checking bloom filters of sstables to compute max purgeable timestamp for compaction is expensive in terms of CPU time. We can avoid calculating it if we're not about to GC any tombstone. This patch changes compacting functions to accept a function instead of ready value for max_purgeable. I verified that bloom filter operations no longer appear on flame graphs during compaction-heavy workload (without tombstones). Refs #1322.	2016-07-10 09:54:20 +02:00
Tomasz Grabiec	fb44f895b2	mutation_reader: Name template parameters after concepts With so many consumer concepts out there, it is confusing to name parameters using the generic "Consumer" name; let's name them after (already defined) concepts: CompactedMutationsConsumer, FlattenedConsumer.	2016-07-09 22:31:27 +02:00
Paweł Dziepak	7a95847014	mutation_compactor: prepare for sstable compaction compact_mutation code is going to be shared among queries and sstable compaction. There are some differences though. Queries don't provide _max_purgeable and sstable compaction doesn't need any limits. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-30 11:39:01 +01:00
Paweł Dziepak	00bcc05d36	mutation_compactor: _max_purgeable depends on the decorated key _max_purgeable can be different for each partition, since it is computed using sstables in which that partition is present (or likely to be present). Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-30 11:39:01 +01:00
Paweł Dziepak	4133cc7a53	mutation_reader: make consume_flattened() produce decorated keys Since decorated keys are already computed it is better to pass more information than less. Consumers interested just in partition key can just drop token and the ones requiring full decorated key don't need to recompute it. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-30 11:39:00 +01:00
Paweł Dziepak	fe4b739828	mutation_compactor: rename compact_for_query to compact_mutation Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-30 11:37:54 +01:00
Paweł Dziepak	3e86f9ab73	mutation_partition: extract compact_for_query to a separate header The compacting logic inside compact_for_query is going to be shared with sstable compaction. Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-30 11:37:54 +01:00
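
Commit 2fcc99fe43 in the table above describes splitting the compactor into a consumer-independent state object and a thin wrapper that binds the state to a consumer. The following is a minimal sketch of that shape, not the ScyllaDB code: every name ending in `_sketch`, the `int` rows, the single row limit and `collecting_consumer` are hypothetical simplifications chosen for illustration. Only the method names `are_limits_reached()` and `start_new_page()` are taken from the commit messages.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for compact_mutation_state.
// It owns the paging limit but knows nothing about the consumer's type.
class compaction_state_sketch {
    uint32_t _rows_left;
public:
    explicit compaction_state_sketch(uint32_t row_limit) : _rows_left(row_limit) {}

    // Templated on the consumer and taking it as an argument, so the
    // state itself can be stored without naming the consumer type.
    // Returns true when the page limit has been reached.
    template <typename Consumer>
    bool consume(int row, Consumer& consumer) {
        if (_rows_left == 0) {
            return true;
        }
        consumer.consume(row);
        --_rows_left;
        return _rows_left == 0;
    }

    bool are_limits_reached() const { return _rows_left == 0; }

    // Reset the limit for the next page (assumed semantics).
    void start_new_page(uint32_t row_limit) { _rows_left = row_limit; }
};

// Hypothetical stand-in for compact_mutation: same consuming API as
// before the split, implemented by forwarding to the shared state.
template <typename Consumer>
class compactor_sketch {
    std::shared_ptr<compaction_state_sketch> _state;
    Consumer _consumer;
public:
    compactor_sketch(std::shared_ptr<compaction_state_sketch> state, Consumer consumer)
        : _state(std::move(state)), _consumer(std::move(consumer)) {}

    bool consume(int row) { return _state->consume(row, _consumer); }
    Consumer& consumer() { return _consumer; }
};

// Illustrative consumer that simply records the rows it receives.
struct collecting_consumer {
    std::vector<int> rows;
    void consume(int row) { rows.push_back(row); }
};
```

Because the caller holds the `std::shared_ptr`, the state outlives the wrapper and its consumer, which is what allows it to be saved between pages and handed to a new wrapper later.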

23 Commits
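
The UBSAN report quoted in commit 4b6e77e97e (`-2147483648 - 604800 cannot be represented in type 'int'`) comes from subtracting a grace period from the minimum representable time point. One generic way to guard such a subtraction is to saturate at the minimum, sketched below. This is an illustration of the arithmetic only, and is not claimed to be the change that commit made: the type aliases and `saturating_sub` are hypothetical, and the 32-bit second count is an assumption inferred from the `int` in the report.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-ins for the gc_clock types: 32-bit signed seconds.
using gc_duration_sketch = std::chrono::duration<int32_t>;
using gc_time_point_sketch =
    std::chrono::time_point<std::chrono::system_clock, gc_duration_sketch>;

// Subtract a non-negative duration from a time point, clamping to
// time_point::min() instead of overflowing the signed representation.
inline gc_time_point_sketch saturating_sub(gc_time_point_sketch t, gc_duration_sketch d) {
    assert(d.count() >= 0);
    // min() + d cannot overflow for d >= 0, so this comparison is safe.
    if (t.time_since_epoch() < gc_duration_sketch::min() + d) {
        return gc_time_point_sketch::min();
    }
    return t - d;
}
```

With a grace period of 604800 seconds (seven days), `saturating_sub(gc_time_point_sketch::min(), gc_duration_sketch(604800))` returns `min()` rather than invoking undefined behaviour, while ordinary time points are shifted back as usual.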