152 Commits

Author SHA1 Message Date
Piotr Jastrzebski
041b0a65ac Implement intrusive set using rbtree_algorithms
This new implementation takes less memory because it
does not store the comparator.

It also uses tree nodes optimized for size. This means
that instead of storing an enum field |color| they embed
this information inside the pointer to the parent.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:46:58 +01:00
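The size optimization described in 041b0a65ac (keeping the color inside the parent pointer) can be illustrated roughly as follows. This is only a sketch of the general pointer-tagging idea, not the node layout actually used by the commit or by Boost.Intrusive; `compact_node`, `plain_node`, `node_color` and `tagging_round_trips` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

enum class node_color : unsigned char { red = 0, black = 1 };

// Hypothetical size-optimized node: there is no separate color field.
struct compact_node {
    // Low bit holds the color, the remaining bits hold the parent pointer.
    std::uintptr_t parent_and_color = 0;
    compact_node* left = nullptr;
    compact_node* right = nullptr;

    compact_node* parent() const {
        return reinterpret_cast<compact_node*>(parent_and_color & ~std::uintptr_t(1));
    }
    node_color color() const {
        return static_cast<node_color>(parent_and_color & 1);
    }
    void set_parent(compact_node* p) {
        // Keep the existing color bit while replacing the pointer bits.
        parent_and_color = reinterpret_cast<std::uintptr_t>(p) | (parent_and_color & 1);
    }
    void set_color(node_color c) {
        parent_and_color = (parent_and_color & ~std::uintptr_t(1)) | static_cast<std::uintptr_t>(c);
    }
};

// Conventional node with an explicit color field, for size comparison.
struct plain_node {
    plain_node* parent = nullptr;
    plain_node* left = nullptr;
    plain_node* right = nullptr;
    node_color color = node_color::red;
};

// The trick relies on node addresses being at least 2-byte aligned.
static_assert(alignof(compact_node) >= 2, "low pointer bit must be unused");

// Usage example: setting the parent must not disturb the color and vice versa.
bool tagging_round_trips() {
    compact_node root, child;
    child.set_color(node_color::black);
    child.set_parent(&root);
    return child.parent() == &root && child.color() == node_color::black;
}
```

The explicit color byte in `plain_node` costs a full pointer-sized slot once padding is taken into account, which is the saving the commit message refers to.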
Piotr Jastrzebski
a0c20f5c49 mutation_partition: make apply_reversibly_intrusive_set nongeneric
apply_reversibly_intrusive_set is used only in one place
and always with rows_type. There's no need for it to be generic.
This will allow changing intrusive set implementation.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Piotr Jastrzebski
4bbe05dd47 mutation_partition: take schema in find_row and clustered_row
This will allow an intrusive set implementation that does not
store the schema.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Piotr Jastrzebski
fe3c91db90 mutation_partition: Extract intrusive set logic to a class.
It will make it easier to change the implementation
of the intrusive set.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Piotr Jastrzebski
da67ac7ae4 mutation_partition: Replace value_comp with key_comp calls
This will reduce the size of bi::set API being used.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2017-01-05 11:26:03 +01:00
Paweł Dziepak
40176ca2f8 mutation_partition: use result limiter for digest reads
Even if we are performing a digest query we should do proper result
memory accounting so that the result ends exactly in the same place that
it would if it was a data query. This is to avoid digest mismatches
between replicas.
2016-12-22 17:16:23 +01:00
Paweł Dziepak
38ee69dee0 idl: allow writers to use any output stream
Original IDL generated code was hardcoded to always use bytes_ostream.
This patch makes the output stream a template parameter so that any
valid output stream can be used.
Unfortunately, making IDL writers generic requires updates in the code
that uses them. C++17 fixes this, as it is able to deduce the
parameter in most cases.
2016-12-22 13:35:04 +01:00
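The change described in 38ee69dee0 (writers templated on the output stream, with C++17 deducing the parameter) can be sketched as below. This is not the generated IDL code: `vector_ostream`, `counting_ostream`, `string_writer` and the helper functions are hypothetical stand-ins, and the only assumption about an output stream is that it offers `write(const char*, size_t)`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Two unrelated hypothetical output streams sharing the same write() interface.
struct vector_ostream {
    std::vector<char> data;
    void write(const char* p, std::size_t n) { data.insert(data.end(), p, p + n); }
};

struct counting_ostream {
    std::size_t size = 0;
    void write(const char*, std::size_t n) { size += n; }
};

// Hypothetical writer that is generic over the output stream type.
template <typename Output>
class string_writer {
    Output& _out;
public:
    explicit string_writer(Output& out) : _out(out) {}
    void write_string(const std::string& s) { _out.write(s.data(), s.size()); }
};

// Before C++17, callers spell out the parameter or use a helper like this one.
template <typename Output>
string_writer<Output> make_string_writer(Output& out) {
    return string_writer<Output>(out);
}

// Measures the serialized size without storing any bytes.
std::size_t measured_size(const std::string& s) {
    counting_ostream out;
#if __cplusplus >= 201703L
    string_writer w(out);              // C++17 class template argument deduction
#else
    auto w = make_string_writer(out);  // pre-C++17 fallback
#endif
    w.write_string(s);
    return out.size;
}

// Serializes into a byte buffer, naming the stream type explicitly.
std::string serialized(const std::string& s) {
    vector_ostream out;
    string_writer<vector_ostream> w(out);
    w.write_string(s);
    return std::string(out.data.begin(), out.data.end());
}
```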
Paweł Dziepak
1c7cade559 mutation_partition: honour allowed_short_read for static rows
2016-12-22 13:35:04 +01:00
Piotr Jastrzebski
3e502de153 mutation_partition: don't use unique_ptr to manage LSA objects
std::unique_ptr won't destroy them correctly.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <5b49bb25a962432a178fe75554dd010c3cdea41d.1482261888.git.piotr@scylladb.com>
2016-12-21 09:40:15 +01:00
Asias He
e5485f3ea6 Get rid of query::partition_range
Use dht::partition_range instead
2016-12-19 08:09:25 +08:00
Duarte Nunes
781cd82cb8 column_family: Use counters in query::result::builder
This patch changes column_family::query() to use the counters in the
builder to determine how many partitions and rows to ask for and also
to implement the stop condition. This saves a continuation to do the
bookkeeping, and allows us to remove data_query_result.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-15 10:27:46 +00:00
Duarte Nunes
05b2ef4fa2 query_result_builder: Use the underlying counters
This patch changes the query_result_builder to use the counters
provided by the query::result::builder. It also ensures they are kept
current.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-15 10:27:46 +00:00
Duarte Nunes
f5cf7f7921 mutation_partition: Count partitions in query_compacted
This patch changes mutation_partition::query_compacted() to count the
number of partitions written to the underlying writer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-15 10:27:46 +00:00
Duarte Nunes
f21dfb8217 mutation_partition: Remove tabs in query_compacted
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-15 10:27:46 +00:00
Paweł Dziepak
ba51e7e8db data_query: limit result size
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
f1b9f49f2b mutation_query: limit result size
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
6c33a4f177 db: create result_memory_accounters when starting query
This patch ensures that when we start executing a query, a minimum result
size is reserved from result_memory_limiter.

Moreover, range queries need a way of merging memory usage information
from different shards.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
34f9eb4cbd mutation_compactor: honour stop_iteration from consumers
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
43fe3439ca reconcilable_result: properly propagate short_read flag
reconcilable_result can be merged with another or transformed into
query::result. Make sure that short_read information is never lost.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
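One plausible reading of 43fe3439ca is that the short_read flag has to survive both merging and conversion. The sketch below shows that rule on toy types; `partial_result`, `final_result`, `merge_results` and `to_final` are hypothetical and do not mirror reconcilable_result or query::result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a mergeable intermediate result.
struct partial_result {
    std::vector<int> rows;
    bool short_read;
};

// Hypothetical stand-in for the final result handed to the caller.
struct final_result {
    std::size_t row_count;
    bool short_read;
};

// Merging keeps the flag set if either input was cut short.
partial_result merge_results(partial_result a, const partial_result& b) {
    a.rows.insert(a.rows.end(), b.rows.begin(), b.rows.end());
    a.short_read = a.short_read || b.short_read;
    return a;
}

// Conversion copies the flag instead of silently dropping it.
final_result to_final(const partial_result& r) {
    return final_result{r.rows.size(), r.short_read};
}
```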
Duarte Nunes
bdba8d99c3 range: Find a sequence's lower and upper bounds
This patch extracts a pair of functions from mutation_partition to
calculate the lower and upper bounds of a sequence from a
nonwrapping_range.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-11-21 11:15:04 +00:00
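The pair of functions mentioned in bdba8d99c3 can be pictured along these lines. This is a sketch over a sorted `std::vector<int>` with a simplified, always-bounded range; `simple_range`, `lower_bound_of`, `upper_bound_of` and `slice` are hypothetical names, and the real nonwrapping_range API is not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical simplified range: both bounds present, each inclusive or exclusive.
struct simple_range {
    int start;
    bool start_inclusive;
    int end;
    bool end_inclusive;
};

using seq = std::vector<int>;  // assumed to be sorted in ascending order

// Iterator to the first element that lies inside the range.
seq::const_iterator lower_bound_of(const seq& s, const simple_range& r) {
    return r.start_inclusive ? std::lower_bound(s.begin(), s.end(), r.start)
                             : std::upper_bound(s.begin(), s.end(), r.start);
}

// Iterator one past the last element that lies inside the range.
seq::const_iterator upper_bound_of(const seq& s, const simple_range& r) {
    return r.end_inclusive ? std::upper_bound(s.begin(), s.end(), r.end)
                           : std::lower_bound(s.begin(), s.end(), r.end);
}

// Usage example: copy out the elements selected by the range.
seq slice(const seq& s, const simple_range& r) {
    auto first = lower_bound_of(s, r);
    auto last = upper_bound_of(s, r);
    // A range whose start lies after its end selects nothing.
    return first < last ? seq(first, last) : seq();
}
```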
Paweł Dziepak
ef57b9a26f rename memory_usage() to external_memory_usage() where applicable
Renaming the function to external_memory_usage() makes it clear that
sizeof(T) is not included, something that was a source of confusion in
the past.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
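The naming convention from ef57b9a26f can be shown with a small example: external_memory_usage() reports only memory owned outside the object itself, so a caller wanting the full footprint adds sizeof(T). The type `row_like` and the helper `total_memory_usage` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical object that owns heap memory through a vector.
struct row_like {
    std::vector<long> cells;

    // Heap memory owned by this object; sizeof(row_like) is NOT included.
    std::size_t external_memory_usage() const {
        return cells.capacity() * sizeof(long);
    }
};

// Full footprint: the object itself plus whatever it owns externally.
template <typename T>
std::size_t total_memory_usage(const T& obj) {
    return sizeof(T) + obj.external_memory_usage();
}
```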
Paweł Dziepak
e981101fa9 Merge "Remove clustering_key_filtering_context" from Piotr
"clustering_key_filtering_context is no longer needed.
partition_slice can be used instead so this series removes
clustering_key_filtering_context and passes partition_slice down where
it's needed. Then a static get_ranges method is used to obtain
clustering key ranges for a given partition.

Fixes #1614."
2016-08-30 22:30:15 +01:00
Piotr Jastrzebski
3607d99269 Remove clustering_key_filtering_context.
Remove clustering_key_filter_factory and clustering_key_filtering_context.
Use partition_slice directly with a static get_ranges method.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-08-30 20:31:55 +02:00
Piotr Jastrzebski
b05b90b3a5 Introduce clustering_key_filter_ranges.
This fixes the problem of multiple concurrent get_ranges calls.
Previously each call was invalidating the result of the previous
call. Now they don't step on each other's toes.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-08-30 19:46:38 +02:00
Paweł Dziepak
6012a7e733 mutation_partition: fix iterator invalidation in trim_rows
Reversed iterators are adaptors for 'normal' iterators. These underlying
iterators point to different objects than the reversed iterators
themselves.

The consequence of this is that removing an element pointed to by a
reversed iterator may invalidate a reversed iterator which points to a
completely different object.

This is what happens in trim_rows for reversed queries. Erasing a row
can invalidate the end iterator, and the loop would then fail to stop.

The solution is to introduce the
reversal_traits::erase_dispose_and_update_end() function, which erases and
disposes of the object pointed to by a given iterator but also takes a
reference to an end iterator and updates it if necessary to make sure
that it stays valid.

Fixes #1609.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1472080609-11642-1-git-send-email-pdziepak@scylladb.com>
2016-08-25 16:52:35 +03:00
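The invalidation problem from 6012a7e733 can be reproduced on a plain `std::set`. The sketch below is an analogue of the idea behind erase_dispose_and_update_end(), not the Scylla function itself: it uses `std::set<int>` instead of an intrusive container and has no disposer. `erase_and_update_end` and `trim_down_to` are hypothetical names.

```cpp
#include <cassert>
#include <iterator>
#include <set>

using rows = std::set<int>;
using reverse_it = rows::reverse_iterator;

// Erases the element designated by `it` and returns the reverse iterator to
// continue from. If `end` was built on top of the erased node, it is re-seated
// so that it stays valid.
reverse_it erase_and_update_end(rows& c, reverse_it it, reverse_it& end) {
    // A reverse iterator designates the element just before its base().
    auto to_erase = std::prev(it.base());
    // `end` is invalidated exactly when its underlying iterator is the erased node.
    bool end_affected = (end.base() == to_erase);
    auto next = c.erase(to_erase);  // forward iterator following the erased element
    if (end_affected) {
        end = reverse_it(next);
    }
    return reverse_it(next);
}

// Usage example: walking backwards, erase every element greater than `last_kept`.
rows trim_down_to(rows c, int last_kept) {
    auto it = c.rbegin();
    // Designates the largest element <= last_kept, or rend() if there is none.
    auto end = reverse_it(c.upper_bound(last_kept));
    while (it != end) {
        it = erase_and_update_end(c, it, end);
    }
    return c;
}
```

In `trim_down_to`, the last erased element is always the node that `end` is built on, so without the update the loop condition would compare against an invalidated iterator.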
Duarte Nunes
5161ea283f query: query::clustering_range can't wrap around
This patch changes the type of query::clustering_range to express that
ranges that wrap around are not allowed, and ranges that have the
start bound after the end bound are considered empty.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-15 14:50:20 +00:00
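The semantic difference described in 5161ea283f can be illustrated on integer bounds. This is a simplified sketch with inclusive bounds only; `int_bounds`, `contains_wrapping` and `contains_nonwrapping` are hypothetical names, not the types used by query::clustering_range.

```cpp
#include <cassert>

// Hypothetical range with two inclusive integer bounds.
struct int_bounds {
    int start;
    int end;
};

// Wrapping interpretation: start > end means the range wraps around.
bool contains_wrapping(const int_bounds& r, int v) {
    if (r.start <= r.end) {
        return r.start <= v && v <= r.end;
    }
    return v >= r.start || v <= r.end;
}

// Non-wrapping interpretation: start > end simply means the range is empty.
bool contains_nonwrapping(const int_bounds& r, int v) {
    return r.start <= v && v <= r.end;
}
```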
Duarte Nunes
ec490ffaba query_result_builder: Don't count dead partitions
With this patch we stop counting dead partitions (i.e., partitions
containing only tombstones) towards the partition limit, which
should apply only to partitions with live rows.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
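The counting rule from ec490ffaba can be sketched as follows: only partitions with live rows consume the partition limit, so tombstone-only partitions do not reduce the number of live partitions a query returns. `partition_summary` and `live_partitions_returned` are hypothetical, and the sketch ignores everything else query_result_builder does.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-partition summary.
struct partition_summary {
    std::size_t live_rows;
    std::size_t tombstones;
};

// Number of live partitions returned before the partition limit is reached.
// Dead partitions (no live rows) are skipped without being counted.
std::size_t live_partitions_returned(const std::vector<partition_summary>& input,
                                     std::size_t partition_limit) {
    std::size_t live = 0;
    for (const auto& p : input) {
        if (live == partition_limit) {
            break;
        }
        if (p.live_rows > 0) {
            ++live;
        }
    }
    return live;
}
```

With the input dead, live, dead, live and a limit of 2, both live partitions are returned; counting dead partitions toward the limit would have stopped after the first live one.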
Duarte Nunes
21d0a2c764 query: Optionally send cell ttl
This patch adds support to send a cell's ttl as part of a query's
result. This is needed for thrift support.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-14 15:36:23 +02:00
Paweł Dziepak
93cc4454a6 streamed_mutation: emit range_tombstones directly
Originally, streamed_mutations guaranteed that emitted tombstones are
disjoint. In order to achieve that two separate objects were produced
for each range tombstone: range_tombstone_begin and range_tombstone_end.

Unfortunately, this forced sstable writer to accumulate all clustering
rows between range_tombstone_begin and range_tombstone_end.

However, since there is no need to write disjoint tombstones to sstables
(see #1153 "Write range tombstones to sstables like Cassandra does") it
is also not necessary for streamed_mutations to produce disjoint range
tombstones.

This patch changes that by making streamed_mutation produce
range_tombstone objects directly.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:51:18 +01:00
Tomasz Grabiec
8c4b5e4283 db: Avoid checking bloom filters during compaction
Checking bloom filters of sstables to compute max purgeable timestamp
for compaction is expensive in terms of CPU time. We can avoid
calculating it if we're not about to GC any tombstone.

This patch changes compacting functions to accept a function instead
of a ready value for max_purgeable.

I verified that bloom filter operations no longer appear on flame
graphs during a compaction-heavy workload (without tombstones).

Refs #1322.
2016-07-10 09:54:20 +02:00
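The approach in 8c4b5e4283 (passing a function instead of a precomputed max_purgeable) amounts to lazy evaluation. A minimal sketch, assuming a toy `cell` type and a purge rule of "tombstone older than max purgeable"; `compact_cells` and `expensive_calls_for` are hypothetical and much simpler than the real compaction code.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

using timestamp_type = std::int64_t;

// Hypothetical cell: either live data or a tombstone.
struct cell {
    timestamp_type timestamp;
    bool is_tombstone;
};

// Keeps live cells and drops tombstones older than the max purgeable timestamp.
// The (potentially expensive) callback runs at most once, and only if a
// tombstone is actually encountered.
std::vector<cell> compact_cells(const std::vector<cell>& in,
                                const std::function<timestamp_type()>& get_max_purgeable) {
    bool computed = false;
    timestamp_type max_purgeable = 0;
    std::vector<cell> out;
    for (const cell& c : in) {
        if (c.is_tombstone) {
            if (!computed) {
                max_purgeable = get_max_purgeable();
                computed = true;
            }
            if (c.timestamp < max_purgeable) {
                continue;  // garbage-collect this tombstone
            }
        }
        out.push_back(c);
    }
    return out;
}

// Usage example: counts how often the expensive computation was triggered.
int expensive_calls_for(const std::vector<cell>& in) {
    int calls = 0;
    compact_cells(in, [&calls] { ++calls; return timestamp_type(100); });
    return calls;
}
```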
Paweł Dziepak
23d0bfd065 mutation_partition: add row::memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-07 12:17:25 +01:00
Paweł Dziepak
7a95847014 mutation_compactor: prepare for sstable compaction
compact_mutation code is going to be shared among queries and sstable
compaction. There are some differences though. Queries don't provide
_max_purgeable and sstable compaction doesn't need any limits.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:01 +01:00
Paweł Dziepak
4133cc7a53 mutation_reader: make consume_flattened() produce decorated keys
Since decorated keys are already computed, it is better to pass more
information than less. Consumers interested only in the partition key can
drop the token, and those requiring the full decorated key don't need to
recompute it.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:39:00 +01:00
Paweł Dziepak
3e86f9ab73 mutation_partition: extract compact_for_query to a separate header
The compacting logic inside compact_for_query is going to be shared with
sstable compaction.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Paweł Dziepak
b70bf086b7 frozen_mutation: handle reversed streams properly
Freezing streamed_mutations assumed that mutation fragments are streamed
in the order they appear in the frozen mutation. That is not true for
reversed streams.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Message-Id: <1467277069-18702-1-git-send-email-pdziepak@scylladb.com>
2016-06-30 11:26:45 +02:00
Duarte Nunes
69798df95e query: Limit number of partitions returned
This is required to implement a thrift verb.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-22 09:48:13 +02:00
Duarte Nunes
594e43a60a compact_query: Rename partition_limit
This patch renames compact_query::_partition_limit to
_current_partition_limit for clarity, as the next patch adds
a partition limit that limits the number of partitions.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-22 09:47:29 +02:00
Duarte Nunes
e9ebd87991 compact_query: Rename limit to row_limit
This patch renames compact_query::_limit to _row_limit for
clarity, as a subsequent patch introduces yet another limit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-22 09:47:28 +02:00
Duarte Nunes
01b18063ea query: Add per-partition row limit
This patch adds a per-partition row limit. It ensures both local
queries and the reconciliation logic abide by this limit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-22 09:46:51 +02:00
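How a per-partition row limit might interact with the overall row limit, as introduced in 01b18063ea, can be sketched like this. The function `apply_limits` and its parameters are hypothetical; the real logic lives in the query and reconciliation code and handles far more than row counts.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// rows_per_partition[i] is the number of live rows available in partition i.
// Returns how many rows are taken from each visited partition, honouring both
// the overall row limit and the per-partition row limit.
std::vector<std::size_t> apply_limits(const std::vector<std::size_t>& rows_per_partition,
                                      std::size_t row_limit,
                                      std::size_t partition_row_limit) {
    std::vector<std::size_t> taken;
    std::size_t remaining = row_limit;
    for (std::size_t available : rows_per_partition) {
        if (remaining == 0) {
            break;
        }
        std::size_t n = std::min({available, partition_row_limit, remaining});
        taken.push_back(n);
        remaining -= n;
    }
    return taken;
}
```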
Paweł Dziepak
ed12c164f8 mutation_query: make mutation queries streaming-friendly
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:31:28 +01:00
Paweł Dziepak
0828c88b25 mutation_partition: implement streaming-friendly data_query()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:31:19 +01:00
Paweł Dziepak
67ae9457e3 mutation_partition: introduce mutation_querier
mutation_querier is a streamed_mutation consumer that adds the mutation
content to query::result.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:53 +01:00
Paweł Dziepak
f54e604a16 mutation_partition: introduce compact_for_query
compact_for_query is an intermediate stage used to compact data in a
flattened stream of mutations before they are consumed by query building
consumers.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:53 +01:00
Paweł Dziepak
f95c5542dc mutation_partition: allow slicing moved mutation_partition
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:51 +01:00
Paweł Dziepak
5a60f6d1ec range_tombstone: extract is_single_clustering_row_tombstone()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:50 +01:00
Paweł Dziepak
847bf878ec mutation_partition: add more row::apply() overloads
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:48 +01:00
Duarte Nunes
70083efee2 sstables: Read and write range tombstone bounds
This patch uses the composite_marker to add inclusiveness information
to the prefixes of a range tombstone.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00
Duarte Nunes
7628e403a3 sstables: Drop code for tombstone merging
Since Scylla now supports proper range tombstones, the code for
reading ranges from sstables and converting them to overlapping
tombstones is no longer necessary, and is, in fact, wasteful as
the internal representation converts overlapping tombstones back to
ranges.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00
Duarte Nunes
95594b8171 mutations: Encapsulate row tombstones difference
This patch moves the difference between two mutation_partition's
row_tombstones inside the range_tombstone_list.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00
Duarte Nunes
91aac30f12 mutations: Row tombstones are now a set of ranges
This patch changes the type of the mutation partition's row_tombstones
to be a range_tombstone_list, so that they are now represented as a
set of disjoint ranges. All of its usages are updated accordingly.

Fixes #1155

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:59 +02:00