The get_multi_slice verb is used to perform multiple slices on a
single row key in one operation. It takes a set of column_slices,
which we normalize to not contain any overlapping ranges.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch adds the deoverlap function to range.hh, which takes in a
vector of possibly overlapping ranges and returns a vector of
non-overlapping ranges covering the same values.
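
As an illustration of the idea only, a minimal sketch of de-overlapping
follows. It assumes a hypothetical closed integer interval type and a
hypothetical function name (deoverlap_sketch); the actual function in
range.hh operates on Scylla's own range type and may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical closed interval [start, end] over ints; not Scylla's range<T>.
struct interval {
    int start;
    int end;
};

// Returns sorted, non-overlapping intervals covering the same values as the
// input. Intervals that share at least one value are merged into one.
std::vector<interval> deoverlap_sketch(std::vector<interval> in) {
    std::sort(in.begin(), in.end(), [](const interval& a, const interval& b) {
        return a.start < b.start;
    });
    std::vector<interval> out;
    for (const interval& r : in) {
        assert(r.start <= r.end);
        if (!out.empty() && r.start <= out.back().end) {
            // Overlaps the last emitted interval: extend it if needed.
            out.back().end = std::max(out.back().end, r.end);
        } else {
            out.push_back(r);
        }
    }
    return out;
}
```

Sorting by start bound first means each input interval only has to be
compared against the last interval emitted so far.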
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The get_paged_slice verb is similar to the get_range_slices verb,
except that it doesn't take a SlicePredicate. Instead, it takes a
column from which to start the query.
For dynamic CFs, we use the partition_slice::specific_ranges to single
out the first partition, and query starting from the start_column row.
For static CFs, we issue an initial query to fetch the remainder of
columns from the first partition, and at least one more query to fetch
the subsequent columns until the limit is reached. This implies a
performance penalty for static CFs.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The get_range_slices verb is similar to the multiget_slice verb,
except that it operates on a range of partition keys (or tokens).
In origin, empty partitions are returned as part of the KeySlice, for
which the key will be filled in but the columns vector will be empty.
Since in our case we don't return empty partitions, we don't know which
partition keys in the specified range we should return back to the client.
So for now, our behavior differs from Origin.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch implements the multiget_count verb in a similar fashion to
multiget_slice, but using an accumulator that counts the returned
columns instead of creating thrift ColumnOrSuperColumn objects.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch builds a query::read_command from a SlicePredicate,
for both dynamic and static column families.
For dynamic CFs, restrictions on the clustering columns are added, and
for static CFs, limits and ordering are defined inline by selecting the
correct regular columns.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch adds support to send a cell's ttl as part of a query's
result. This is needed for thrift support.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch adds the is_dynamic() function to thrift_schema, which
tells whether the underlying column family is dynamic or not,
according to thrift rules.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
This patch adds support for composite comparators (which, for dynamic
column families, means composite clustering keys) and for composite
keys (composite partition keys).
Support for composite column names and regular columns is deferred,
since it will entail making compound_type an abstract_type.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
"This series replaces the original scylla-help.py.
It contains only a basic script that checks daily for a new version and
reports if a newer version was found.
The script is added as a service and will be started and shut down with
scylla-server."
Currently, for any column family, we create a directory for it in all
keyspace directories. This is incredibly awkward.
Fix by iterating over just the keyspace's column families, not all
column families in existence.
Fixes #1457.
Message-Id: <1468495182-18424-1-git-send-email-avi@scylladb.com>
The check version script uses the python requests package; this adds the
dependency to the ubuntu package.
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
Ubuntu 14.04 upstart does not support timers for recurrent operations.
The upstart cookbook suggests a way to mimic this functionality here:
http://upstart.ubuntu.com/cookbook/#run-a-job-periodically
This patch adds a service that runs the house-keeping daily.
Setting it up as a service ensures that it starts and stops with the
scylla-server service.
filter_for_query() receives a list of endpoints sorted by preference and
should preserve that order after filtering out non-local endpoints for a
local query. partition() does not guarantee this while stable_partition()
does, so use it instead.
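
The difference can be sketched as below. The endpoint struct and the
function name keep_local_in_order are hypothetical and only stand in for
the real endpoint type and filtering code; std::stable_partition itself
is the standard library algorithm, which keeps the relative order of
elements within each group.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical endpoint record; the real code works on Scylla's own
// endpoint addresses and locality checks.
struct endpoint {
    std::string name;
    bool local;
};

// Moves local endpoints to the front while keeping the incoming preference
// order inside each group, then drops the non-local tail.
std::vector<endpoint> keep_local_in_order(std::vector<endpoint> sorted_by_preference) {
    auto first_remote = std::stable_partition(
        sorted_by_preference.begin(), sorted_by_preference.end(),
        [](const endpoint& e) { return e.local; });
    sorted_by_preference.erase(first_remote, sorted_by_preference.end());
    return sorted_by_preference;
}
```

With std::partition in place of std::stable_partition, the surviving
local endpoints could come back in any order.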
Fixes #1450.
Message-Id: <20160713100909.GM10767@scylladb.com>
From Paweł:
This is another episode in the "convert X to streamed mutations" series.
Hashing mutations (mainly for repair) is converted so that it doesn't
need to rebuild the whole mutation.
The first part of the series changes the way streamed mutations deal
with range tombstones. Since it is not necessary to make sure we write
disjoint tombstones to sstables, there is no longer any need for streamed
mutations to produce disjoint tombstones and, consequently, no need for
range tombstones to be split into range_tombstone_begin and
range_tombstone_end.
The second part is the actual hashing implementation. However, to ensure
that the hash depends only on the contents of the mutation and not on the
way it is stored in different data sources, range tombstones have to be
made disjoint before they are hashed.
This series also ensures that any changes caused by streamed mutations
to hashing and streaming do not break repair during upgrade.
This patch makes hashing for repair calculate checksums in a way that
doesn't require rebuilding the whole mutation.
Unfortunately, such checksums are incompatible with the old ones, so the
old way of computing checksums is preserved for compatibility reasons.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
The receiving side needs to handle fragmented mutations properly so that
isolation guarantees are not broken. If the receiving node may be an old
one, do not fragment mutations.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
mutation_hasher is a consumer of streamed_mutation that feeds its data
to a specified hasher.
It is not compatible with hashing_partition_visitor.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
Originally, streamed_mutations guaranteed that emitted tombstones are
disjoint. In order to achieve that, two separate objects were produced
for each range tombstone: range_tombstone_begin and range_tombstone_end.
Unfortunately, this forced the sstable writer to accumulate all clustering
rows between range_tombstone_begin and range_tombstone_end.
However, since there is no need to write disjoint tombstones to sstables
(see #1153 "Write range tombstones to sstables like Cassandra does") it
is also not necessary for streamed_mutations to produce disjoint range
tombstones.
This patch changes that by making streamed_mutation produce
range_tombstone objects directly.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
range_tombstone::flip() flips range bounds. This is necessary in order
to use range tombstones in reversed mutation fragment streams.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
range_tombstone_accumulator is a helper class that allows determining the
tombstone for a clustering row when range tombstones and clustering rows
are streamed from a streamed_mutation.
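
A rough sketch of how such a helper could work is given below. It is a
simplified model, not the actual class: positions are plain integers,
tombstones cover half-open ranges [start, end), a tombstone is reduced to
its timestamp, and all names (rt, tombstone_accumulator_sketch,
no_tombstone) are hypothetical. It assumes tombstones and rows arrive in
position order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical simplified range tombstone covering [start, end).
struct rt {
    int start;
    int end;
    std::int64_t timestamp;
};

// Sentinel meaning "no tombstone covers this row" (an assumption of the sketch).
const std::int64_t no_tombstone = std::numeric_limits<std::int64_t>::min();

class tombstone_accumulator_sketch {
    std::vector<rt> _active;
public:
    // Called for every range tombstone seen in the stream.
    void apply(const rt& t) {
        assert(t.start < t.end);
        _active.push_back(t);
    }

    // Called for every clustering row, with non-decreasing positions.
    // Returns the newest timestamp among tombstones covering `pos`.
    std::int64_t tombstone_for_row(int pos) {
        // Tombstones ending at or before `pos` cannot cover this or any
        // later row, so they are dropped.
        _active.erase(std::remove_if(_active.begin(), _active.end(),
                                     [pos](const rt& t) { return t.end <= pos; }),
                      _active.end());
        std::int64_t newest = no_tombstone;
        for (const rt& t : _active) {
            if (t.start <= pos) {
                newest = std::max(newest, t.timestamp);
            }
        }
        return newest;
    }
};
```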
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
range_tombstone::apply() allows merging two, possibly overlapping, range
tombstones with the same start bound and produces one or two disjoint
range tombstones as a result.
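
The "one or two" outcome can be illustrated with the simplified model
below. It is only a sketch: positions are integers, ranges are half-open
[start, end), the tombstone with the larger timestamp wins, and the names
rt and merge_same_start are hypothetical rather than the real
range_tombstone interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical simplified range tombstone covering [start, end).
struct rt {
    int start;
    int end;
    std::int64_t timestamp;
};

// Merges two tombstones that share a start bound into one or two
// disjoint tombstones.
std::vector<rt> merge_same_start(rt a, rt b) {
    assert(a.start == b.start);
    assert(a.start < a.end && b.start < b.end);
    // Make `a` the tombstone that ends first (or at the same position).
    if (b.end < a.end) {
        std::swap(a, b);
    }
    if (a.end == b.end || b.timestamp >= a.timestamp) {
        // Either both cover the same range, or the longer one is at least
        // as new as the shorter one: a single tombstone is enough.
        return {rt{a.start, b.end, std::max(a.timestamp, b.timestamp)}};
    }
    // The shorter tombstone is newer: it wins on its own range, and the
    // longer one survives only on the remainder.
    return {rt{a.start, a.end, a.timestamp}, rt{a.end, b.end, b.timestamp}};
}
```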
It is intended to be used for merging tombstones coming from different
sources.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
The sstable parsing code calls mp_row_consumer::flush() after every
clustering row has been read, and this puts the now complete row in a single
field "_ready". The assumption is that at this point parsing will stop, the
consumer will move out this _ready (mp_row_consumer::get_mutation_fragment())
and when flush() is later called again, _ready will be empty again.
This assumption is correct in our code, but is based on an intricate
combination of esoteric parts of the code, such as:
1. In data_consume_row_context we stop parsing after reading the partition's
header, before reading any clustering rows, giving the caller the chance
to call sstable_streamed_mutation::read_next() to be prepared for the
incoming mutations.
2. In mp_row_consumer::flush_if_needed(), we stop the parser after each
individual clustering row.
It is easy to break this assumption, and I did this in one of my code changes,
and the result was silent loss of clustering rows, as "_ready" got silently
overwritten before the reader had a chance to move it out.
What this patch does is to add an assertion: If a clustering row is silently
lost before being transferred to the mutation fragment reader, we croak.
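
The kind of check described here can be sketched with a one-slot
hand-off like the one below. The class ready_slot and its member names
are hypothetical and a row is modelled as a plain string; the actual
patch adds its assertion inside mp_row_consumer.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical one-slot hand-off between a parser (producer) and a
// reader (consumer).
class ready_slot {
    std::string _ready;
    bool _has_value = false;
public:
    // Producer side: called when a row has been fully parsed.
    void flush(std::string row) {
        // If the previous row was never taken, overwriting it would lose
        // data silently, so fail loudly instead.
        assert(!_has_value && "row overwritten before being consumed");
        _ready = std::move(row);
        _has_value = true;
    }

    bool has_value() const { return _has_value; }

    // Consumer side: moves the pending row out and empties the slot.
    std::string take() {
        assert(_has_value);
        _has_value = false;
        return std::move(_ready);
    }
};
```

Calling flush() twice without an intervening take() trips the assertion
in a build with assertions enabled, instead of dropping the first row.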
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1468389955-24600-1-git-send-email-nyh@scylladb.com>