This patch fixes a regression introduced in
f81329be60, which made keys compound by
default when using a particular ctor, in turn leading to mismatches
when comparing the same key built with functions that properly
consider compoundness.
As a temporary fix, the sstable::key and sstable::key_view classes
store raw bytes instead of a composite.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468339295-3924-1-git-send-email-duarte@scylladb.com>
The sstables::key class now delegates much of its functionality
to the composite class. All existing behavior is preserved.
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
The check_marker() function is use as a sanity-check of data we read
from sstable, so instead of the header file key.hh, let's move it to
the sstable-parsing source file partition.cc.
In addition to having less code in header files, another benefit is
that the function can now throw a more specific exception (malformed
sstable exception).
Also fixed the exception's message (which had a second "%d" but only
one parameter).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <1459420430-5968-1-git-send-email-nyh@scylladb.com>
Until recently, we believed that range tombstones we read from sstables will
always be for entire rows (or more generalized clustering-key prefixes),
not for arbitrary ranges. But as we found out, because Cassandra insists
that range tombstones do not overlap, it may take two overlapping row
tombstones and convert them into three range tombstones which look like
general ranges (see the patch for a more detailed example).
Not only do we need to accept such "split" range tombstones, we also need
to convert them back to our internal representation which, in the above
example, involves two overlapping tombstones. This is what this patch does.
This patch also contains a test for this case: We created in Cassandra
an sstable with two overlapping deletions, and verify that when we read
it to Scylla, we get these two overlapping deletions - despite the
sstable file actually having contained three non-overlapping tombstones.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <b7c07466074bf0db6457323af8622bb5210bb86a.1459399004.git.glauber@scylladb.com>
As Nadav noticed in his bug report, check_marker is creating its error messages
using characters instead of numbers - which is what we intended here in the
first place.
That happens because sprint(), when faced with an 8-byte type, interprets this
as a character. To avoid that we'll use uint16_t types, taking care not to
sign-extend them.
The bug also noted that one of the error messages is missing a parameter, and
that is also fixed.
Fixes#1122
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <74f825bbff8488ffeb1911e626db51eed88629b1.1459266115.git.glauber@scylladb.com>
Using a partition_key_view can save an allocation in some cases. We will
make use of it when we linearize a partition_key; during the process we
are given a simple byte pointer, and constructing a partition_key from that
requires an allocation.
Schemas using compact storage can have clustering keys with the trailing
components not set and effectively being a clustering key prefixes
instead of full clustering keys.
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
We use boost::any to convert to and from database values (stored in
serlialized form) and native C++ values. boost::any captures information
about the data type (how to copy/move/delete etc.) and stores it inside
the boost::any instance. We later retrieve the real value using
boost::any_cast.
However, data_value (which has a boost::any member) already has type
information as a data_type instance. By teaching data_type intances about
the corresponding native type, we can elimiante the use of boost::any.
While boost::any is evil and eliminating it improves efficiency somewhat,
the real goal is growing native type support in data_type. We will use that
later to store native types in the cache, enabling O(log n) access to
collections, O(1) access to tuples, and more efficient large blob support.
Some version of Origin will write 0 instead of -1 as the start of range marker
for a range tombstone. I've just came across one of such tables, that ended up
breaking our code. Let's be more flexible in what we accept. We don't really have
a choice.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
We always return a future, but with the threaded writer, we can get rid of
that. So while reads will still return a future, the writer will be able to
return void.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
We can insert markers in the end of composites, which can be used to identify
the presence of ranges in a column.
One option, would be to change all methods in sstables/key.hh to take an
optional marker parameter, and append that as the last marker.
But because we are talking about a single byte, and always added to the end,
it's a lot easier to allow the composite to be created normally, and then
replace the last byte with the marker.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
This way, we don't need to rely on an external byte_view conversion to write
this element. Note that because we don't have a writer to const byte&, we will
cast away the qualifier.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Encapsulating it into a composite class is a more natural way to handle this,
and will allow for extending that class later.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Composites have an extra byte at the end of each element. The middle bytes
are expected to be zero, but the last one may not always be. In cases like
those of a range tombstone, we will have a marker with special significance.
Since we are exploding the composite into its components, we don't actually
need to retrieve it. We merely need to make sure that the marker is the one
we expect.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Read mutations from sstables, from Glauber.
Conflicts:
sstables/key.cc
sstables/key.hh
sstables/sstables.cc
sstables/sstables.hh
[avi: adjust sstables/partition.cc:149 for ambiguity due to new
basic_sstring range constructor]
from_components() is made more generic so as to be re-used for
the creation of a composite from a clustering_key.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Reviewed-by: Nadav Har'El <nyh@cloudius-systems.com>
We have our own representation of a partition_key, clustering_key, etc. They
may different slightly from a legacy sstable key because we don't necessarily
serialize composites in our internal representation the same way as Origin
does. This patch encodes the Origin composite serialization, so we can create
keys that are compatible with Origin's understanding of what a partition key
should look like.
This whould be used when serializing or deserializing to/from an sstable.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>