Two ring_positions are equal if their tokens and keys are equal, or if their
tokens are equal and one or both of them do not specify a key. So a
ring_position without a key is a wildcard that equals any ring_position with
the same token.
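A minimal sketch of the equality rule described above. The struct layout,
the use of std::optional for "no key", and the function name equal() are
assumptions made for illustration, not the actual ring_position code:

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in: the real token and key types are richer.
struct ring_position {
    int64_t token;
    std::optional<std::string> key;  // empty optional: no key specified
};

// Equal when the tokens match and either both keys match or at least one
// side leaves the key unspecified (the wildcard case).
inline bool equal(const ring_position& a, const ring_position& b) {
    if (a.token != b.token) {
        return false;
    }
    if (!a.key || !b.key) {
        return true;  // a keyless position matches any key with this token
    }
    return *a.key == *b.key;
}
```

Note that a relation defined this way is not transitive: two positions with
the same token but different keys both equal the keyless position, yet do
not equal each other.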
Some of the tests in DTEST take advantage of the fact that
ByteOrderedPartitioner guarantees a certain ordering of partition keys.
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
We need to be able to do it so we can, among other things, create CQL
statements that include the current state of the tokens.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
It is a better fit for things that are names, not blobs. We have a user that expects
a bytes parameter, but that is for no other reason than the fact that the field used
to be of bytes type.
Let's fix that, and future users will be able to use sstrings.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
midpoint(l, r) where l > r needs to wrap around the end of the ring. Adjust
the midpoint() function to do this.
Note this is still broken for the murmur3 partitioner, since it doesn't treat
tokens as unsigned.
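A rough sketch of a wrap-around midpoint, assuming for illustration that a
token is a single unsigned 64-bit position on a ring of 2^64 slots (the
name ring_midpoint() is hypothetical). It also shows why unsigned treatment
matters: the wrap relies on modulo-2^64 arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Midpoint on a ring of 2^64 positions, walking clockwise from l to r.
// Unsigned subtraction wraps modulo 2^64, so l > r needs no special case.
inline uint64_t ring_midpoint(uint64_t l, uint64_t r) {
    uint64_t distance = r - l;   // clockwise distance; wraps when l > r
    return l + distance / 2;     // may itself wrap past the end of the ring
}
```

For example, with l two positions before the end of the ring and r == 2,
the clockwise distance is 4 and the midpoint lands on position 0.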
In sstables the partitioner name is stored for validation. To allow Origin
to process our files we need to comply with Origin's partitioner name, or
else Origin's SSTableReader::open fails on the partitioner comparison.
We do not care about the order of the tokens.
Also, in token_metadata, we use unordered_set for tokens as well, e.g.
update_normal_tokens. Unify the usage.
Validation metadata stores the partitioner name and the bloom filter chance.
Cassandra gets the partitioner name by getting an object of the class
itself and getting its canonical name.
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Use decorated_key in partition maps, from Tomasz:
"Partitions should be ordered using Origin's ordering, which is the natural
ordering of decorated_key. This is achieved by switching column_family's
partition map to use decorated_key instead of a bare partition_key.
This also includes some cleanups."
[avi] trivial adjustments to sstables/keys.cc
It has been determined that we will store partition_keys in the decorated_keys.
That is totally fine, but the token needs to be generatable from an sstable::key
as well.
Since both types convert well to a bytes_view, and the first thing get_token() does
is precisely to generate that view, let's generate the token from a bytes_view instead.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
bytes and sstring are distinct types, since their internal buffers are of
different length, but bytes_view is an alias of sstring_view, which makes
it possible for objects of different types to leak across the abstraction
boundary.
Fix this by making bytes a basic_sstring<int8_t, ...> instead of using char.
int8_t is a 'signed char', which is a distinct type from char, so now
bytes_view is a distinct type from sstring_view.
uint8_t would have been an even better choice, but that diverges from Origin
and would have required an audit.
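To illustrate the type distinction, here is a sketch using a hypothetical
minimal view template (basic_view and name_length() are made up for this
example; they are not the real sstring or view classes). It assumes int8_t
is 'signed char', as stated above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical minimal view template standing in for the real view types.
template <typename CharT>
struct basic_view {
    const CharT* data;
    std::size_t size;
};

using sstring_view = basic_view<char>;
using bytes_view   = basic_view<int8_t>;

// char and signed char are distinct types, so the two views are too.
static_assert(!std::is_same_v<char, signed char>);
static_assert(!std::is_same_v<sstring_view, bytes_view>);

// Accepts only the string flavour; passing a bytes_view does not compile.
inline std::size_t name_length(sstring_view v) {
    return v.size;
}
```

With both views built on char, the second static_assert would fail and
name_length() would silently accept either kind of object.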
Inspired by Gleb's previous patch, this patch adds a hash and comparison
operator for dht::token.
The previous patch, however, had a number of problems. Comparisons were failing
in tokens that were verified (by me) to be equal to the ones Origin was
generating.
The main reason for that was that the byte-comparison loop must be unsigned,
not signed.
With the above change, the comparison function would always succeed *except*
when the integer value of _data, read as a signed integer, was negative.
Looking at Origin, one verifies that the Murmur3Partitioner class overrides the
comparison functions, and just does a Long comparison with the token.
This patch implements a similar mechanism. With that, a list of tokens
generated by Origin in ascending order is verified by us to also be in
ascending order.
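The difference between the two orderings can be sketched as follows. The
helper names and the use of std::vector<int8_t> are illustrative only, and
the big-endian layout of the 8-byte token is an assumption of this sketch:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

using token_data = std::vector<int8_t>;

// Lexicographic comparison that treats every byte as unsigned.
inline int compare_unsigned(const token_data& a, const token_data& b) {
    std::size_t n = std::min(a.size(), b.size());
    for (std::size_t i = 0; i < n; ++i) {
        auto x = static_cast<uint8_t>(a[i]);
        auto y = static_cast<uint8_t>(b[i]);
        if (x != y) {
            return x < y ? -1 : 1;
        }
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Reads 8 bytes (assumed big-endian here) as one signed 64-bit value.
inline int64_t as_long(const token_data& t) {
    assert(t.size() == 8);
    uint64_t v = 0;
    for (int8_t b : t) {
        v = (v << 8) | static_cast<uint8_t>(b);
    }
    int64_t out;
    std::memcpy(&out, &v, sizeof out);  // reinterpret the bit pattern
    return out;
}

// Ordering by the signed long value, in the spirit of a Long comparison.
inline int compare_long(const token_data& a, const token_data& b) {
    int64_t l = as_long(a);
    int64_t r = as_long(b);
    return l < r ? -1 : (l > r ? 1 : 0);
}
```

A token whose first byte is 0x80 sorts after one starting with 0x01 under
the unsigned byte comparison, but before it under the long comparison,
because its value as a signed long is negative.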
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
Holding keys and their prefixes as "bytes" is error prone. It's easy
to mix them up (or use wrong types). This change adds wrappers for
keys with accessors which are meant to make misuses as difficult as
possible.
Prefix and full keys are now distinguished. Places which assumed that
the representation is the same (it currently is) were changed not to
do so. This will allow us to introduce more compact storage for non-prefix
keys.
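A sketch of what such wrappers could look like. The class names, the
std::string storage, and the helper is_prefix_of() are all hypothetical and
chosen only to show the idea of keeping prefixes and full keys apart:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical wrapper for a key prefix; construction is explicit so raw
// bytes cannot be passed by accident where a prefix is expected.
class key_prefix {
    std::string _bytes;
public:
    explicit key_prefix(std::string b) : _bytes(std::move(b)) {}
    const std::string& representation() const { return _bytes; }
};

// Hypothetical wrapper for a full key. It is a separate type even though
// the representation is currently the same as for a prefix.
class full_key {
    std::string _bytes;
public:
    explicit full_key(std::string b) : _bytes(std::move(b)) {}
    const std::string& representation() const { return _bytes; }
    // Explicit conversion; callers no longer assume the layouts match.
    key_prefix as_prefix() const { return key_prefix(_bytes); }
};

// Takes one of each, so swapping the arguments is a compile error.
inline bool is_prefix_of(const key_prefix& p, const full_key& k) {
    const std::string& a = p.representation();
    const std::string& b = k.representation();
    return b.compare(0, a.size(), a) == 0;
}
```

Because the conversion goes through as_prefix(), the full key's storage
could later change without touching callers.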
C++ doesn't define overflow on signed types, so use unsigned types instead.
Luckily all right shifts were unsigned anyway.
Some sign extension was happening (when handling the remainder after
processing 8-byte chunks), and it should still be there.
Caught by debug build.
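A small sketch of the pattern, with hypothetical helper names and an
arbitrary odd multiplier that is not taken from any partitioner: the
arithmetic is done on uint64_t, where wrap-around is well defined, while a
tail byte is still sign-extended before being mixed in.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left on an unsigned value; no sign bit is involved in the shifts.
inline uint64_t rotl64(uint64_t v, unsigned n) {
    n &= 63;
    return n == 0 ? v : (v << n) | (v >> (64 - n));
}

// One illustrative mixing step. Unsigned multiplication wraps modulo 2^64,
// whereas the same overflow on int64_t would be undefined behaviour.
inline uint64_t mix(uint64_t k) {
    k *= 0x9e3779b97f4a7c15ULL;  // arbitrary odd constant for this sketch
    return rotl64(k, 31);
}

// A tail byte is widened as a signed value first (sign extension), and
// only then converted to the unsigned type used for the arithmetic.
inline uint64_t tail_byte(int8_t b) {
    return static_cast<uint64_t>(static_cast<int64_t>(b));
}
```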
We don't follow Origin precisely in normalizing the token (converting a
zero to something else). We probably should, to allow direct import of
a database.
Rather than converting to unsigned longs for the fractional computations,
do them in bytes. The overhead of allocating longs will be larger than
the computation, given that tokens are usually short (8 bytes), and
our bytes type stores them inline.
Origin uses abstract types for Token, for two reasons:
1. To create a distinction between tokens for keys and tokens
that represent the end of the range
2. To use different implementations for tokens belonging to different
partitioners.
Using abstract types carries a penalty: indirection, more complex
memory management, and lower performance. We can eliminate it by using
a concrete type, and defer any differences in the implementation
to the partitioner. End-of-range token representation is folded into
the token class.
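A sketch of the concrete-type approach. The enumerator names, the field
names, and passing the partitioner's ordering as a plain function pointer
are assumptions made for illustration:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// End-of-range markers live in the same concrete type as key tokens.
enum class token_kind { before_all_keys, key, after_all_keys };

struct token {
    token_kind kind;
    std::vector<uint8_t> data;  // meaningful only when kind == key
};

// Stand-in for the partitioner-specific ordering of key tokens.
using data_less = bool (*)(const std::vector<uint8_t>&,
                           const std::vector<uint8_t>&);

// Example ordering: plain lexicographic comparison of unsigned bytes.
inline bool byte_ordered_less(const std::vector<uint8_t>& a,
                              const std::vector<uint8_t>& b) {
    return a < b;
}

// The kind decides first; only two key tokens are handed to the
// partitioner-specific comparison.
inline bool token_less(const token& a, const token& b, data_less less) {
    if (a.kind != b.kind) {
        return a.kind < b.kind;  // relies on the enumerator order above
    }
    if (a.kind != token_kind::key) {
        return false;            // two identical range markers are equal
    }
    return less(a.data, b.data);
}
```

The token itself holds no virtual functions and needs no heap-allocated
polymorphic object; whatever differs between partitioners stays behind the
comparison that is passed in.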