Commit Graph

96 Commits

Author SHA1 Message Date
Raphael S. Carvalho
936575efd2 dht: introduce comparator for token
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-10-15 19:34:50 -03:00
Avi Kivity
c1cfec3e5a dht: mark boot_strapper logger static
Or it violates the ODR and causes link errors.
2015-10-13 16:42:42 +03:00
Asias He
cf9d9e2ced boot_strapper: Enable range_streamer code in bootstrap
Add code to actually stream data from other nodes during bootstrap.

I tested with the following:

   1) start node 1

   2) insert data into node 1

   3) start node 2

I can see from the logger that data is streamed correctly from node 1
to node 2.
2015-10-13 15:54:18 +08:00
Asias He
0d1e5c3961 boot_strapper: Add debug info for get_bootstrap_tokens
Print the token generated for the node that is bootstrapping.
2015-10-13 15:48:53 +08:00
Asias He
dc02d76aee range_streamer: Implement fetch_async
It is used by boot_strapper::bootstrap() in the bootstrap process to start
the streaming.
2015-10-13 15:45:56 +08:00
Asias He
887c0a36ec range_streamer: Implement add_ranges
It is used by boot_strapper::bootstrap() in the bootstrap process.
2015-10-13 15:45:56 +08:00
Asias He
1c1f9bed09 range_streamer: Implement use_strict_sources_for_ranges 2015-10-13 15:45:55 +08:00
Asias He
43d4e62b5a range_streamer: Implement add_source_filter 2015-10-13 15:45:55 +08:00
Asias He
d47ea88aa8 range_streamer: Implement get_all_ranges_with_strict_sources_for 2015-10-13 15:45:55 +08:00
Asias He
84de936e43 range_streamer: Implement get_all_ranges_with_sources_for 2015-10-13 15:45:55 +08:00
Asias He
944e28cd6c range_streamer: Implement get_range_fetch_map 2015-10-13 15:45:55 +08:00
Asias He
d986a4d875 range_streamer: Add constructor 2015-10-13 15:45:55 +08:00
Asias He
1d6c081766 range_streamer: Add i_source_filter and failure_detector_source_filter
It is used to filter out unwanted nodes.
2015-10-13 15:45:55 +08:00
Asias He
c8b9a6fa06 dht: Convert RangeStreamer to C++ 2015-10-13 15:45:55 +08:00
Asias He
b95521194e dht: Import dht/RangeStreamer 2015-10-13 15:45:55 +08:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Amnon Heiman
5e03524d9d murmur3_partitioner: add describe_ownership from origin
The code was taken from origin, with uint64_t used instead of BigInteger.

The function returns the fraction of the ring that each token is responsible
for; the fractions sum to roughly 1.

Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
2015-08-25 19:39:08 +03:00
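The ownership computation described in this commit can be illustrated with a minimal sketch. It assumes that each token owns the range from the previous token on the ring up to itself, and that tokens are distinct 64-bit signed values; the function name describe_ownership_sketch and its signature are hypothetical and not taken from the commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical sketch, not the code from the commit.
// Assumption: every token owns the ring segment (previous token, token],
// and the ring spans the full 2^64 range of a 64-bit token.
inline std::map<int64_t, double> describe_ownership_sketch(std::vector<int64_t> tokens) {
    std::map<int64_t, double> ownership;
    if (tokens.empty()) {
        return ownership;
    }
    std::sort(tokens.begin(), tokens.end());
    if (tokens.size() == 1) {
        ownership[tokens[0]] = 1.0;   // a single token owns the whole ring
        return ownership;
    }
    const double ring_size = 18446744073709551616.0;   // 2^64
    int64_t prev = tokens.back();   // the first token wraps around from the last
    for (int64_t t : tokens) {
        // Unsigned subtraction wraps modulo 2^64, which handles the
        // wrap-around segment without any BigInteger arithmetic.
        uint64_t width = static_cast<uint64_t>(t) - static_cast<uint64_t>(prev);
        ownership[t] = static_cast<double>(width) / ring_size;
        prev = t;
    }
    return ownership;
}
```

Because the segment widths are converted to double, the fractions may not add up to exactly 1 for arbitrary tokens, which matches the "roughly 1" wording above.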
Asias He
126fc5869c dht/boot_strapper: Move code to source file
get_bootstrap_tokens and get_random_tokens are moved.
2015-08-24 18:54:42 +08:00
Asias He
26861ddc29 dht/boot_strapper: Use unordered_set for tokens
unordered_set is used everywhere for tokens. This makes it easier to
construct a boot_strapper object in storage_service::bootstrap where
unordered_set is used for tokens.
2015-08-24 18:54:42 +08:00
Asias He
2ebd08cb42 dht/boot_strapper: Partially implement bootstrap 2015-08-24 18:54:42 +08:00
Glauber Costa
229ce6cd85 dht: provide a from_sstring method
Only the partitioner knows how to convert a token to a sstring. Conversely,
only the partitioner can know how to convert it back.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 11:03:35 -07:00
Glauber Costa
5f807784bf dht: fix to_sstring methods to account for min tokens
Right now, we are converting the _data part of the token to a sstring, which
may be later stored somewhere - in a system sstable, for instance. Later on,
we will have to get it back, but the way the code currently stands, we will get
undefined results for min and max tokens, since they have the _data field
empty.

For murmur3, strictly speaking, the correct solution would be to change
long_token to account for that. However, when we compare values, we already do
kind comparisons explicitly. Inserting them there would only make that
operation branchier and therefore costlier, which we don't want for such a
common operation.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 10:23:19 -07:00
Glauber Costa
6fcbb3570e murmur3 partitioner: explicitly use int64_t instead of long
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-17 10:19:52 -07:00
Paweł Dziepak
d06e450616 dht: add i_partitioner::token_to_bytes()
This allows token::_data to be in a different representation
than the one expected by the token type.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-14 16:12:30 +02:00
Paweł Dziepak
faa588cb0a dht: murmur3_partitioner: implement get_token_validator()
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-08-14 15:58:39 +02:00
Glauber Costa
e1968c389e dht: use tri_compare for token comparisons
Loading data from memory tends to be the most expensive part of the comparison
operations. Because we don't have a tri_compare function for tokens, we end up
having to do an equality test, which will load the token's data in memory, and
then, because all we know is that they are not equal, we need to do another
one.

Having two dereferences is harmful, and shows up in my simple benchmark. This
is because before writing to sstables, we must order the keys in decorated key
order, which is heavy on the comparisons.

The proposed change speeds up index write benchmark by 8.6%:

Before:
41458.14 +- 1.49 partitions / sec (30 runs)

After:
45020.81 +- 3.60 partitions / sec (30 runs)

Parameters:
--smp 6 --partitions 500000

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-12 09:23:42 -05:00
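The idea behind this change can be sketched as follows. The token layout below (a kind tag plus a byte payload) and all names are assumptions made for illustration, not the project's actual definitions. The point is that a single three-way comparison reads each payload once, and both "less" and "equal" are derived from its result.

```cpp
#include <string>

// Hypothetical stand-in for a token: a kind tag plus an opaque payload.
enum class token_kind { before_all_keys, key, after_all_keys };

struct token {
    token_kind kind;
    std::string data;   // reading this payload is the expensive part
};

// Three-way comparison: negative, zero, or positive.
// Each token's payload is inspected at most once.
inline int tri_compare(const token& a, const token& b) {
    if (a.kind != b.kind) {
        return a.kind < b.kind ? -1 : 1;
    }
    if (a.kind != token_kind::key) {
        return 0;   // min and max tokens carry no payload to compare
    }
    return a.data.compare(b.data);
}

// Both predicates reuse the single comparison instead of running an
// equality test followed by a separate ordering test.
inline bool token_less(const token& a, const token& b)  { return tri_compare(a, b) < 0; }
inline bool token_equal(const token& a, const token& b) { return tri_compare(a, b) == 0; }
```

A sort over decorated keys would call tri_compare once per pair, rather than an equality test followed by a second comparison.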
Glauber Costa
3426b3ecc1 bootstrap tokens: get tokens from config file
Aside from being the obviously correct thing to do, not having this will force us
to manually adjust num_tokens when running our sstables into Cassandra.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 11:10:56 -05:00
Glauber Costa
2678b0e606 dht: change get_bootstrap_tokens()'s signature
It needs to access the non-existent "DatabaseDescriptor". Do as we have been doing,
and just pass the database object instead.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-08-07 11:10:56 -05:00
Raphael S. Carvalho
915b05c956 dht: fix load of misaligned address
Error reported by debug mode when running the sstable test.
The solution is to use an unaligned cast.

dht/murmur3_partitioner.cc:67:25: runtime error: load of misaligned
address 0x6030000478fc for type 'const long int', which requires 8
byte alignment

Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-08-06 18:28:38 +03:00
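For readers unfamiliar with this class of error, here is a minimal sketch of a safe unaligned read. It uses std::memcpy as a portable substitute; it is not the unaligned cast helper that the commit refers to, and the function name read_unaligned_int64 is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 64-bit integer from an address that may not
// be 8-byte aligned. Dereferencing a misaligned int64_t pointer directly is
// undefined behavior, which is what the sanitizer report above flags.
inline int64_t read_unaligned_int64(const void* p) {
    int64_t value;
    std::memcpy(&value, p, sizeof(value));   // byte-wise copy, no alignment requirement
    return value;
}
```

Compilers typically turn the memcpy into a single load on architectures that allow unaligned access, so this form usually costs nothing extra.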
Tomasz Grabiec
c4acdb2068 db: Switch from bytes to managed_bytes for storing data
We need a container which can be used with compacting
allocators. "bytes" can't be used with compacting allocator because it
can't handle its external storage being moved.
2015-08-06 14:05:16 +02:00
Tomasz Grabiec
593fa725e9 dht: Relax dependencies on bytes const&
In preparation for switching to a bytes container which is not "bytes",
switch to bytes_view.
2015-08-06 14:05:16 +02:00
Avi Kivity
f915ff1fcd dht: introduce i_partitioner::shard_of() and implement msb sharding
Make sharding partitioner-specific, since different partitioners interpret
the byte content differently.

Implement it by extracting the shard from the most significant bits, which
can be used to minimize cross-shard traffic for range queries and to reduce
sstable sharing.
2015-08-03 20:17:40 +03:00
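A minimal sketch of sharding by most significant bits follows. It assumes a 64-bit signed token, as the murmur3 partitioner uses, and a shard count that fits in 32 bits; the names bias_token and shard_of_msb are hypothetical, and the arithmetic is one possible way to do it rather than the code from the commit.

```cpp
#include <cstdint>

// Hypothetical helper: map a signed token onto an unsigned scale so that
// the smallest token becomes 0 and ordering is preserved.
inline uint64_t bias_token(int64_t token) {
    return static_cast<uint64_t>(token) ^ (uint64_t(1) << 63);
}

// Hypothetical helper: pick a shard from the top 32 bits of the biased
// token. Scaling the high bits by shard_count gives each shard one
// contiguous slice of the token range, so a token range query touches
// few shards.
inline unsigned shard_of_msb(int64_t token, unsigned shard_count) {
    uint64_t high = bias_token(token) >> 32;            // top 32 bits
    return static_cast<unsigned>((high * shard_count) >> 32);
}
```

Because the mapping is monotonic in the token, neighboring tokens land on the same or adjacent shards, unlike a modulo over the low bits.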
Tomasz Grabiec
e5feff5d71 dht: ring_position: Switch to total ordering
range::is_wrap_around() and range::contains() rely on total ordering
on values to work properly. Current ring_position_comparator was only
imposing a weak ordering (token positions equal to all key positions
with that token).

range::before() and range::after() can't work for weak ordering. If
the bound is exclusive, we don't know if user-provided token position
is inside or outside.

Also, is_wrap_around() can't properly detect wrap around in all
cases. Consider this case:

 (1) ]A; B]
 (2) [A; B]

For A = (tok1) and B = (tok1, key1), (1) is a wrap around and (2) is
not. Without total ordering between A and B, range::is_wrap_around() can't
tell that.

I think the simplest solution is to define a total ordering on
ring_position by making token positions positioned either before or
after all keys with that token.
2015-07-24 16:08:41 +02:00
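The proposed ordering can be sketched as below. The struct layout, the bound field, and the factory functions are assumptions made for illustration only, not the project's ring_position. A token-only position is placed either before or after every key that shares its token, so no two distinct positions compare as equal.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical ring position: a token, optionally a key, and for token-only
// positions a bound saying which side of the token's keys it sits on.
struct ring_position {
    int64_t token;
    bool has_key;
    std::string key;
    int bound;   // -1: before all keys, +1: after all keys; unused if has_key
};

inline ring_position starting_at(int64_t t) { return {t, false, "", -1}; }
inline ring_position ending_at(int64_t t)   { return {t, false, "", +1}; }
inline ring_position at_key(int64_t t, std::string k) { return {t, true, std::move(k), 0}; }

// Total ordering: returns -1, 0, or +1.
inline int tri_compare(const ring_position& a, const ring_position& b) {
    if (a.token != b.token) {
        return a.token < b.token ? -1 : 1;
    }
    if (a.has_key && b.has_key) {
        int c = a.key.compare(b.key);
        return (c > 0) - (c < 0);
    }
    if (!a.has_key && !b.has_key) {
        return (a.bound > b.bound) - (a.bound < b.bound);
    }
    // Exactly one side is token-only: it sorts strictly before or after the key.
    return a.has_key ? -b.bound : a.bound;
}
```

With A = starting_at(tok1) and B = at_key(tok1, key1), A now compares strictly less than B, so a range from A to B can be recognized as not wrapping around.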
Tomasz Grabiec
2e845140e2 dht: ring_position: Implement less_compare() and equals() using tri_compare() 2015-07-24 16:08:41 +02:00
Tomasz Grabiec
7b30e8fcff dht: ring_position: Move definitions out of line 2015-07-24 16:08:41 +02:00
Asias He
243dbd3bfd dht: Reuse token serializer in ring_position 2015-07-23 09:08:15 +08:00
Asias He
1761be4dc4 dht: Implement serializer interface for token
It is needed by query::range<token>.
2015-07-23 09:08:15 +08:00
Tomasz Grabiec
6373987704 partition_range: Introduce is_wrap_around() helper 2015-07-22 13:13:38 +02:00
Tomasz Grabiec
cce2c648c5 Merge branch 'dev/gleb/for_urchin' from seastar-dev.git
Beginning of read implementation for range queries from Gleb.
2015-07-15 12:53:43 +03:00
Gleb Natapov
253ba71747 add comparison functions for ring_position
Two ring_positions are equal if their tokens and keys are equal, or if their
tokens are equal and one or both of them do not specify a key. So a
ring_position without a key is a wildcard that equals any ring_position with
the same token.
2015-07-15 12:41:31 +03:00
Tomasz Grabiec
0a4651cd28 dht: Add field getters to decorated_key 2015-07-14 19:57:37 +02:00
Gleb Natapov
49fb10d640 improve token printout
Add min/max token printout
2015-07-14 12:21:42 +03:00
Paweł Dziepak
351b113913 dht: allow configuration file to choose partitioner
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-07-12 15:14:53 +02:00
Paweł Dziepak
ede9886f50 dht: add byte_ordered_partitioner
Some of the tests in DTEST take advantage of the fact that
ByteOrderedPartitioner guarantees certain ordering of partition keys.

Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-07-09 23:43:16 +02:00
Avi Kivity
dd29ac9593 Merge "cqlsh" from Glauber
System table work to make cqlsh connect.
2015-07-07 19:33:23 +03:00
Glauber Costa
bd13e3995b dht: implement method to convert a token to a string
We need to be able to do it so we can, among other things, create CQL
statements that include the current state of the tokens.

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-07 11:38:22 -04:00
Glauber Costa
45905ec94d dht: change partitioner name to sstring
It is a better fit for things that are names, not blobs. We have a user that expects
a bytes parameter, but that is for no other reason than the fact that the field used
to be of bytes type.

Let's fix that, and future users will be able to use sstrings

Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
2015-07-07 11:38:22 -04:00
Tomasz Grabiec
d035c499b8 db: Move database::shard_of() to dht::shard_of() 2015-07-07 16:56:25 +02:00
Gleb Natapov
730170ff1a serialize data structures needed for read clustering 2015-07-01 13:36:28 +03:00
Tomasz Grabiec
ac333f04e2 dht: Make decorated_key comparable with ring_position 2015-06-25 18:45:12 +02:00