Commit Graph

140 Commits

Author SHA1 Message Date
Duarte Nunes
66f6a367a4 ring_position_range_sharder: Avoid copying eagerly
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20161104115632.15974-1-duarte@scylladb.com>
2016-11-13 11:42:23 +02:00
Avi Kivity
7202b94183 dht: introduce a sharder for vectors of partition ranges
Building on the single-range sharder, add a sharder for vectors of
partition ranges.  This helps with wrapped ranges, which are translated
into a vector containing two ranges.
2016-11-03 19:10:20 +02:00
Avi Kivity
43a2380899 dht: add a generator for shard/range pairs
Divides a ring_position range into a sequence of shard/range pairs.  This
allows sequential iteration over shards in ring order.

The current multi-partition query executes on all shards in parallel, but
this is very wasteful, as most of the data will be thrown away if it is not
included in the page.  With the generator, we can switch to sequential
execution.
2016-11-03 19:10:17 +02:00
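
The shard/range generator described in 43a2380899 can be illustrated with a minimal sketch. It assumes a toy model, not Scylla's actual code: tokens are 32-bit unsigned integers and each shard owns one contiguous, equally sized slice of the ring. The names `ring_end`, `shard_and_range` and `split_by_shard` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model (assumption): 32-bit tokens, one contiguous slice per shard.
constexpr uint64_t ring_end = uint64_t(1) << 32;  // one past the largest token

inline unsigned shard_of(uint64_t token, unsigned shard_count) {
    return static_cast<unsigned>((token * shard_count) >> 32);
}

struct shard_and_range {
    unsigned shard;
    uint64_t start;  // inclusive
    uint64_t end;    // exclusive
};

// Cuts [start, end) at every shard boundary and returns the pieces in ring order.
inline std::vector<shard_and_range> split_by_shard(uint64_t start, uint64_t end,
                                                   unsigned shard_count) {
    assert(shard_count > 0 && start <= end && end <= ring_end);
    std::vector<shard_and_range> out;
    while (start < end) {
        unsigned shard = shard_of(start, shard_count);
        // First token of the next shard: ceil((shard + 1) * 2^32 / shard_count).
        uint64_t boundary =
            ((uint64_t(shard + 1) << 32) + shard_count - 1) / shard_count;
        uint64_t piece_end = std::min(end, boundary);
        out.push_back({shard, start, piece_end});
        start = piece_end;
    }
    return out;
}
```

A caller could then visit the pieces one at a time and stop as soon as the page is full, instead of querying every shard in parallel.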
Avi Kivity
1f88d103a8 partitioner: add i_partitioner::token_for_next_shard()
When performing a range query, we want to iterate over shards, running the
query on each shard in order until the query range is exhausted or we have
the right number of rows.

To be able to do this, introduce token_for_next_shard(), which allows us
to determine the boundary between shards.

It is a sort-of inverse to shard_of(), in that

  shard_of(token_for_next_shard(t)) == shard_of(t) + 1
2016-11-03 19:09:23 +02:00
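
The relation stated in 1f88d103a8 can be checked on a small model. The sketch below is an assumption, not the partitioner's real implementation: tokens are 32-bit unsigned integers, each shard owns one contiguous slice, and the function returns 2^32 (one past the largest token) when the token already belongs to the last shard, so the relation holds for every shard except the last.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (assumption): 32-bit tokens, one contiguous slice per shard.
inline unsigned shard_of(uint32_t token, unsigned shard_count) {
    assert(shard_count > 0);
    return static_cast<unsigned>((uint64_t(token) * shard_count) >> 32);
}

// Smallest token owned by the shard that follows the owner of `token`.
// Returns 2^32 (past the end of the ring) if `token` is in the last shard.
inline uint64_t token_for_next_shard(uint32_t token, unsigned shard_count) {
    unsigned next = shard_of(token, shard_count) + 1;
    // ceil(next * 2^32 / shard_count)
    return ((uint64_t(next) << 32) + shard_count - 1) / shard_count;
}
```

With three shards, for example, `token_for_next_shard(5, 3)` is the first token whose `shard_of()` is 1, and the token just before it still maps to shard 0.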
Avi Kivity
6c45b0bae8 partitioner: make comparators public
The public comparison operators depend on global_partitioner(), and are
therefore less useful for tests.
2016-11-03 11:27:40 +02:00
Avi Kivity
6320181b97 partitioner: const correctness for comparators
2016-11-03 11:27:40 +02:00
Avi Kivity
470826d127 partitioner: change partitioners to have shard counts independent from smp::count
Useful for testing.
2016-11-03 11:27:40 +02:00
Avi Kivity
a35136533d Convert ring_position and token ranges to be nonwrapping
Wrapping ranges are a pain, so we are moving wrap handling to the edges.

Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.

Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
2016-11-02 21:04:11 +02:00
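
Moving wrap handling to the edges, as a35136533d describes, amounts to splitting a wrapping range into nonwrapping pieces before it reaches the core code. A minimal sketch, assuming a toy closed interval over signed 64-bit tokens (the real ranges have optional, inclusive or exclusive bounds); `token_range` and `unwrap` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Toy closed interval (assumption). start > end means the range wraps around.
struct token_range {
    int64_t start;
    int64_t end;
};

// Returns one range if `r` does not wrap, otherwise two nonwrapping ranges
// in ring order: the piece that begins at the smallest token comes first.
inline std::vector<token_range> unwrap(token_range r) {
    std::vector<token_range> out;
    if (r.start <= r.end) {
        out.push_back(r);
        return out;
    }
    out.push_back({std::numeric_limits<int64_t>::min(), r.end});
    out.push_back({r.start, std::numeric_limits<int64_t>::max()});
    return out;
}
```

The reverse direction mentioned in the message, merging the first and last ranges when producing ring output, would undo this split.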
Duarte Nunes
862f51cddf partitioner: Parse token from bytes
This patch adds the from_bytes() function to the i_partitioner class,
whose purpose is to parse a particular token and explicitly handle the
case when the minimum token is specified.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-09-30 11:17:02 +00:00
Avi Kivity
4fcebd4ca6 random_partitioner: fix overflow in shard_of()
uint128_t will overflow if smp::count > 2.  Replace with a larger type.

Message-Id: <1471188765-30142-1-git-send-email-avi@scylladb.com>
2016-08-15 09:41:54 +03:00
Asias He
2f4cd86809 random_partitioner: Implement random_partitioner
Cassandra 1.x clusters often use RandomPartitioner. Supporting
RandomPartitioner will allow easier migration to Scylla.

Tests are added to make sure Scylla generates the same token as
Cassandra does for the same partition key.

Fixes #1438

Message-Id: <3bc8b7f06fad16d59aaaa96e2827198ce74214c6.1469166766.git.asias@scylladb.com>
2016-07-24 16:25:25 +03:00
Duarte Nunes
aaa76d58ba query: Move to_partition_range to dht namespace
This patch moves to_partition_range, from the query namespace
to the dht namespace, where it is a more natural fit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468498060-19251-1-git-send-email-duarte@scylladb.com>
2016-07-15 10:41:52 +02:00
Asias He
f4389349e4 config: Enable partitioner option
Enable the --partitioner option so that the user can choose a partitioner
other than the default Murmur3Partitioner. Currently, only Murmur3Partitioner
and ByteOrderedPartitioner are supported. When an unsupported partitioner
is specified, an error is propagated to the user.
2016-07-08 17:44:55 +08:00
Asias He
9c27b5c46e byte_ordered_partitioner: Implement missing describe_ownership and midpoint
In order to support ByteOrderedPartitioner, we need to implement the
missing describe_ownership and midpoint functions in the
byte_ordered_partitioner class.

As a starter, this patch uses a simple method based on node token distance
to calculate ownership. C* uses a complicated method based on key samples.
We can switch to what C* does later.

Tests are added to tests/partitioner_test.cc.

Fixes #1378
2016-07-08 17:44:55 +08:00
Asias He
f6a2672be0 storage_service: Modify log to match config option of scylla
We currently log as follows:

May  9 00:09:13 node3.nl scylla[2546]:  [shard 0] storage_service - This
node was decommissioned and will not rejoin the ring unless
cassandra.override_decommission=true has been set,or all existing data
is removed and the node is bootstrapped again

However, the user should use

   override_decommission:true

instead of

   cassandra.override_decommission:true

in scylla.yaml where the cassandra prefix is stripped.

Fixes #1240
Message-Id: <b0c9424c6922431ad049ab49391771e07ca6fbde.1467079190.git.asias@scylladb.com>
2016-07-04 10:47:49 +02:00
Piotr Jastrzebski
27575a0528 Fix previous_entry_is_continuous
Rename it to check_previous_entry.
Remove unnecessary test.
Make sure ring_position always has working relation_to_keys method.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <6bc790d492ba9b5c302a50218f3e26b924f657d0.1467101754.git.piotr@scylladb.com>
2016-06-28 10:27:08 +02:00
Asias He
ee0585cee9 dht: Add default constructor for token
It is needed to put a token into a boost interval_map in the following
patch.
2016-05-17 17:32:15 +08:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Gleb Natapov
775cc93880 remove unused range and token serializers
2016-02-02 12:15:49 +02:00
Asias He
bdd6a69af7 streaming: Drop unused parameters
- int connections_per_host

Scylla does not create connections per stream_session, instead it uses
rpc, thus connections_per_host is not relevant to scylla.

- bool keep_ss_table_level
- int repaired_at

Scylla does not stream sstable files. They are not relevant to scylla.
2016-01-25 11:38:13 +08:00
Gleb Natapov
043d132ba9 Remove no longer used serializers.
2016-01-24 12:45:41 +02:00
Gleb Natapov
49ce2b83df Add ring_position constructor needed by serializer.
2016-01-24 12:45:41 +02:00
Asias He
89b79d44de streaming: Get rid of the _connecting_ parameter
messaging_service will automatically use the private IP address to connect
to a peer node if possible. There is no need for upper layers such as
streaming to worry about it. Dropping it simplifies things a bit.
2015-12-31 11:25:08 +01:00
Nadav Har'El
f0b27671a2 murmur3 partitioner: remove outdated comment, and code
Since commit 16596385ee, long_token() is already checking
t.is_minimum(), so the comment which explains why it does not (for
performance) is no longer relevant. And we no longer need to check
t._kind before calling long_token (the check we do here is the same
as is_minimum).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2015-12-30 10:01:29 +02:00
Nadav Har'El
06ab43a7ee murmur3 partitioner: fix midpoint() algorithm
The midpoint() algorithm to find a token between two tokens doesn't
work correctly in case of wraparound. The code tried to handle this
case, but did it wrong. So this patch fixes the midpoint() algorithm,
and adds clearer comments about why the fixed algorithm is correct.

This patch also modifies two midpoint() tests in partitioner_test,
which were incorrect - they verified that midpoint() returns some expected
values, but expected values were wrong!

We also add to the test a more fundamental test of midpoint() correctness,
which doesn't check the midpoint against a known value (which is easy to
get wrong, as indeed happened); rather, we simply check that the midpoint
is really inside the range (according to the token ordering operator).
This simple test failed with the old implementation of midpoint() and
passes with the new one.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2015-12-24 17:19:49 +02:00
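
The property test described in 06ab43a7ee, checking that the midpoint lies inside the range rather than comparing it with a precomputed value, can be sketched on a simplified ring. This is an illustration under assumptions, not the murmur3 partitioner's code: tokens here are unsigned 64-bit integers with no distinguished minimum token, and `in_ring_range` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Toy ring of unsigned 64-bit tokens (assumption).
inline uint64_t midpoint(uint64_t left, uint64_t right) {
    // Unsigned subtraction wraps modulo 2^64, so this is the clockwise
    // distance from left to right even when right < left.
    uint64_t distance = right - left;
    return left + distance / 2;  // may wrap past the largest token, by design
}

// True if t lies on the clockwise arc from left to right (both inclusive).
inline bool in_ring_range(uint64_t left, uint64_t right, uint64_t t) {
    if (left <= right) {
        return left <= t && t <= right;
    }
    return t >= left || t <= right;
}
```

For a wrapping pair such as `(UINT64_MAX - 1, 3)`, a naive average lands near the middle of the token space, outside the arc, while the distance-based version stays inside it.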
Pekka Enberg
e56bf8933f Improve not implemented errors
Print out the function name where we're throwing the exception from to
make it easier to debug such exceptions.
2015-12-18 10:51:37 +01:00
Tomasz Grabiec
a78f4656e8 Introduce ring_position_less_comparator
2015-12-15 18:00:55 +01:00
Asias He
0af7fb5509 range_streamer: Kill FIXME in use_strict_consistency for consistent_rangemovement
2015-11-30 09:15:42 +08:00
Asias He
f80e3d7859 range_streamer: Simplify multiple_map to map conversion in add_ranges
2015-11-30 09:15:42 +08:00
Asias He
21882f5122 range_streamer: Kill one leftover comment
2015-11-30 09:15:42 +08:00
Asias He
6b258f1247 range_streamer: Kill FIXME for is_replacing
2015-11-30 09:15:42 +08:00
Asias He
6aa5bfe59f range_streamer: Add virtual destructor to i_source_filter
Found by debug build

==10190==ERROR: AddressSanitizer: new-delete-type-mismatch on 0x602000084430 in thread T0:
  object passed to delete has wrong type:
  size of the allocated type:   16 bytes;
  size of the deallocated type: 8 bytes.
    #0 0x7fe244add512 in operator delete(void*, unsigned long) (/lib64/libasan.so.2+0x9a512)
    #1 0x3c674fe in std::default_delete<dht::range_streamer::i_source_filter>::operator()(dht::range_streamer::i_source_filter*)
       const /usr/include/c++/5.1.1/bits/unique_ptr.h:76
    #2 0x3c60584 in std::unique_ptr<dht::range_streamer::i_source_filter, std::default_delete<dht::range_streamer::i_source_filter> >::~unique_ptr()
       /usr/include/c++/5.1.1/bits/unique_ptr.h:236
    #3 0x3c7ac22 in void __gnu_cxx::new_allocator<std::unique_ptr<dht::range_streamer::i_source_filter,
       std::default_delete<dht::range_streamer::i_source_filter> > >::destroy<std::unique_ptr<dht::range_streamer::i_source_filter,
       std::default_delete<dht::range_streamer::i_source_filter> > >(std::unique_ptr<dht::range_streamer::i_source_filter,
       std::default_delete<dht::range_streamer::i_source_filter> >*) /usr/include/c++/5.1.1/ext/new_allocator.h:124
...
2015-11-12 11:19:22 +02:00
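
The AddressSanitizer report in 6aa5bfe59f is what deleting a derived object through a base pointer looks like when the base has no virtual destructor. A minimal sketch of the fix; `source_filter` and `datacenter_filter` are hypothetical stand-ins for i_source_filter and one of its implementations, and the counter exists only to make the destructor call observable:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

struct source_filter {
    // Virtual destructor: deleting through a base pointer runs the derived
    // destructor and frees the full derived object.
    virtual ~source_filter() = default;
    virtual bool should_include(const std::string& dc) const = 0;
};

struct datacenter_filter final : source_filter {
    std::string wanted_dc;
    int* destroyed;  // test hook: incremented when this object is destroyed

    datacenter_filter(std::string dc, int* counter)
        : wanted_dc(std::move(dc)), destroyed(counter) {}
    ~datacenter_filter() override { ++*destroyed; }

    bool should_include(const std::string& dc) const override {
        return dc == wanted_dc;
    }
};
```

Holding the filter in a `std::unique_ptr<source_filter>` is then safe, because `std::default_delete` calls the destructor through the base type.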
Asias He
87292d6a16 range_streamer: Simplify unordered_multimap_to_unordered_map
operator[] is our friend: it creates map[x] if x is not in the map.
2015-11-09 08:43:04 +08:00
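
The simplification in 87292d6a16 relies on `operator[]` default-constructing the mapped value on first access. A minimal sketch of such a multimap-to-map conversion; the key and value types are placeholders for the endpoint and range types of the real code, and `to_map` is a hypothetical name:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Groups the values of a multimap by key.
inline std::unordered_map<std::string, std::vector<int>>
to_map(const std::unordered_multimap<std::string, int>& multi) {
    std::unordered_map<std::string, std::vector<int>> grouped;
    for (const auto& kv : multi) {
        // operator[] inserts an empty vector the first time a key is seen,
        // so no explicit find-or-insert step is needed.
        grouped[kv.first].push_back(kv.second);
    }
    return grouped;
}
```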
Asias He
a54989cd65 range_streamer: Fix get_all_ranges_with_strict_sources_for
std::set_difference requires the containers to be sorted, which is not
the case here, so use remove_if instead.

Do not use assert, use throw instead so that we can recover from this
error.
2015-11-09 08:43:04 +08:00
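
Both points of a54989cd65 can be sketched in a few lines: removing elements without a sortedness requirement, and throwing instead of asserting. The function names, the string element type and the "exactly one source" check are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Removes every element of `excluded` from `candidates`. Unlike
// std::set_difference, neither container needs to be sorted.
inline void remove_excluded(std::vector<std::string>& candidates,
                            const std::vector<std::string>& excluded) {
    auto is_excluded = [&excluded](const std::string& c) {
        return std::find(excluded.begin(), excluded.end(), c) != excluded.end();
    };
    candidates.erase(
        std::remove_if(candidates.begin(), candidates.end(), is_excluded),
        candidates.end());
}

// Throws instead of asserting, so a caller can catch the error and recover.
inline const std::string& single_source(const std::vector<std::string>& candidates) {
    if (candidates.size() != 1) {
        throw std::runtime_error("expected exactly one source, got " +
                                 std::to_string(candidates.size()));
    }
    return candidates.front();
}
```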
Asias He
d166b0f3fa range_streamer: Add get_work_map
2015-11-09 08:43:04 +08:00
Asias He
ed313160c2 storage_service: Add initial_token config option support
2015-11-04 10:42:17 +08:00
Asias He
16596385ee token: Handle minimum token correctly in long_token
Fixes:

Exiting on unhandled exception of type 'runtime_exception': runtime
error: Invalid token. Should have size 8, has size 0
2015-11-04 09:01:06 +08:00
Amnon Heiman
b77ec2bd6a Importing token_range and endpoint_details from origin
The storage server uses the token_range in origin to return information
about the ring.

This patch imports the structures. The functionality in origin is redundant
in this case and was not imported.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2015-11-03 10:17:05 +02:00
Asias He
6259dd73b6 token: Print token using the partitioner defined method
Make nodetool ring output align with c* output.
2015-10-29 15:53:47 +02:00
Asias He
1bbc1920d2 range_streamer: Start to use get_preferred_ip
It is available now.
2015-10-27 21:48:37 +08:00
Tomasz Grabiec
7b1e78ffcd storage_proxy: Fix make_local_reader() for ranges with min/max tokens
shard_of() was undefined for before_all/after_all tokens. Fix by
adding handling for these.
2015-10-25 10:27:58 +02:00
Asias He
b172146223 range_streamer: Introduce single_datacenter_filter
2015-10-23 16:13:30 +08:00
Asias He
98b34ecc67 boot_strapper: Fix use after free
streamer is a stack variable, it is gone when the function returns.
Fix it using a shared pointer.

Fixes #489
2015-10-23 16:13:30 +08:00
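
The lifetime problem fixed in 98b34ecc67 can be reproduced without any asynchronous framework. In the sketch below a plain vector of callbacks stands in for the event loop (an assumption; the real code uses futures), and `streamer`, `task_queue` and `start_bootstrap` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Stand-in for an event loop: callbacks are queued now and run after the
// function that queued them has already returned.
using task_queue = std::vector<std::function<void()>>;

struct streamer {
    int ranges_fetched = 0;
    void fetch() { ++ranges_fetched; }
};

// The queued callback holds a shared_ptr, so the streamer outlives this
// function. Capturing a stack-allocated streamer by reference would dangle.
inline std::shared_ptr<streamer> start_bootstrap(task_queue& queue) {
    auto s = std::make_shared<streamer>();
    queue.push_back([s] { s->fetch(); });
    return s;
}
```

The streamer is destroyed only once the last callback that captured it has been released.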
Paweł Dziepak
1f0cb9066b add key_reader interface
key_readers provide an interface analogous to mutation_readers, but the
only data they return are decorated keys.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2015-10-20 20:26:54 +02:00
Raphael S. Carvalho
936575efd2 dht: introduce comparator for token
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
2015-10-15 19:34:50 -03:00
Avi Kivity
c1cfec3e5a dht: mark boot_strapper logger static
Or it violates the ODR and causes link errors.
2015-10-13 16:42:42 +03:00
Asias He
cf9d9e2ced boot_strapper: Enable range_streamer code in bootstrap
Add code to actually stream data from other nodes during bootstrap.

I tested with the following:

   1) start node 1

   2) insert data into node 1

   3) start node 2

I can see from the logger that data is streamed correctly from node 1
to node 2.
2015-10-13 15:54:18 +08:00
Asias He
0d1e5c3961 boot_strapper: Add debug info for get_bootstrap_tokens
Print the tokens generated for the node that is bootstrapping.
2015-10-13 15:48:53 +08:00
Asias He
dc02d76aee range_streamer: Implement fetch_async
It is used by boot_strapper::bootstrap() in the bootstrap process to start
the streaming.
2015-10-13 15:45:56 +08:00
Asias He
887c0a36ec range_streamer: Implement add_ranges
It is used by boot_strapper::bootstrap() in the bootstrap process.
2015-10-13 15:45:56 +08:00