Since all gossip callbacks (e.g., on_change) are executed inside a
seastar::async context, we can wait for operations such as updating the
system table to complete.
Aside from being the obviously correct thing to do, not having this will force us
to manually adjust num_tokens when loading our sstables into Cassandra.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
It needs to access the non-existent "DatabaseDescriptor". Do as we have been doing,
and just pass the database object instead.
Signed-off-by: Glauber Costa <glommer@cloudius-systems.com>
If the requested range wraps around the end of the ring, make_local_reader()
ends up trying to create an sstable reader with that wrapping range, which
is not supported. Also, our shard-looping code is wrong for a wrapping range.
The solution is simple: split the wrap-around range into two (from the start
of the range until the end of the ring, and then from the beginning of the
ring until the end of the range), and read each of these subranges normally.
This feature is needed to allow streaming a range that wraps around, because
streaming currently uses make_local_reader().
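
The split described above can be sketched roughly as follows. This is only
an illustration: token_range, unwrap(), ring_min and ring_max are
hypothetical names, and tokens are modelled as plain integers rather than
the real token type.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a range of tokens on the ring: [start, end].
// In this sketch a range "wraps around" when start > end.
struct token_range {
    int64_t start;
    int64_t end;
};

// Split a possibly-wrapping range into ranges that do not wrap.
// ring_min and ring_max are assumed bounds of the token ring.
std::vector<token_range> unwrap(const token_range& r, int64_t ring_min, int64_t ring_max) {
    if (r.start <= r.end) {
        return {r};              // does not wrap, nothing to split
    }
    return {
        {r.start, ring_max},     // from the start of the range to the end of the ring
        {ring_min, r.end},       // from the beginning of the ring to the end of the range
    };
}
```

Each returned subrange can then be read normally, one after the other.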
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
make_local_reader takes a partition-range parameter, passed by *reference*.
The meaning of this passing-by-reference is not documented: when
make_local_reader returns, which of the following two statements holds?
1. make_local_reader still holds this reference, so the caller is
forced to keep the range object alive. For how long?
or
2. make_local_reader reads the range before returning, and when it
returns, the caller continues to "own" the range object, and is
allowed to do anything, including delete it.
The principle of least surprise suggests that #2 is better, unless
there is a very good reason to choose #1 and the requirement to keep
the argument alive (and for how long) is clearly documented.
But neither is the case here - the overhead of copying the argument is
negligible compared to the other overheads of make_local_reader, and
nothing was documented.
But it turns out the current code did #1 - the address of the range was
passed around *after* make_local_reader returns - the different readers
on different cpus continue to use it to eventually find the right sstable
byte range to iterate over. It is very easy to forget this and call
make_local_reader on an on-stack range variable, and the result is a
hard-to-debug use-after-free mess.
This patch switches us to situation #2: Before make_local_reader returns,
the range is copied to whoever needs to hold it after the return, namely
each individual shard_reader. The patch to do this is trivial (one
character removed).
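
The difference between the two situations can be illustrated with a minimal
sketch. The types below (range, referencing_reader, copying_reader) and
make_reader() are hypothetical stand-ins, not the real shard_reader code; in
this sketch the two variants happen to differ by a single `&`.

```cpp
#include <string>

// Hypothetical stand-in for a partition range.
struct range {
    std::string start;
    std::string end;
};

// Situation #1: the reader keeps a reference, so the caller must keep
// the range alive for as long as the reader exists.
struct referencing_reader {
    const range& r;
};

// Situation #2: the reader keeps its own copy, so the caller's range may
// be destroyed as soon as the reader has been constructed.
struct copying_reader {
    range r;
};

copying_reader make_reader(const range& r) {
    return copying_reader{r};   // the copy is made here, before returning
}
```

With the copying variant, calling make_reader() on an on-stack range and
then letting that range go out of scope is safe.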
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
gcc 4.9 fails to compile the following legal code (which compiles fine on
gcc 5.1):
    #include <unordered_set>
    std::unordered_set<int> hi() {
        return {};
    }
Work around this problem to make Scylla compile on gcc 4.9 again.
Note that this bug is specific to std::unordered_set - we have other places
in the code which "return {}" for std::set, and those work fine.
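
One possible workaround, shown here only as a sketch (the message above does
not say which form the patch actually uses), is to name the type explicitly
instead of returning a braced empty list:

```cpp
#include <unordered_set>

// Constructing the return value explicitly avoids the "return {}" form
// that gcc 4.9 is reported to reject for std::unordered_set.
std::unordered_set<int> hi() {
    return std::unordered_set<int>();
}
```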
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
It is needed for db.get_version(). I really hate passing &db everywhere.
If we had a global helper function like get_local_db(), life would be much
easier.
Use seastar::async to simplify the code. This code is only executed
during boot up, so it is fine to use seastar::async.
More commented-out code is enabled.
There's nothing legacy about it so rename legacy_schema_tables to
schema_tables. The naming comes from a Cassandra 3.x development branch
which is not relevant for us in the near future.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Remove the commented-out isReadyForBoostrap. We don't have a StageManager,
nor will we, so drop the function.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Rename "MIGRATION_DELAY_IN_MSEC" to "migration_delay" as the unit of
time is already clear from the type.
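
As an illustration of the unit being carried by the type, a declaration
along these lines needs no suffix in its name. The 60-second value below is
an assumption made for the example, not taken from the patch.

```cpp
#include <chrono>

// The type says "milliseconds", so the name does not have to.
// NOTE: the value is illustrative only.
static constexpr std::chrono::milliseconds migration_delay{60000};
```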
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
Use get_storage_proxy() and get_local_storage_proxy() helpers under the
hood to simplify migration manager API users.
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
"This series implements initial support for CQL events. We introduce
a migration_listener hook in the migration manager as well as an event
notifier in the CQL server that's built on top of it to send out the events
via the CQL binary protocol. We also wire up create keyspace events to the
system so subscribed clients are notified when a new keyspace is
created.
There's still more work to be done to support all the events. That
requires some work to restructure existing code so it's better to merge
this initial series now and avoid future code conflicts."
Assume we have 3 tokens:
    {ee 36 d0 3e e8 6c 35 b1 , c5 5b 00 4a 1d 77 4e 50 , b9 b2 a1 0a 16 0d 76 8e }
With this:
    for (auto t : tokens) {
        _token_metadata.update_normal_token(t, get_broadcast_address());
    }
only the last token is inserted.
With this:
    _token_metadata.update_normal_tokens(tokens, get_broadcast_address());
all 3 tokens are inserted correctly.
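
A toy model can reproduce this behaviour. It is not the real token_metadata
class: it ASSUMES that each update first drops every token the endpoint
owned before, which would explain why the one-by-one loop keeps only the
last token. toy_token_metadata and its integer tokens are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_set>

// Toy model only. ASSUMPTION: an update replaces all tokens that the
// endpoint owned before the call.
class toy_token_metadata {
    std::map<int64_t, std::string> _token_to_endpoint;
public:
    void update_normal_tokens(const std::unordered_set<int64_t>& tokens,
                              const std::string& endpoint) {
        // Drop every token previously owned by this endpoint.
        for (auto it = _token_to_endpoint.begin(); it != _token_to_endpoint.end();) {
            if (it->second == endpoint) {
                it = _token_to_endpoint.erase(it);
            } else {
                ++it;
            }
        }
        // Then record the new set of tokens.
        for (auto t : tokens) {
            _token_to_endpoint[t] = endpoint;
        }
    }

    // Single-token variant expressed in terms of the batch variant.
    void update_normal_token(int64_t token, const std::string& endpoint) {
        update_normal_tokens({token}, endpoint);
    }

    std::size_t size() const { return _token_to_endpoint.size(); }
};
```

Under that assumption, three single-token calls for the same endpoint leave
one token behind, while one batch call leaves all three.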
Instead of merging shard data using make_combined_reader(), take advantage
of the fact that shard data is disjoint, and use make_joining_reader().
This removes the need to sort the partitions as they are being read.