Commit Graph

6730 Commits

Asias He
d47ea88aa8 range_streamer: Implement get_all_ranges_with_strict_sources_for 2015-10-13 15:45:55 +08:00
Asias He
84de936e43 range_streamer: Implement get_all_ranges_with_sources_for 2015-10-13 15:45:55 +08:00
Asias He
944e28cd6c range_streamer: Implement get_range_fetch_map 2015-10-13 15:45:55 +08:00
Asias He
d986a4d875 range_streamer: Add constructor 2015-10-13 15:45:55 +08:00
Asias He
1d6c081766 range_streamer: Add i_source_filter and failure_detector_source_filter
They are used to filter out unwanted nodes.
2015-10-13 15:45:55 +08:00
Asias He
c8b9a6fa06 dht: Convert RangeStreamer to C++ 2015-10-13 15:45:55 +08:00
Asias He
b95521194e dht: Import dht/RangeStreamer 2015-10-13 15:45:55 +08:00
Asias He
b1c92f377d token_metadata: Add get_pending_ranges
One version returns only the ranges
   std::vector<range<token>>

Another version returns a map
   std::unordered_map<range<token>, std::unordered_set<inet_address>>
which is converted from
   std::unordered_multimap<range<token>, inet_address>

They are needed by token_metadata::pending_endpoints_for,
storage_service::get_all_ranges_with_strict_sources_for and
storage_service::decommission.
2015-10-13 15:45:55 +08:00
Asias He
b860f6a393 token_metadata: Add get_pending_ranges_mm
Helper for get_pending_ranges.
2015-10-13 15:45:55 +08:00
Asias He
c96d826fe0 token_metadata: Introduce _pending_ranges member 2015-10-13 15:45:55 +08:00
Asias He
da072b8814 token_metadata: Remove duplicated sortedTokens
It is implemented already.
2015-10-13 15:45:55 +08:00
Asias He
d820c83141 locator: Add abstract_replication_strategy::get_pending_address_ranges
Given the current token_metadata and the new token which will be
inserted into the ring after bootstrap, calculate the ranges this new
node will be responsible for.

This is needed by boot_strapper::bootstrap().
2015-10-13 15:45:55 +08:00
Asias He
1adb27e283 token_metadata: Add clone_only_token_map
Needed by get_pending_address_ranges.
2015-10-13 15:45:55 +08:00
Asias He
527edd69ae locator: Add abstract_replication_strategy::get_range_addresses
Needed by range_streamer::get_all_ranges_with_sources_for.
2015-10-13 15:45:55 +08:00
Asias He
044dcf43de locator: Add abstract_replication_strategy::get_address_ranges
Needed by get_pending_address_ranges.
2015-10-13 15:45:55 +08:00
Asias He
3d0d02816d token_metadata: Add get_primary_ranges_for and get_primary_range_for
Given tokens, return the ranges ending at those tokens. For example, with t1
and t2, it returns the ranges:

(token before t1, t1]
(token before t2, t2]
2015-10-13 15:45:55 +08:00
Asias He
542b1394d7 token_metadata: Add get_predecessor
It is used to get the previous token of this token in the ring.
2015-10-13 15:45:55 +08:00
Asias He
ddfd417c13 locator: Make calculate_natural_endpoints take extra token_metadata parameter
When adding/removing a node, we need to use a temporary token_metadata
with pending tokens.
2015-10-13 15:45:55 +08:00
Asias He
7959c12073 stream_session: Support the case where column_families is empty
An empty column_families means to get all the column families.
2015-10-13 15:44:59 +08:00
Tomasz Grabiec
a383f91b68 range: Implement range::contains() which takes another range 2015-10-13 15:44:36 +08:00
Avi Kivity
0498cebc58 Merge seastar upstream
* seastar c2e86d5...78e3924 (2):
  > fix output stream batching
  > rpc: server connection shutdown fix

Adjust transport/server.cc for the demise of output_stream::batch_flush()
2015-10-12 14:00:40 +03:00
Avi Kivity
b8c8473505 Merge seastar upstream
* seastar 1995676...c2e86d5 (3):
  > doc: add Seastar tutorial
  > resource: increase default reserve memory
  > http client: moved http_response_parser.rl from apps/seawreck into http directory
2015-10-12 10:29:10 +03:00
Amnon Heiman
6fd3c81db5 keyspace clean up should be a POST not a GET 2015-10-11 15:51:56 +03:00
Avi Kivity
e252475e67 Merge "locator: Adding EC2Snitch" from Vlad
"This series adds EC2Snich.

Since both GossipingPropertyFileSnitch and EC2SnitchXXX snitches family
are using the same property file it was logical to share the corresponding
code. Most of this series does just that... "
2015-10-11 14:55:26 +03:00
Glauber Costa
f03480c054 avoid exception when processing caching_options
While trying to debug an unrelated bug, I was annoyed by the fact that parsing
caching options kept throwing exceptions all the time. Those exceptions have no
reason to happen: we try to convert the value to a number, and if we fail we
fall back to one of the two blessed strings.

We could just as easily test for those strings beforehand and avoid all of
that.

While we're on it, the exception message should show the value of "r", not "k".

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-11 14:53:55 +03:00
Avi Kivity
2b87c8c372 build: allow defaulting test binaries not to strip debug information
Useful for continuous integration, which has enough disk space.
2015-10-11 12:25:20 +03:00
Glauber Costa
b2fef14ada do not calculate truncation time independently
Currently, we are calculating truncated_at during truncate() independently for
each shard. It will work if we're lucky, but it is fairly easy to trigger cases
in which each shard will end up with a slightly different time.

The main problem here is that this time is used as the snapshot name when auto
snapshots are enabled. Prior to my last fixes, this would just generate two
separate directories in this case, which is wrong but not severe.

But after the fix, this means that both shards will wait for one another to
synchronize and this will hang the database.

Fix this by making sure that the truncation time is calculated before
invoke_on_all in all needed places.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-09 17:17:11 +03:00
Vlad Zolotarov
c30c1bb1ec tests: added ec2_snitch_test
Checks the following:
   - That EC2Snitch is able to retrieve the availability zone from EC2.
   - That the resulting DC and RACK values are distributed among all
     shards.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:20 +03:00
Vlad Zolotarov
38f77bbfe5 locator: add ec2_snitch
This snitch will read the EC2 availability zone and set the DC
and RACK as follows:

If availability zone is "us-east-1d", then
DC="us-east" and  RACK="1d".

If cassandra-rackdc.properties contains "dc_suffix" field then
DC will be appended with its value.

For instance if dc_suffix=_1_cassandra, then in the example above

DC=us-east_1_cassandra

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:20 +03:00
Vlad Zolotarov
1d63ec4143 conf: added cassandra-rackdc.properties
This is a configuration file used by GossipingPropertyFileSnitch and
EC2SnitchXXX snitches family.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:20 +03:00
Vlad Zolotarov
afd44a6e08 locator::gossiping_property_file_snitch: initialize i_endpoint_snitch::io_cpu_id() in the constructor
This is just cleaner.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:19 +03:00
Vlad Zolotarov
2febae90c9 locator::production_snitch_base: unify property file parsing facilities
- Move the property file parsing code into the production_snitch_base class.
- Make the parsing code more general:
   - Save the parsed keys in a hash table.
   - Check for only two types of errors:
      - Repeated keys.
      - Unsupported keys: keep a set of all supported keys and check each
        parsed key against it.
- Add production_snitch_base.cc file.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:19 +03:00
Vlad Zolotarov
2db219f074 locator::gossiping_property_file_snitch: remove extra "namespace locator"
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:19 +03:00
Vlad Zolotarov
0cfdca55f3 locator::gossiping_property_file_snitch: make get_name() public as it should be
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:19 +03:00
Vlad Zolotarov
ba68436f2f locator::gossiping_property_file_snitch: get rid of warn() and err() wrappers
Use the logger() accessor instead, for a closer resemblance to Origin.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:19 +03:00
Vlad Zolotarov
d196b034e2 locator::snitch_base: Add a default snitch_base::stop() method
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-10-08 20:57:19 +03:00
Vlad Zolotarov
de6cf8db51 db::config: add get_conf_dir()
This function returns the directory containing the configuration
files. It takes the environment variables into account as follows:
   - If SCYLLA_CONF is defined, it is the directory.
   - Else, if SCYLLA_HOME is defined, $SCYLLA_HOME/conf is the directory.
   - Else, the directory is "conf", i.e. the configuration files are
     looked up in ./conf.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>

New in v2:
   - Updated get_conf_dir() description.
2015-10-08 20:57:11 +03:00
Avi Kivity
0c8f906b2d Merge "Fixes for snapshots" from Glauber 2015-10-08 18:21:54 +03:00
Glauber Costa
1549a43823 snapshots: fix json type
We are generating a general object ({}), whereas Cassandra 2.1.x generates an
array ([]). Let's do that as well to avoid surprising parsers.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 16:54:51 +02:00
Glauber Costa
cc343eb928 snapshots: handle jsondir creation for empty files case
We still need to write a manifest when there are no files in the snapshot.
But because we never reach the touch_directory part of the sstables loop in
that case, nobody would have created jsondir.

Since now all the file handling is done in the seal_snapshot phase, we should
just make sure the directory exists before initiating any other disk activity.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 16:54:51 +02:00
Glauber Costa
efdfc78c0c snapshots: get rid of empty tables optimization
We currently have one optimization that returns early when there are no tables
to be snapshotted.

However, because of the way we are writing the manifest now, this will cause
the shard that happens to have tables to be waiting forever. So we should get
rid of it. All shards need to pass through the synchronization point.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 16:54:51 +02:00
Glauber Costa
0776ca1c52 snapshots: don't hash pending snapshots by snapshot name
If we are hashing more than one CF, the snapshots themselves will all have the same name.
This will cause the files from one of them to spill into the other when writing the manifest.

The proper hash is the jsondir: that one is unique per manifest file.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 16:54:51 +02:00
Avi Kivity
e7f58491c3 version: mark master branch as development version 2015-10-08 15:31:50 +03:00
Pekka Enberg
95012793e5 db/schema_tables: Wire up drop keyspace notifications
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
2015-10-08 13:10:48 +02:00
Pekka Enberg
87d45cc58a service/migration_manager: Simplify notify_drop_keyspace()
There's no need to pass keyspace_metadata to notify_drop_keyspace()
because all we are interested in is the name. The keyspace has been
dropped so there's not much we could do with its metadata either.

Simplifies the next patch that wires up drop keyspace notification.

Signed-off-by: Pekka Enberg <penberg@scylladb.com>
2015-10-08 13:10:48 +02:00
Avi Kivity
e5dca96af3 Merge "snapshots: fix global generation of the manifest file" from Glauber
"snapshotting the files themselves is easy: if more than one CF happens to link
an SSTable twice, all but one will fail, and we will end up with one copy.

The problem for us, is that the snapshot procedure is supposed to leave a
manifest file inside its directory.  So if we just call snapshot() from
multiple shards, only the last one will succeed, writing its own SSTables to
the manifest leaving all other shards' SSTables unaccounted for.

Moreover, for things like drop table, the operation should only proceed when
the snapshot is complete. That includes the manifest file being correctly
written, and for this reason we need to wait for all shards to finish their
snapshotting before we can move on."
2015-10-08 13:08:31 +03:00
Glauber Costa
725ae03772 snapshots: write the manifest file from a single shard
Currently, the snapshot code has all shards writing the manifest file. This is
wrong, because all writes prior to the last will be overwritten. This patch
fixes it by synchronizing all writes and leaving just one of the shards with
the task of closing the manifest.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 11:36:36 +02:00
Glauber Costa
25d24222fe snapshots: separate manifest creation
The way manifest creation is currently done is wrong: instead of a final
manifest containing all files from all shards, the current code writes a
manifest containing just the files from the shard that happens to be the
unlucky loser of the writing race.

In preparation to fix that, separate the manifest creation code from the rest.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 11:36:36 +02:00
Glauber Costa
abc63e4669 snapshots: clarify and fix sync behavior
We do need to sync jsondir after we write the manifest file (previously done,
but with a question), and also before we start writing it (not previously
done), to guarantee that the manifest file won't reference any file that is
not visible yet.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 11:36:36 +02:00
Glauber Costa
ca4babdb57 snapshots: close file after flush
We are currently flushing it, but not closing it.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-08 11:36:36 +02:00