6914 Commits

Author SHA1 Message Date
Glauber Costa
fcebf6f72d sstable tests: don't use set_generation method
Outside of testing, there is no reason for a table to just change its
generation number.

There will be one, however, once we support loading new sstables. The method
needs to be completely rewritten for that, so let's make sure the tests are not
using it.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-21 18:02:42 +02:00
Glauber Costa
f3bad2032d database: fix type for sstable generation.
Avoid using long for it and use a fixed-size type instead. Use signed rather
than unsigned to avoid upsetting any code that we may have converted.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-21 18:01:20 +02:00
Avi Kivity
16006949d0 logalloc: make migrator an object, not a function pointer
The migrator tells lsa how to move an object when it is compacted.
Currently it is a function pointer, which means we must know how to move
the object at compile time.  Making it an object allows us to build the
migration function at runtime, making it suitable for runtime-defined types
(such as tuples and user-defined types).

In the future, we may also store the size there for fixed-size types,
reducing lsa overhead.

C++ variable templates would have made this patch smaller, but unfortunately
they are only supported on gcc 5+.
2015-10-21 11:24:56 +02:00
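The difference between a function-pointer migrator and an object migrator (16006949d0) can be sketched as follows. This is an illustration only, not the logalloc code: the names `migrator`, `bytes_migrator`, and `relocate` are hypothetical, and carrying the size in the object is an assumption based on the "in the future" remark in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical interface: tells an allocator how to move one object.
class migrator {
public:
    virtual ~migrator() = default;
    virtual void migrate(void* src, void* dst) const = 0;
    virtual std::size_t size() const = 0;
};

// A migrator whose object size is only known at runtime, for example a
// tuple type built from a schema. A plain function pointer cannot carry
// this per-type state; an object can.
class bytes_migrator final : public migrator {
    std::size_t _size;
public:
    explicit bytes_migrator(std::size_t size) : _size(size) {}
    void migrate(void* src, void* dst) const override {
        std::memcpy(dst, src, _size);
    }
    std::size_t size() const override { return _size; }
};

// Stand-in for a compaction step: copies one object into a fresh buffer
// using whatever migrator was built for its type.
inline std::vector<char> relocate(const migrator& m, std::vector<char>& from) {
    std::vector<char> to(m.size());
    m.migrate(from.data(), to.data());
    return to;
}
```

A caller would construct one `bytes_migrator` per runtime-defined type and hand it to the allocator together with the object.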
Avi Kivity
e2cd40e3bc Merge "remove and decommission node support part 2" from Asias
"More preparatory patches for remove and decommission node support:

- stream hints and ranges
- unbootstrap
- replication finished notification"
2015-10-21 12:24:14 +03:00
Asias He
1271ad6894 storage_service: Implement send_replication_notification 2015-10-21 16:11:33 +08:00
Asias He
56e55cd272 storage_proxy: Register replication_finished verb handler 2015-10-21 16:11:33 +08:00
Asias He
ffce7a7af8 storage_service: Implement confirm_replication 2015-10-21 16:11:33 +08:00
Asias He
1965e8751b messaging_service: Add REPLICATION_FINISHED verb
storage_service uses it to send a replication-finished message when
removing a node from a cluster.
2015-10-21 16:11:33 +08:00
Asias He
d903bd0dba storage_service: Implement on_leave_cluster in excise 2015-10-21 14:55:01 +08:00
Asias He
4f57f0cdae storage_service: Enable restore_replica_count in remove_node 2015-10-21 14:55:01 +08:00
Asias He
5b170d1ffe storage_service: Complete handle_state_removing
Implement #if 0'ed code.
2015-10-21 14:55:01 +08:00
Asias He
7d656fe127 storage_service: Enable add_leaving_endpoint in handle_state_leaving 2015-10-21 14:55:01 +08:00
Asias He
d3120b3c2f storage_service: Complete is_replacing logic in handle_state_normal 2015-10-21 14:55:01 +08:00
Asias He
434a9e211e storage_service: Kill one FIXME in handle_state_bootstrap
It is already fixed.
2015-10-21 14:55:01 +08:00
Asias He
6f8f4816a5 storage_service: Implement unbootstrap
All the missing functions for unbootstrap are ready, so we can implement
unbootstrap now.
2015-10-21 14:55:01 +08:00
Asias He
8a9374b331 storage_service: Implement stream_hints
Needed by unbootstrap.
2015-10-21 14:55:01 +08:00
Asias He
3f5e9baa17 storage_service: Implement stream_ranges
Needed by unbootstrap.
2015-10-21 14:55:01 +08:00
Asias He
d1eaccd234 storage_service: Implement leave_ring
Needed by unbootstrap.
2015-10-21 14:55:01 +08:00
Asias He
2f86feb581 storage_service: Move send_replication_notification to source file 2015-10-21 14:55:01 +08:00
Avi Kivity
5453cfbab7 Merge "snapshots: take + clear" from Glauber
"This is the code for taking a snapshot, and clearing a snapshot."
2015-10-21 08:59:42 +03:00
Avi Kivity
cf734132e7 Merge "Flushing of CFs without replay positions" from Calle
"Fixes: #469

We occasionally generate memtables that are not empty, yet have no
high replay_position set. (Typical case is CL replay, but apparently
there are others).

Moreover, we can do this repeatedly, and thus get caught in the flush
queue ordering restrictions.

Solve this by treating a flush without replay_position as a flush at the
highest running position, i.e. "last" in queue. Note that this will not
affect the actual flush operation, nor CL callbacks, only anyone waiting
for the operation(s) to complete.

To do this, the flush_queue had its restrictions eased, and some introspection
methods added."
2015-10-20 17:36:57 +03:00
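The ordering rule described in cf734132e7 can be sketched as follows. This is a simplified model, not the actual flush_queue: `flush_order`, the integer `replay_position`, and the use of a multimap (to allow several flushes at the same position) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>

// Simplified stand-in for a commitlog replay position (assumption).
using replay_position = uint64_t;

// Hypothetical model of the ordering rule: a flush that carries no replay
// position is queued at the highest position currently running, so waiters
// see it as "last" in the queue.
class flush_order {
    // position -> flush id; a multimap lets several flushes share a position
    std::multimap<replay_position, int> _running;
public:
    // Highest position among running flushes, or 0 if none are running.
    replay_position highest_running() const {
        return _running.empty() ? 0 : _running.rbegin()->first;
    }
    // Registers a flush and returns the position it was queued at.
    replay_position start(int id, std::optional<replay_position> rp) {
        replay_position pos = rp ? *rp : highest_running();
        _running.emplace(pos, id);
        return pos;
    }
    std::size_t size() const { return _running.size(); }
};
```

Because repeated position-less flushes all land on the same key, the container has to tolerate duplicates, which is one plausible reading of "restrictions eased".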
Raphael S. Carvalho
93c1035f6e range: add comment explaining !start and !end in lambda
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2015-10-20 17:23:35 +03:00
Raphael S. Carvalho
c3a9d342f4 range: rename overlap to overlaps
overlaps() is more grammatical.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2015-10-20 17:23:35 +03:00
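As a rough illustration of the two range commits above (93c1035f6e, c3a9d342f4), here is a minimal overlaps() over ranges with optional bounds. The `int_range` type is hypothetical, bounds are taken as inclusive, and reading a missing start or end as "unbounded" is an assumption about what the !start and !end checks mean.

```cpp
#include <cassert>
#include <optional>

// Hypothetical simplified range with inclusive bounds.
// A missing bound is assumed to mean "unbounded on that side".
struct int_range {
    std::optional<int> start;
    std::optional<int> end;

    bool overlaps(const int_range& other) const {
        // !start or !other.end: nothing can lie entirely before an
        // unbounded side, so that half of the test passes trivially.
        bool this_starts_in_time =
            !start || !other.end || *start <= *other.end;
        bool other_starts_in_time =
            !other.start || !end || *other.start <= *end;
        return this_starts_in_time && other_starts_in_time;
    }
};
```

For example, `int_range{1, 5}.overlaps(int_range{5, 9})` holds under inclusive bounds, while `int_range{1, 5}` and `int_range{6, std::nullopt}` do not overlap.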
Avi Kivity
f4bd089b83 Merge "remove and decommission node" from Asias
"Preparatory patch for remove and decommission node support"
2015-10-20 17:00:04 +03:00
Glauber Costa
3a95a9cbe6 api: implement clear snapshot
Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:58:31 +02:00
Glauber Costa
19bb50f450 api: implement take_snapshot
Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:58:31 +02:00
Glauber Costa
21f84d77fc storage_service: delete a snapshot
This patch provides a storage service API to delete a snapshot. Because all
keyspaces and CFs are visible in all shards, we can fetch the list of
keyspaces in the present shard and issue the filesystem operations in
that same shard.

That simplifies the code tremendously, and because there are no operations
that must precede the fs ones (unlike in the case of create snapshot), we
need no synchronization. Even easier.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:58:31 +02:00
Glauber Costa
2f2a4e83e0 storage_service: take a snapshot of a particular column family
Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:58:30 +02:00
Glauber Costa
fe3164714f storage_service: take a snapshot of a group of keyspaces
Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:58:30 +02:00
Glauber Costa
d236b01b48 snapshots: check existence of snapshots
We go to the filesystem to check if the snapshot exists. This should make us
robust against deletions of existing snapshots from the filesystem.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:58:26 +02:00
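The idea in d236b01b48, asking the filesystem rather than trusting in-memory state, might look like the sketch below. It uses std::filesystem instead of the project's own I/O layer, and the `<table_dir>/snapshots/<tag>` layout and the name `snapshot_exists` are assumptions.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical layout: <table_dir>/snapshots/<tag>.
// Checks the directory on disk every time, so a snapshot deleted behind
// our back is reported as missing instead of being remembered.
inline bool snapshot_exists(const fs::path& table_dir, const std::string& tag) {
    std::error_code ec;  // non-throwing overload: errors read as "not there"
    return fs::is_directory(table_dir / "snapshots" / tag, ec);
}
```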
Glauber Costa
d3aef2c1a5 database: support clear snapshot
This allows us to delete an existing snapshot. It works at the column
family level, and removing it from the list of keyspace snapshots needs to
happen only when all CFs are processed. Therefore, that is provided as a
separate operation.

The filesystem code is a bit ugly: it can be made better by making our file
lister more generic. First step would be to call it walker, not lister...

For now, we'll use the fact that there are mostly two levels in the snapshot
hierarchy to our advantage and avoid a full recursion. Using the same lambda
for all calls would require us to provide a separate class to handle the state,
which is part of making this generic.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:38:14 +02:00
Glauber Costa
500ee99c93 file lister: allow for more than one directory type
There are situations in which we would like to match more than one directory
entry type. One example is a recursive delete operation: we need to
delete the files inside directories and the directories themselves, but we
still don't want a "delete all", since finding anything other than a directory
or a file is an error, and we should treat it as such.

Since there aren't that many types, it should be ok performance-wise to just
use a list. I am using an unordered_set here just because it is easy enough,
but we could actually relax it later if needed. In any case, users of the
interface should not worry about that, and that decision is abstracted away
into lister::dir_entry_types.

Signed-off-by: Glauber Costa <glommer@scylladb.com>
2015-10-20 15:38:14 +02:00
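A set of accepted entry types, as described in 500ee99c93, could be sketched like this. It is not the project's lister: it uses std::filesystem, and `entry_type`, `classify`, `list_dir`, and `remove_tree` are hypothetical names. Only the idea of a `dir_entry_types` set comes from the commit message.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <stdexcept>
#include <string>
#include <unordered_set>
#include <vector>

namespace fs = std::filesystem;

enum class entry_type { regular, directory, other };

// The caller names every type it is prepared to see.
using dir_entry_types = std::unordered_set<entry_type>;

inline entry_type classify(const fs::directory_entry& e) {
    if (e.is_directory()) return entry_type::directory;
    if (e.is_regular_file()) return entry_type::regular;
    return entry_type::other;
}

// Lists the entries of `dir`. Any entry whose type is not in `wanted`
// is treated as an error rather than silently skipped.
inline std::vector<fs::path> list_dir(const fs::path& dir,
                                      const dir_entry_types& wanted) {
    std::vector<fs::path> out;
    for (const auto& e : fs::directory_iterator(dir)) {
        if (wanted.count(classify(e)) == 0) {
            throw std::runtime_error("unexpected entry: " + e.path().string());
        }
        out.push_back(e.path());
    }
    return out;
}

// Recursive delete: accepts files and directories, nothing else.
inline void remove_tree(const fs::path& dir) {
    for (const auto& p : list_dir(dir, {entry_type::regular,
                                        entry_type::directory})) {
        if (fs::is_directory(p)) {
            remove_tree(p);
        } else {
            fs::remove(p);
        }
    }
    fs::remove(dir);  // the directory is empty by now
}
```

Collecting the paths before deleting anything keeps the directory iterator from being invalidated mid-walk.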
Asias He
9e5ee17f4a storage_service: Implement rebuild 2015-10-20 21:32:30 +08:00
Asias He
31b50a83d1 storage_service: Implement excise 2015-10-20 21:32:30 +08:00
Asias He
0cf112501e storage_service: Implement restore_replica_count 2015-10-20 21:32:30 +08:00
Asias He
0ebcb1ddef storage_service: Stub send_replication_notification
Needed by restore_replica_count.
2015-10-20 21:32:30 +08:00
Asias He
1c480554eb storage_service: Stub get_new_source_ranges
Needed by restore_replica_count.
2015-10-20 21:32:30 +08:00
Asias He
955e766a49 storage_service: Partially implement decommission 2015-10-20 21:32:30 +08:00
Asias He
893849f8af storage_service: Stub unbootstrap 2015-10-20 21:32:30 +08:00
Asias He
9ebae12614 storage_service: Partially implement remove_node
restoreReplicaCount and excise are missing.
2015-10-20 21:32:30 +08:00
Asias He
142f29483a token_metadata: Implement add_leaving_endpoint 2015-10-20 21:32:30 +08:00
Asias He
f1bc882b90 storage_service: Implement get_changed_ranges_for_leaving 2015-10-20 21:32:30 +08:00
Asias He
937474bf14 abstract_replication_strategy: Make calculate_natural_endpoints public
It is used by storage_service.
2015-10-20 21:32:30 +08:00
Asias He
c5e35ac57e storage_service: Enable _replicating_nodes and _removing_node members 2015-10-20 21:32:30 +08:00
Asias He
f30fbd53ff storage_service: Start to use pending_range_calculator_service 2015-10-20 21:32:30 +08:00
Asias He
934c963d85 init: Init pending_range_calculator_service 2015-10-20 21:32:29 +08:00
Asias He
8d6200c036 service: Convert PendingRangeCalculatorService.java to C++ 2015-10-20 21:32:29 +08:00
Asias He
a5d91519f2 service: Import PendingRangeCalculatorService.java 2015-10-20 21:32:29 +08:00
Asias He
c96bc8bbd2 token_metadata: Implement calculate_pending_ranges 2015-10-20 21:32:14 +08:00
Asias He
a6065397d9 token_metadata: Implement clone_after_all_left 2015-10-20 20:38:43 +08:00