This patch provides an storage service api to delete an snapshot. Because all
keyspaces and CFs are visible in all shards. This will allow us to fetch the
list of keyspaces in the present shard and issue the filesystem operations in
that same shard.
That simplifies the code tremendously, and because there are not any operations
we need to do previous to the fs ones (like in the case of create snapshot), we
need no synchronization. Even easier.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
We go to the filesystem to check if the snapshot exists. This should make us
robust against deletions of existing snapshots from the filesystem.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
This allows for us to delete an existing snapshot. It works at the column
family level, and removing it from the list of keyspace snapshots needs to
happen only when all CFs are processed. Therefore, that is provided as a
separate operation.
The filesystem code is a bit ugly: it can be made better by making our file
lister more generic. First step would be to call it walker, not lister...
For now, we'll use the fact that there are mostly two levels in the snapshot
hierarchy to our advantage, and avoid a full recursion - using the same lambda
for all calls would require us to provide a separate class to handle the state,
that's part of making this generic.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
There are situations in which we would like to match more than one directory
type. One example of that, would be a recursive delete operation: we need to
delete the files inside directories and the directories themselves, but we
still don't want a "delete all" since finding anything other than a directory
or a file is an error, and we should treat it as such.
Since there aren't that many times, it should be ok performance wise to just
use a list. I am using an unordered_set here just because it is easy enough,
but we could actually relax it later if needed. In any case, users of the
interface should not worry about that, and that decision is abstracted away
into lister::dir_entry_types.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
This is certainly the right thing to do and seems to fix#403. However
I didn't manage to convince myself that this would cause problems for
binomial_heap, given that binomial_heap::erase() calls siftup()
anyway:
void erase(handle_type handle)
{
node_pointer n = handle.node_;
siftup(n, force_inf());
top_element = n;
pop();
}
void increase (handle_type handle)
{
node_pointer n = handle.node_;
siftup(n, *this);
update_top_element();
sanity_check();
}
There was a confusion between the snapshot key and the keyspace in the
snapshot details, this fixes it.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
"Those are fixes needed for the snapshotting process itself. I have bundled this
in the create_snapshot series before to avoid a rebase, but since I will have to
rewrite that to get rid of the snapshot manager (and go to the filesystem),
I am sending those out on their own."
This adds a workaround for the get schema_version, it will return only a
shcema version of the local node, this is a temporary workaround until
describe_schema_versions will be implemented.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This adds the implementation for the get cluster name and get
partitioner name to the storage_service API.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The API can call any of the gossiper shareds to get the cluster name, so
the initilization needs to set it in all of them.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
We occasionally generate memtables that are not empty, yet have no
high replay_position set. (Typical case is CL replay, but apparently
there are others).
Moreover, we can do this repeatedly, and thus get caught in the flush
queue ordering restrictions.
Solve this by treating a flush without replay_position as a flush at the
highest running position, i.e. "last" in queue. Note that this will not
affect the actual flush operation, nor CL callbacks, only anyone waiting
for the operation(s) to complete.
As long as we guarantee that the execution order for the post ops are
upheld, we can allow insertion of multiple ops on the same key.
Implemented by adding a ref count to each position.
The restriction then becomes that an added key must either be larger
than any already existing key, _OR_ already exist. In the latter case,
we still know that we have not finished this position and signaled
"upwards".
test_setup::do_with_test_directory is missing. For some reason,
the test wasn't failing without it until now. Adding it is the
correct thing to do anyway.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
"This patchset introduces leveled compaction to Scylla.
We don't handle all corner cases yet, but we already have the strategy
and compaction working as expected. Test cases were written and I also
tested the stability with a load of cassandra-stress.
Leveled compaction may output more than one sstable because there is
a limit on the size of sstables. 160M by default.
Related to handling of partial compaction, it's still something to be
worked on.
Anyway, it will not be a big problem. Why? Suppose that a leveled
compaction will generate 2 sstables, and scylla is interrupted after
the first sstable is completely written but before the second one is
completely written. The next boot will delete the second sstable,
because it was partially written, but will not do anything with the
first one as it was completely written.
As a result, we will have two sstables with redundant data."
This patch adds the ability to set one or all log levels get a log level
and get all logs name.
After this patch the following url will be available:
GET/POST
/system/logger
/system/logger/{name}
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
The system api will include system related command, currently it holds
the logger related API. It holds definition for the following commands:
get_all_logger_names
set_all_logger_level
get_logger_level
set_logger_level
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
This is a helper function that returns a log level name. It will be used
by the API to report the log levels.
Signed-off-by: Amnon Heiman <amnon@cloudius-systems.com>
With the distribute-and-sync method we are using, if an exception happens in
the snapshot creation for any reason (think file permissions, etc), that will
just hang the server since our shard won't do the necessary work to
synchronize and note that we done our part (or tried to) in snapshot creation.
Make the then clause a finally, so that the sync part is always executed.
Signed-off-by: Glauber Costa <glommer@scylladb.com>
create_links will fail in one of the shards if one of the SSTables happen to be
shared. It should be fine if the link already exists, so let's just ignore that case.
Signed-off-by: Glauber Costa <glommer@scylladb.com>