To stop a compaction manager task, we first close the gate
used by compaction and then break the semaphore via semaphore::broken().
The problem is that semaphore::broken() only signals current waiters,
so a subsequent semaphore::wait() call would still succeed and the task
would remain alive forever.
The fix is to signal the semaphore instead, forcing the task to exit via
a gate exception, so we no longer rely on semaphore::broken() for
finishing the task. This works because the task accesses the
gate right after waiting on the semaphore.
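A minimal sketch of the resulting protocol (the names _stop_gate,
_compaction_sem and do_next_compaction() are illustrative, not the actual
compaction manager code):

    #include <seastar/core/future-util.hh>
    #include <seastar/core/gate.hh>
    #include <seastar/core/semaphore.hh>

    using namespace seastar;

    class stoppable_task {
        gate _stop_gate;
        semaphore _compaction_sem{0};
        future<> do_next_compaction() { return make_ready_future<>(); }
    public:
        // The task enters the gate right after waking up from the
        // semaphore, so it is guaranteed to see the closed gate on stop.
        future<> run() {
            return keep_doing([this] {
                return _compaction_sem.wait().then([this] {
                    return with_gate(_stop_gate, [this] {
                        return do_next_compaction();
                    });
                });
            }).handle_exception_type([] (const gate_closed_exception&) {
                // Expected when stop() runs; exit the loop cleanly.
            });
        }
        future<> stop() {
            auto f = _stop_gate.close();
            _compaction_sem.signal(); // wake the task so it hits the closed gate
            return f;
        }
    };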
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
Add an abstract_replication_strategy::get_primary_ranges() method, which is
very similar to the existing get_ranges(), except that only the "primary"
owner of each range will return it in its list.
This is needed for the "primary range" repair option, which asks to repair
only the primary range. This option is useful when the user plans to start
a repair on *all* nodes: we shouldn't repair the same token range multiple
times, so each range should be repaired by only one of the nodes.
abstract_replication_strategy::get_primary_ranges() is similar to Origin's
StorageService.getPrimaryRangesForEndpoint().
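A hedged sketch of the idea (token_range, endpoint and natural_endpoints()
are illustrative stand-ins for the real types and replica lookup):

    #include <string>
    #include <vector>

    struct token_range { long start, end; };
    using endpoint = std::string;

    // Hypothetical replica lookup; the first element is the range's
    // "primary" owner.
    std::vector<endpoint> natural_endpoints(const token_range& r);

    // A node's primary ranges are the ranges for which it is the first
    // natural replica, so each range is primary on exactly one node.
    std::vector<token_range> get_primary_ranges(const endpoint& ep,
            const std::vector<token_range>& all_ranges) {
        std::vector<token_range> ret;
        for (const auto& r : all_ranges) {
            auto eps = natural_endpoints(r);
            if (!eps.empty() && eps.front() == ep) {
                ret.push_back(r);
            }
        }
        return ret;
    }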
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
There can be multiple sends underway when the first one detects an error
and destroys the rpc client, even though the client is still in use by the
other sends. Fix this by making the rpc client pointer shared and holding
a reference to it for each send operation.
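A hedged sketch of the shape of the fix (rpc_client here is a stand-in,
not the real rpc types):

    #include <seastar/core/future.hh>
    #include <seastar/core/shared_ptr.hh>

    using namespace seastar;

    struct rpc_client {
        future<> send() { return make_ready_future<>(); }
    };

    class messaging {
        lw_shared_ptr<rpc_client> _client = make_lw_shared<rpc_client>();
    public:
        future<> send_one() {
            auto client = _client; // each send holds its own reference
            return client->send().handle_exception([this, client] (std::exception_ptr ep) {
                if (_client == client) {
                    _client = nullptr; // drop the master reference; the client
                                       // lives until the last in-flight send ends
                }
                return make_exception_future<>(std::move(ep));
            });
        }
    };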
Origin forbids empty values in a clustering key only if that clustering
key is non-composite (i.e. there is only one column).
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
Requested by Avi. The added benefit is that the code for repairing
all the ranges in parallel is now identical to the code for repairing
the ranges one by one - just replace do_for_each with parallel_for_each -
with no need for a different implementation using semaphores like I had
before this patch.
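A minimal sketch of the swap (range and repair_range() are hypothetical
stand-ins for the repair code):

    #include <vector>
    #include <seastar/core/future-util.hh>

    using namespace seastar;

    struct range {};               // stand-in for the token-range type
    future<> repair_range(range&); // hypothetical per-range repair

    // Sequential: repair the ranges one by one.
    future<> repair_sequential(std::vector<range>& ranges) {
        return do_for_each(ranges, [] (range& r) { return repair_range(r); });
    }

    // Parallel: identical code, with do_for_each swapped for parallel_for_each.
    future<> repair_parallel(std::vector<range>& ranges) {
        return parallel_for_each(ranges, [] (range& r) { return repair_range(r); });
    }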
Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
If we don't yield, we can run out of memory while moving a memtable into
the cache.
This reduces the chance that writing an sstable will fail because we could
not transfer the memtable into the cache.
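A minimal sketch of the yielding pattern, using a seastar::thread context
(partition and insert_into_cache() are illustrative, not the actual cache
update code):

    #include <vector>
    #include <seastar/core/thread.hh>

    struct partition {};                 // stand-in for a memtable entry
    void insert_into_cache(partition&);  // hypothetical synchronous insert

    seastar::future<> move_to_cache(std::vector<partition> parts) {
        return seastar::async([parts = std::move(parts)] () mutable {
            for (auto& p : parts) {
                insert_into_cache(p);
                // Yield if the task quota is exhausted, so other tasks
                // (including memory reclaim) can run in between.
                seastar::thread::maybe_yield();
            }
        });
    }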
If the sleep time isn't enough for the compaction manager to select the
submitted cf for compaction, the test will fail: the compaction will not
take place and the subsequent checks will fail.
The solution is to sleep until the required condition becomes true.
Problem and solution found by Shlomi.
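A minimal sketch of the polling approach (compaction_completed() stands in
for the actual test condition):

    #include <chrono>
    #include <seastar/core/future-util.hh>
    #include <seastar/core/sleep.hh>

    using namespace seastar;

    bool compaction_completed(); // hypothetical test predicate

    // Instead of a fixed sleep, poll until the compaction manager has
    // actually picked up and compacted the submitted cf.
    future<> wait_for_compaction() {
        return do_until([] { return compaction_completed(); }, [] {
            return sleep(std::chrono::milliseconds(100));
        });
    }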
Signed-off-by: Raphael S. Carvalho <raphaelsc@cloudius-systems.com>
We are printing out error messages when a remote connection is closed:
ERROR [shard 0] gossip - Fail to send GossipDigestACK2 to 127.0.0.2:0: rpc::closed_error (connection is closed)
ERROR [shard 0] gossip - Fail to handle GOSSIP_DIGEST_ACK: rpc::closed_error (connection is closed)
WARN [shard 0] unimplemented
This is causing issues with DTEST, which validates after finishing a run
that there are no ERRORs in the log.
The rule is:
If we can handle the error correctly -> log warn
If we cannot handle the error correctly -> log error
Fixes #144
any_cast<X> is supposed to return X, but boost 1.55's any_cast<X> returns
X&&. This means the lifetime-extending construct
auto&& x = boost::any_cast<X>(...);
will not work, because the result of the expression is an rvalue reference,
not a true temporary.
Fix by using a temporary, not a lifetime-extending reference.
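A before/after sketch (X and use() are illustrative):

    #include <boost/any.hpp>

    struct X { int v = 0; };
    void use(const X&);

    void example(const boost::any& a) {
        // Broken under boost 1.55, where any_cast<X> returns X&&:
        //   auto&& x = boost::any_cast<X>(a);
        // The result is an rvalue reference, not a true temporary, so
        // auto&& has nothing to lifetime-extend and x may dangle.

        // Fix: bind by value, creating a real local object.
        auto x = boost::any_cast<X>(a);
        use(x);
    }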
Fixes #163.
* seastar 69edf16...2afc6c8 (3):
> Rebase dpdk to v2.1.0
> future: don't use get() in future_state::forward_to()
> future: add get_value(), and use it in then()