Wrapping ranges are a pain, so we are moving wrap handling to the edges.
Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.
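The merge the last sentence describes can be sketched as follows (illustrative types only, not Scylla's API; a range here is (start, end] with std::nullopt standing in for -inf/+inf):

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

using token = long;
struct rng {
    std::optional<token> start; // exclusive bound; nullopt = -inf
    std::optional<token> end;   // inclusive bound; nullopt = +inf
};

// Convert a sorted, non-wrapping range list back into ring form: the
// unbounded first and last ranges are merged into the single wrapping
// range (start_of_last, end_of_first].
std::vector<std::pair<token, token>> to_ring(const std::vector<rng>& ranges) {
    std::vector<std::pair<token, token>> ring;
    // Merged wrapping range from the two edge ranges.
    ring.push_back({*ranges.back().start, *ranges.front().end});
    // Interior ranges carry over unchanged.
    for (size_t i = 1; i + 1 < ranges.size(); ++i) {
        ring.push_back({*ranges[i].start, *ranges[i].end});
    }
    return ring;
}
```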
Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
Change abstract_replication_strategy::create_replication_strategy() to
throw exceptions::configuration_error if the replication strategy class
lookup fails, to make sure the error is converted to the correct CQL response.
Fixes #1755
Message-Id: <1476361262-28723-1-git-send-email-penberg@scylladb.com>
pending_endpoints_for is called frequently by
storage_proxy::create_write_response_handler when doing cql query.
Before this patch, each call to pending_endpoints_for involves
converting a multimap (std::unordered_multimap<range<token>,
inet_address>>) to map (std::unordered_map<range<token>,
std::unordered_set<inet_address>>).
To speed up the token to pending endpoint mapping search, an interval map
is introduced. It is faster than searching the map linearly and avoids
the need to cache the token/pending endpoint mapping.
With this patch, the drop in operations per second while a node is
being added is much smaller.
Before:
45K to 10K
After:
45K to 38K
(The numbers are measured with the streaming code skipping sending data,
to rule out the streaming factor.)
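A minimal sketch of the lookup idea (illustrative types, not the actual Scylla interval map): keying each non-wrapping range (start, end] by its end token turns the token-to-pending-endpoints lookup into a single tree search instead of a linear scan:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

using token = long;
using endpoint = std::string;

struct pending_map {
    // key: end of range (inclusive); value: {start (exclusive), endpoints}
    std::map<token, std::pair<token, std::set<endpoint>>> by_end;

    void add(token start, token end, endpoint ep) {
        auto& e = by_end[end];
        e.first = start;
        e.second.insert(std::move(ep));
    }

    // Endpoints pending for the range containing t, O(log n), or empty set.
    std::set<endpoint> pending_for(token t) const {
        auto it = by_end.lower_bound(t); // first range with end >= t
        if (it != by_end.end() && t > it->second.first) {
            return it->second.second;
        }
        return {};
    }
};
```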
Refs: #1223
Change the name used with class_registrator from "EverywhereReplicationStrategy"
(used in the initial patch from CASSANDRA-826 JIRA) to "EverywhereStrategy"
as it is in the current DCE code.
With this change one will be able to create an instance of the
everywhere_replication_strategy class by giving either
an "org.apache.cassandra.locator.EverywhereStrategy" (full name) or
an "EverywhereStrategy" (short name) as a replication strategy name.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1456081258-937-1-git-send-email-vladz@cloudius-systems.com>
This strategy ignores the RF configuration and always
tries to replicate on all cluster nodes.
This means that its get_replication_factor() returns the
number of currently "known" nodes in the cluster, and
if the cluster is currently bootstrapping this value may obviously
change over time for the same key. Therefore this strategy
should be used with caution.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1456074333-15014-3-git-send-email-vladz@cloudius-systems.com>
Return the number of currently known endpoints when
it's needed in a fast-path flow.
Calling get_all_endpoints().size() for that purpose
would not be fast enough because of the unordered_set->vector
transformation we don't need.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1456074333-15014-2-git-send-email-vladz@cloudius-systems.com>
We can't move-from in the loop because the subject will be empty in
all but the first iteration.
Fixes crash during node startup:
"Exiting on unhandled exception of type 'runtime_exception': runtime error: Invalid token. Should have size 8, has size 0"
Fixes update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_add_node_1_test (and probably others)
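An illustrative reduction of the bug (hypothetical names, not the actual patch): moving from the same object inside a loop leaves it empty (here: zero-length token bytes) for every iteration after the first.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using bytes = std::vector<char>;

// Buggy shape: tok is moved from on the first iteration, so every later
// iteration pushes an empty value ("Invalid token ... has size 0").
std::vector<bytes> broadcast_broken(bytes tok, int n) {
    std::vector<bytes> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(std::move(tok)); // tok is empty from the 2nd iteration on
    }
    return out;
}

// Fixed shape: copy inside the loop; every iteration gets the full value.
std::vector<bytes> broadcast_fixed(const bytes& tok, int n) {
    std::vector<bytes> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(tok);
    }
    return out;
}
```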
Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
Like we did in commit d54c77d5d0,
make the remaining functions in abstract_replication_strategy return
non-wrap-around ranges.
This fixes:
ERROR [shard 0] stream_session - [Stream #f0b7fda0-cf3e-11e5-b6c4-000000000000]
stream_transfer_task: Fail to send to 127.0.0.4:0: std::runtime_error (Not implemented: WRAP_AROUND)
in streaming.
Message-Id: <514d2a9a1d3b868d213464c8858ac5162c0338d8.1455093643.git.asias@scylladb.com>
The main motivation behind this change is to make get_ranges() easier for
consumers to work with the returned ranges, e.g. binary search to find a
range in which a token is contained. In addition, a wrap-around range
introduces corner cases, so we should avoid it altogether.
Suppose that a node owns three tokens: -5, 6, 8
get_ranges() would return the following ranges:
(8, -5], (-5, 6], (6, 8]
get_ranges() will now return the following ranges:
(-inf, -5], (-5, 6], (6, 8], (8, +inf)
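The consumer-side win can be sketched like this (illustrative types, not Scylla's; sentinel min/max stand in for -inf/+inf): with sorted, non-wrapping, contiguous ranges, the range owning a token is found with one binary search on the range end bounds.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

using token = long;
constexpr token NEG_INF = std::numeric_limits<token>::min();
constexpr token POS_INF = std::numeric_limits<token>::max();

struct rng { token start; token end; }; // (start, end]

// ranges must be sorted and cover (-inf, +inf) contiguously.
// Returns the index of the range containing t.
size_t find_range(const std::vector<rng>& ranges, token t) {
    auto it = std::lower_bound(ranges.begin(), ranges.end(), t,
        [](const rng& r, token v) { return r.end < v; });
    return it - ranges.begin();
}
```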
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <4bda1428d1ebbe7c8af25aa65119edc5b97bc2eb.1453827605.git.raphaelsc@scylladb.com>
Don't demand the messaging_service version to be the same on both
sides of the connection in order to use internal addresses.
Upstream has a similar change for CASSANDRA-6702 in commit a7cae32 ("Fix
ReconnectableSnitch reconnecting to peers during upgrade").
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1452686729-32629-1-git-send-email-vladz@cloudius-systems.com>
Because our shutdown process is crippled (refs #293), we won't shut down the
snitch correctly, and the sharded<> instance can assert during shutdown.
This interferes with the next patch, which adds orderly shutdown if the http
server fails to start.
Leak it intentionally to work around the problem.
Message-Id: <1452092806-11508-2-git-send-email-avi@scylladb.com>
We have an API that wraps open_file_dma which we use in some places, but in
many other places we call the reactor version directly.
This patch changes the latter to match the former. It will have the added benefit
of allowing us to make easier changes to these interfaces if needed.
Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <29296e4ec6f5e84361992028fe3f27adc569f139.1451950408.git.glauber@scylladb.com>
When replacing a node, we might ignore the tokens, leaving the token set
empty. In this case, we will have
std::unordered_map<inet_address, std::unordered_set<token>> = {ip, {}}
passed to token_metadata::update_normal_tokens(std::unordered_map<inet_address,
std::unordered_set<token>>& endpoint_tokens)
and hit the assert
assert(!tokens.empty());
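The shape of the fix can be sketched as a filter applied before the call (hypothetical helper name; drop_empty_token_sets is not the actual patch): entries with an empty token set must never reach update_normal_tokens(), where the assert fires.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <unordered_map>
#include <unordered_set>

using inet_address = std::string;
using token = long;
using endpoint_tokens_map =
    std::unordered_map<inet_address, std::unordered_set<token>>;

// Hypothetical helper: remove entries whose token set is empty so that
// token_metadata::update_normal_tokens() (which asserts !tokens.empty())
// cannot be hit when a replace operation ignores the tokens.
endpoint_tokens_map drop_empty_token_sets(endpoint_tokens_map m) {
    for (auto it = m.begin(); it != m.end(); ) {
        it = it->second.empty() ? m.erase(it) : std::next(it);
    }
    return m;
}
```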
The get_token_endpoint API should return a map of tokens to endpoints,
including the bootstrapping ones.
Use get_local_storage_service().get_token_to_endpoint_map() for it.
$ nodetool -p 7100 status
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 127.0.0.1 12645 256 ? eac5b6cf-5fda-4447-8104-a7bf3b773aba rack1
UN 127.0.0.2 12635 256 ? 2ad1b7df-c8ad-4cbc-b1f1-059121d2f0c7 rack1
UN 127.0.0.3 12624 256 ? 61f82ea7-637d-4083-acc9-567e0c01b490 rack1
UJ 127.0.0.4 ? 256 ? ced2725e-a5a4-4ac3-86de-e1c66cecfb8d rack1
Fixes #617
With this patch, start two nodes
node 1:
scylla --rpc-address 127.0.0.1 --broadcast-rpc-address 127.0.0.11
node 2:
scylla --rpc-address 127.0.0.2 --broadcast-rpc-address 127.0.0.12
On node 1:
cqlsh> SELECT rpc_address from system.peers;
rpc_address
-------------
127.0.0.12
which means clients should use this address to connect to node 2 over the
CQL and Thrift protocols.
Don't ignore yet another returned future in reload_configuration().
Since commit 5e8037b50a
storage_service::gossip_snitch_info() returns a future.
This patch takes this into account.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
When we access a gossiper instance we use the _gossip_started
state of a snitch, which is set in the gossiper_starting() method.
However, gossiper_starting() is invoked by the gossiper on CPU0
only, therefore the _gossip_started snitch state will be set only
for the instance on CPU0.
So instead of synchronizing the _gossip_started state between
all shards, we just have to make sure we check it on the right CPU,
which is CPU0.
This patch fixes this issue.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Align the handling of the prefer_local parameter read from a snitch
property file with that of similar parameters (e.g. dc and rack):
it is read once and its value is distributed (copied) across all shards'
instances.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Make reload_gossiper_state() a virtual method
of the base class in order to allow calling it through a snitch_ptr
handle.
The base class already has a ton of virtual methods, so no harm is
done performance-wise. Using a virtual method instead of a
dynamic_cast results in much cleaner code, however.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Move the member and add an access method.
This is needed in order to be able to access this state through a
snitch_ptr handle.
This also allows us to get rid of the ec2_multi_region_snitch::_helper_added
member, since it duplicates the _gossip_started semantics.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
These functions were empty; now they have the intended code:
- Register the reconnectable_snitch_helper if "prefer_local"
parameter was given the TRUE value.
- Set the application INTERNAL_IP state to listen_address().
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
- Invoke reload_gossiper_state() and gossip_snitch_info() on CPU0: since
  the gossiper effectively runs on CPU0, all methods modifying its state
  should be invoked on CPU0 as well.
- Don't invoke any method on external "distributed" objects unless their
  corresponding per-shard service objects have already been initialized.
- Update the local Node info in storage_service::token_metadata::topology
  when reloading the snitch configuration if the DC and/or Rack info has
  changed.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Introduce a subscribers_list class that exposes three methods:
- push_back(s) - adds a new element s to the back of the list
- remove(s) - removes the element s from the list
- for_each(f) - invokes f on each element of the list
Also make subscribers_list store a shared_ptr to each subscriber
to allow removal (currently it stores a naked pointer to the object).
subscribers_list allows push_back() and remove() to be called while
another thread (e.g. seastar::async()) is in the middle of for_each().
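A minimal sketch of the described interface (illustrative only, not the Scylla implementation, and it does not reproduce the concurrent for_each()/remove() guarantee): storing shared_ptr keeps an element alive even if remove() runs while a for_each() elsewhere still references it.

```cpp
#include <cassert>
#include <list>
#include <memory>
#include <utility>

template <typename T>
class subscribers_list {
    std::list<std::shared_ptr<T>> _items;
public:
    // Adds a new element to the back of the list.
    void push_back(std::shared_ptr<T> s) { _items.push_back(std::move(s)); }
    // Removes the element from the list; the shared_ptr keeps the object
    // alive for anyone still iterating over it.
    void remove(const std::shared_ptr<T>& s) { _items.remove(s); }
    // Invokes f on each element of the list.
    template <typename Func>
    void for_each(Func&& f) {
        for (auto& s : _items) {
            f(*s);
        }
    }
};
```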
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v2:
- Simplify subscribers_list::remove() method.
- load_broadcaster: inherit from enable_shared_from_this instead
of async_sharded_service.
In addition to what EC2Snitch does, this snitch registers
a reconnectable_snitch_helper that makes messaging_service
connect to internal IPs when it connects to nodes in the same
data center as the current node.
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v4:
- Added dual license in newly added files.
New in v3:
- Returned the Apache license.
New in v2:
- Update the license to the latest version. ;)
reconnectable_snitch_helper implements i_endpoint_state_change_subscriber
and triggers a reconnect using the internal IP to nodes in the
same data center when one of the following events happens:
- on_join()
- on_change() - when INTERNAL_IP state is changed
- on_alive()
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
New in v4:
- Added dual license for newly added files.
New in v3:
- Fix reconnect() logic.
- Returned the Apache license.
- Check if the new local address is not already stored in the cache.
- Get rid of get_ep_addr().
New in v2:
- Update the license to the latest version. ;)