178 Commits

Author SHA1 Message Date
Duarte Nunes
2f05d7423a locator/reconnectable_snitch_helper: Avoid versioned_value copies
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-10-11 10:02:32 +01:00
Duarte Nunes
28d63a76df locator/production_snitch_base: Cleanup get_endpoint_info()
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-10-11 10:02:32 +01:00
Duarte Nunes
ceebbe14cc gossiper: Avoid endpoint_state copies
gossiper::get_endpoint_state_for_endpoint() returns a copy of
endpoint_state, which we've seen can be very expensive.

This patch adds a similar function which returns a pointer instead,
and changes the call sites where using the pointer-returning variant
is deemed safe (the pointer neither escapes the function, nor crosses
any defer point).
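
The difference between the two lookup styles can be sketched as follows. This is a minimal illustration, not the gossiper's actual code: `endpoint_state` here is a simplified stand-in, and `gossiper_sketch`, `get_state_copy` and `get_state_ptr` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for endpoint_state: expensive to copy because it
// owns a map of strings.
struct endpoint_state {
    std::map<int, std::string> application_state;
    int heartbeat = 0;
};

// Hypothetical, simplified holder of per-endpoint state.
class gossiper_sketch {
    std::unordered_map<std::string, endpoint_state> _states;
public:
    void set_state(const std::string& ep, endpoint_state st) {
        _states[ep] = std::move(st);
    }

    // Copy-returning lookup: every call copies the whole endpoint_state.
    // Returns a default-constructed value for an unknown endpoint.
    endpoint_state get_state_copy(const std::string& ep) const {
        auto it = _states.find(ep);
        if (it == _states.end()) {
            return endpoint_state{};
        }
        return it->second;
    }

    // Pointer-returning lookup: no copy; nullptr means "unknown endpoint".
    // The pointer is only valid while the entry stays in _states, so the
    // caller must not store it or hold it across a point where other code
    // could remove the entry.
    const endpoint_state* get_state_ptr(const std::string& ep) const {
        auto it = _states.find(ep);
        if (it == _states.end()) {
            return nullptr;
        }
        return &it->second;
    }
};
```

The pointer variant also makes the "not found" case explicit, which the copy variant in this sketch can only express with a default value.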

Fixes #764

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-10-10 13:48:02 +01:00
Tomasz Grabiec
46c7e06e56 locator: Optimize token_metadata::is_member()
Currently it's linear in the number of tokens in the system in the
worst case. We could use the knowledge which _topology has to make it
O(1).
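
As a rough illustration of the complexity difference, the sketch below compares a scan over a token-to-endpoint map with a lookup in a separately maintained endpoint set. It is an assumption-laden stand-in: `ring_sketch` and its members are hypothetical and do not reflect how token_metadata or _topology are actually laid out.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <string>
#include <unordered_set>

// Hypothetical, simplified ring bookkeeping.
struct ring_sketch {
    std::map<int64_t, std::string> token_to_endpoint;
    // Set of known endpoints, kept in sync when tokens are added.
    // Simplification: removal and token reassignment are not handled.
    std::unordered_set<std::string> endpoints;

    void add_token(int64_t token, const std::string& ep) {
        token_to_endpoint[token] = ep;
        endpoints.insert(ep);
    }

    // Linear in the number of tokens in the worst case.
    bool is_member_linear(const std::string& ep) const {
        return std::any_of(token_to_endpoint.begin(), token_to_endpoint.end(),
                           [&](const auto& kv) { return kv.second == ep; });
    }

    // Average O(1): a single hash lookup.
    bool is_member_fast(const std::string& ep) const {
        return endpoints.count(ep) != 0;
    }
};
```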

Fixes #2873.

Message-Id: <1507630182-13410-1-git-send-email-tgrabiec@scylladb.com>
2017-10-10 14:27:54 +03:00
Calle Wilund
dd2b8821a4 everywhere_strategy: Make get_natural_endpoints handle non-init state
Make get_natural_endpoints return the local address iff token metadata
is not yet set up (since that is the one address we already know of).

If a request has a consistency level requiring more endpoints, it
will still fail, but calls with, for example, CL=ONE will succeed at
startup and act more or less like the local strategy, while data is
still distributed as desired further down the line.

Acked-by: Gleb Natapov <gleb@scylladb.com>
Message-Id: <20170926113512.15707-1-calle@scylladb.com>
2017-09-26 15:21:30 +03:00
Asias He
0ec574610d locator: Get rid of assert in token_metadata
In commit 69c81bcc87 (repair: Do not allow repair until node is in
NORMAL status), we saw a coredump due to an assert in
token_metadata::first_token_index.

Throw an exception instead of aborting the whole scylla process.
Message-Id: <c110645cee1ee3897e30a3ae1b7ab3f49c97412c.1504752890.git.asias@scylladb.com>
2017-09-14 10:33:02 +03:00
Avi Kivity
27d3ab20a9 locator: add missing include "log.hh"
It's currently made available via another include, which is going away.
2017-08-27 15:17:05 +03:00
Avi Kivity
ebaeefa02b Merge seastar upstream (seastar namespace)
 - introduced "seastarx.hh" header, which does a "using namespace seastar";
 - 'net' namespace conflicts with seastar::net, renamed to 'netw'.
 - 'transport' namespace conflicts with seastar::transport, renamed to
   cql_transport.
 - "logger" global variables now conflict with logger global type, renamed
   to xlogger.
 - other minor changes
2017-05-21 12:26:15 +03:00
Vlad Zolotarov
181c68e97d token_metadata::get_host_id(ep): add a missing 'throw'
Caught by PVS-Studio static analyzer:

The object was created but it is not being used. The 'throw' keyword could be missing: throw runtime_error(FOO);
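
The class of bug reported by the analyzer can be sketched like this. The functions and the `host_id_map` alias are hypothetical stand-ins, not the actual token_metadata code; what the real function did after the discarded exception is not shown in this log.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an endpoint -> host id map.
using host_id_map = std::unordered_map<std::string, std::string>;

// Buggy shape: the exception object is constructed and immediately
// destroyed, so nothing is thrown and execution continues.
std::string get_host_id_buggy(const host_id_map& m, const std::string& ep) {
    auto it = m.find(ep);
    if (it == m.end()) {
        std::runtime_error("host id not found"); // temporary, never thrown
        return {};                                // reached silently
    }
    return it->second;
}

// Fixed shape: the 'throw' keyword actually raises the exception.
std::string get_host_id_fixed(const host_id_map& m, const std::string& ep) {
    auto it = m.find(ep);
    if (it == m.end()) {
        throw std::runtime_error("host id not found");
    }
    return it->second;
}
```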

Reported-by: Phillip Khandeliants
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-26 14:54:34 -04:00
Asias He
937f28d2f1 Convert to use dht::partition_range_vector and dht::token_range_vector
2016-12-19 14:08:50 +08:00
Asias He
e5485f3ea6 Get rid of query::partition_range
Use dht::partition_range instead
2016-12-19 08:09:25 +08:00
Asias He
d1178fa299 Convert to use dht::token_range
2016-12-19 08:04:29 +08:00
Asias He
e523803a5d token_metadata: Introduce interval_to_range helper
It is used to convert a boost::icl::interval<token> interval back to a
range<token>.
2016-12-12 11:09:26 +08:00
Avi Kivity
a35136533d Convert ring_position and token ranges to be nonwrapping
Wrapping ranges are a pain, so we are moving wrap handling to the edges.

Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.

Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
2016-11-02 21:04:11 +02:00
Pekka Enberg
3b4e6cdc5e abstract_replication_strategy: Fix exception type if class not found
Change abstract_replication_strategy::create_replication_strategy() to
throw exceptions::configuration_error if replication strategy class
lookup fails, to make sure the error is converted to the correct CQL
response.

Fixes #1755

Message-Id: <1476361262-28723-1-git-send-email-penberg@scylladb.com>
2016-10-13 17:39:28 +03:00
Vlad Zolotarov
c616e74ae4 locator::gossiping_property_file_snitch: use a lowres_clock time source for a timer
gossiping_property_file_snitch checks a configuration file every 60s.
lowres_clock clock source should be good enough for that.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1465314448-11611-1-git-send-email-vladz@cloudius-systems.com>
2016-06-15 13:01:05 +03:00
Asias He
089734474b token_metadata: Speed up pending_endpoints_for
pending_endpoints_for is called frequently by
storage_proxy::create_write_response_handler when doing cql query.

Before this patch, each call to pending_endpoints_for involves
converting a multimap (std::unordered_multimap<range<token>,
inet_address>>) to map (std::unordered_map<range<token>,
std::unordered_set<inet_address>>).

To speed up the token to pending endpoint mapping search, an interval map
is introduced. It is faster than searching the map linearly and can
avoid caching the token/pending endpoint mapping.

With this patch, the drop in operations per second while a node is being
added is much smaller.

Before:
45K to 10K

After:
45K to 38K

(The numbers were measured with the streaming code skipping the sending of
data, to rule out the streaming factor.)

Refs: #1223
2016-05-17 17:32:15 +08:00
Asias He
ffe91b5755 token_metadata: Do not assert in get_host_id
Throw an exception instead of assert.
2016-04-13 14:53:27 +08:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Asias He
f7fccc6efb locator: Fix get token from a range<token>
With a range{t1, t2}, if t2 == {}, the range.end() will contain no
value. Fix getting t2 in this case.

Fixes #911.
Message-Id: <4462e499d706d275c03b116c4645e8aaee7821e1.1456128310.git.asias@scylladb.com>
2016-02-23 14:29:26 +01:00
Vlad Zolotarov
f2c6f16a50 locator: everywhere_replication_strategy: change the class_registrator name to "EverywhereStrategy"
Change the name used with class_registrator from "EverywhereReplicationStrategy"
(used in the initial patch from CASSANDRA-826 JIRA) to "EverywhereStrategy"
as it is in the current DCE code.

With this change one will be able to create an instance of
everywhere_replication_strategy class by giving either
an "org.apache.cassandra.locator.EverywhereStrategy" (full name) or
an "EverywhereStrategy" (short name) as a replication strategy name.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1456081258-937-1-git-send-email-vladz@cloudius-systems.com>
2016-02-22 09:18:47 +02:00
Vlad Zolotarov
cc30956c56 locator: added EverywhereReplicationStrategy
This strategy would ignore an RF configuration and would
always try to replicate on all cluster nodes.

This means that its get_replication_factor() would return the
number of currently "known" nodes in the cluster, and
if a cluster is currently bootstrapping this value may
change over time for the same key. Therefore this strategy
should be used with caution.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1456074333-15014-3-git-send-email-vladz@cloudius-systems.com>
2016-02-21 19:29:29 +02:00
Vlad Zolotarov
ec14fb2a70 locator: token_metadata: add get_all_endpoints_count()
Return the number of currently known endpoints, for use
in fast-path flows.

Calling get_all_endpoints().size() for this
would not be fast enough because of the unordered_set->vector transformation,
which we don't need.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1456074333-15014-2-git-send-email-vladz@cloudius-systems.com>
2016-02-21 19:29:28 +02:00
Tomasz Grabiec
efdbc3d6d7 abstract_replication_strategy: Fix generation of token ranges
We can't move-from in the loop because the subject will be empty in
all but the first iteration.
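
The pitfall can be shown with a small stand-alone sketch. The function names are hypothetical and a std::string stands in for the moved-from object; a moved-from std::string is in a valid but unspecified state, and in practice it is typically empty.

```cpp
#include <string>
#include <utility>
#include <vector>

// Buggy shape: `value` is moved from on the first iteration, so every
// later iteration pushes a moved-from (typically empty) string.
std::vector<std::string> repeat_buggy(std::string value, int n) {
    std::vector<std::string> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(std::move(value));
    }
    return out;
}

// Fixed shape: copy inside the loop so `value` stays intact for every
// iteration.
std::vector<std::string> repeat_fixed(const std::string& value, int n) {
    std::vector<std::string> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(value);
    }
    return out;
}
```

An empty moved-from object is consistent with the "has size 0" error quoted in this commit message.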

Fixes crash during node startup:

  "Exiting on unhandled exception of type 'runtime_exception': runtime error: Invalid token. Should have size 8, has size 0"

Fixes update_cluster_layout_tests.py:TestUpdateClusterLayout.simple_add_node_1_test (and probably others)

Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
2016-02-12 19:38:36 +01:00
Asias He
6d0407503b locator: Do not generate wrap-around ranges
Like we did in commit d54c77d5d0,
make the remaining functions in abstract_replication_strategy return
non-wrap-around ranges.

This fixes:

ERROR [shard 0] stream_session - [Stream #f0b7fda0-cf3e-11e5-b6c4-000000000000]
stream_transfer_task: Fail to send to 127.0.0.4:0: std::runtime_error (Not implemented: WRAP_AROUND)

in streaming.
Message-Id: <514d2a9a1d3b868d213464c8858ac5162c0338d8.1455093643.git.asias@scylladb.com>
2016-02-10 10:03:31 +01:00
Raphael S. Carvalho
d54c77d5d0 change abstract_replication_strategy::get_ranges to not return wrap-arounds
The main motivation behind this change is to make the ranges returned by
get_ranges() easier for consumers to work with, e.g. to binary search for
the range in which a token is contained. In addition, a wrap-around range
introduces corner cases, so we should avoid it altogether.

Suppose that a node owns three tokens: -5, 6, 8

get_ranges() would return the following ranges:
(8, -5], (-5, 6], (6, 8]
get_ranges() will now return the following ranges:
(-inf, -5], (-5, 6], (6, 8], (8, +inf)
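
The example above can be reproduced with a short sketch. It mirrors only the simplified case in this message, where consecutive tokens bound each range; `token_range` and `non_wrapping_ranges` are hypothetical names, and tokens are modelled as plain 64-bit integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical simplified range (start, end]. An empty optional means the
// bound is open: -inf for start, +inf for end.
struct token_range {
    std::optional<int64_t> start; // exclusive; nullopt = -inf
    std::optional<int64_t> end;   // inclusive; nullopt = +inf
};

// Builds non-wrapping ranges from tokens sorted in ascending order.
// The wrap-around range (last, first] is split into (-inf, first] and
// (last, +inf).
std::vector<token_range> non_wrapping_ranges(const std::vector<int64_t>& sorted_tokens) {
    std::vector<token_range> out;
    if (sorted_tokens.empty()) {
        return out;
    }
    out.push_back({std::nullopt, sorted_tokens.front()});
    for (std::size_t i = 1; i < sorted_tokens.size(); ++i) {
        out.push_back({sorted_tokens[i - 1], sorted_tokens[i]});
    }
    out.push_back({sorted_tokens.back(), std::nullopt});
    return out;
}
```

Because the result is sorted and has no wrap-around entry, a consumer can locate the range containing a token with an ordinary binary search.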

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <4bda1428d1ebbe7c8af25aa65119edc5b97bc2eb.1453827605.git.raphaelsc@scylladb.com>
2016-01-27 09:48:31 +01:00
Vlad Zolotarov
e3d7db5e57 ec2_snitch: complete the EC2Snitch -> Ec2Snitch renaming
The rename started in 72b27a91fe
was not complete. This patch fixes the places that were missed
in the above patch.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1453375025-7512-3-git-send-email-vladz@cloudius-systems.com>
2016-01-21 13:35:30 +02:00
Vlad Zolotarov
9951edde1a locator::ec2_multi_region_snitch: add a get_name() implementation
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1453375025-7512-2-git-send-email-vladz@cloudius-systems.com>
2016-01-21 13:35:29 +02:00
Vlad Zolotarov
922eb218b1 locator::reconnectable_snitch_helper: don't check messaging_service version
Don't demand the messaging_service version to be the same on both
sides of the connection in order to use internal addresses.

Upstream has a similar change for CASSANDRA-6702 in commit a7cae32 ("Fix
ReconnectableSnitch reconnecting to peers during upgrade").

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
Message-Id: <1452686729-32629-1-git-send-email-vladz@cloudius-systems.com>
2016-01-19 11:04:37 +02:00
Avi Kivity
fbe3283816 snitch: intentionally leak snitch singleton
Because our shutdown process is crippled (refs #293), we won't shutdown the
snitch correctly, and the sharded<> instance can assert during shutdown.
This interferes with the next patch, which adds orderly shutdown if the http
server fails to start.

Leak it intentionally to work around the problem.
Message-Id: <1452092806-11508-2-git-send-email-avi@scylladb.com>
2016-01-07 16:43:37 +02:00
Asias He
2345cda42f messaging_service: Rename shard_id to msg_addr
Using shard_id as the destination of the messaging_service is confusing,
since shard_id is used in the context of cpu id.
Message-Id: <8c9ef193dc000ef06f8879e6a01df65cf24635d8.1452155241.git.asias@scylladb.com>
2016-01-07 10:36:35 +02:00
Glauber Costa
74fbd8fac0 do not call open_file_dma directly
We have an API that wraps open_file_dma which we use in some places, but in
many other places we call the reactor version directly.

This patch changes the latter to match the former. It will have the added benefit
of allowing us to make easier changes to these interfaces if needed.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Message-Id: <29296e4ec6f5e84361992028fe3f27adc569f139.1451950408.git.glauber@scylladb.com>
2016-01-05 10:37:57 +02:00
Asias He
3793bb7be1 token_metadata: Add get_endpoint_to_token_map_for_reading
2015-12-09 12:30:52 +08:00
Asias He
1cc7887ffb token_metadata: Do nothing if tokens is empty.
When replacing a node, we might ignore the tokens, so that the token set
is empty. In this case, we will have

   std::unordered_map<inet_address, std::unordered_set<token>> = {ip, {}}

passed to token_metadata::update_normal_tokens(std::unordered_map<inet_address,
std::unordered_set<token>>& endpoint_tokens)

and hit the assert

   assert(!tokens.empty());
2015-12-09 12:30:52 +08:00
Asias He
110a18987e token_metadata: Print Token changing ownership from
Needed by test.
2015-12-09 12:30:52 +08:00
Asias He
52a5e954f9 gossip: Pass const ref for versioned_value in on_change and before_change
2015-12-09 12:29:15 +08:00
Asias He
e9a4d93d1b storage_service: Fix added node not showing up in nodetool in status joining
The get_token_endpoint API should return a map of tokens to endpoints,
including the bootstrapping ones.

Use get_local_storage_service().get_token_to_endpoint_map() for it.

$ nodetool -p 7100 status

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns    Host ID Rack
UN  127.0.0.1  12645      256     ?  eac5b6cf-5fda-4447-8104-a7bf3b773aba  rack1
UN  127.0.0.2  12635      256     ?  2ad1b7df-c8ad-4cbc-b1f1-059121d2f0c7  rack1
UN  127.0.0.3  12624      256     ?  61f82ea7-637d-4083-acc9-567e0c01b490  rack1
UJ  127.0.0.4  ?          256     ?  ced2725e-a5a4-4ac3-86de-e1c66cecfb8d  rack1

Fixes #617
2015-12-09 10:43:51 +08:00
Asias He
aaca88a1e7 token_metadata: Add print_pending_ranges for debug print
Signed-off-by: Pekka Enberg <penberg@scylladb.com>
2015-11-30 11:07:42 +02:00
Asias He
7ddf8963f5 config: Enable broadcast_rpc_address option
With this patch, start two nodes

node 1:
scylla --rpc-address 127.0.0.1 --broadcast-rpc-address 127.0.0.11

node 2:
scylla --rpc-address 127.0.0.2 --broadcast-rpc-address 127.0.0.12

On node 1:
cqlsh> SELECT rpc_address from system.peers;

 rpc_address
-------------
  127.0.0.12

which means clients should use this address to connect to node 2 for the
cql and thrift protocols.
2015-11-24 10:07:31 +08:00
Asias He
efda753c0c token_metadata: Implement pending_endpoints_for
It is used in storage_proxy::create_write_response_handler. The second
argument should be keyspace name instead of the keyspace class.

Refs: #539
2015-11-11 09:41:21 +02:00
Asias He
cb8b0eedfc token_metadata: Fix set_difference in calculate_pending_ranges
std::set_difference requires the container to be sorted.
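
A minimal sketch of the precondition, using plain integers rather than token ranges; the `difference` helper is a hypothetical name introduced here for illustration.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Returns the elements of `a` that are not in `b`.
// std::set_difference requires both input ranges to be sorted with the
// same ordering, so the (by-value) inputs are sorted first.
std::vector<int> difference(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<int> out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                        std::back_inserter(out));
    return out;
}
```

Without the two sort calls, passing unsorted input violates the algorithm's precondition and the result is not meaningful.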
2015-11-09 08:43:04 +08:00
Asias He
c90e9c97f5 token_metadata: Add add_moving_endpoint
2015-11-09 08:43:04 +08:00
Asias He
ada2466e18 token_metadata: Add clone_after_all_settled
Needed by storage_service::range_relocator::calculate_to_from_streams.
2015-11-09 08:43:04 +08:00
Vlad Zolotarov
b654d942b4 locator::gossiping_property_file_snitch: don't ignore a returned future
Don't ignore yet another returned future in reload_configuration().

Since commit 5e8037b50a
storage_service::gossip_snitch_info() returns a future.

This patch takes this into account.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-11-02 13:44:53 +02:00
Vlad Zolotarov
689d5fb000 locator::gossiping_property_file_snitch: fix in reload_configuration()
When we access a gossiper instance we use a _gossip_started
state of a snitch, which is set in a gossiper_starting() method.

However, the gossiper_starting() method is invoked by a gossiper on CPU0
only, so the _gossip_started snitch state will be set for the
instance on CPU0 only.

Therefore, instead of synchronizing the _gossip_started state between
all shards, we just have to make sure we check it on the right CPU,
which is CPU0.

This patch fixes this issue.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-11-02 13:44:53 +02:00
Vlad Zolotarov
5da4e62a59 locator::i_endpoint_snitch: align the _prefer_local parameter with _my_dc and _my_rack
Adjust the interface and distribution of prefer_local parameter read
from a snitch property file with the rest of similar parameters (e.g. dc and rack):
they are read and their values are distributed (copied) across all shards'
instances.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-11-02 13:44:53 +02:00
Vlad Zolotarov
5042f3c952 locator::i_endpoint_snitch_base: make reload_gossiper_state() a virtual function
Make reload_gossiper_state() be a virtual method
of a base class in order to allow calling it using a snitch_ptr
handle.

The base class already has a ton of virtual methods, so no harm is
done performance-wise. Using virtual methods instead of
dynamic_cast results in much cleaner code, however.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-11-02 13:44:53 +02:00
Vlad Zolotarov
926ce145db locator::i_endpoint_snitch_base: move _gossip_started to the base class
Move the member and add an access method.
This is needed in order to be able to access this state using
snitch_ptr handle.

This also allows getting rid of the ec2_multi_region_snitch::_helper_added
member, since it duplicates the _gossip_started semantics.

Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2015-11-02 13:44:31 +02:00
Asias He
e3c5a31e85 gossip: Futurize gossiper_starting
gossiper_starting calls gossiper::add_local_application_state which
returns a future, so futurize gossiper_starting as well.
2015-11-02 09:10:48 +08:00
Shlomi Livne
50cdcbd255 Update snitch registration EC2MultiRegionSnitch --> Ec2MultiRegionSnitch
Update snitch EC2MultiRegionSnitch to Ec2MultiRegionSnitch,
org.apache.cassandra.locator.EC2MultiRegionSnitch to
org.apache.cassandra.locator.Ec2MultiRegionSnitch

Signed-off-by: Shlomi Livne <shlomi@scylladb.com>
2015-11-01 15:21:26 +02:00