All uses of snitch now have their own local reference. The global
instance can now be replaced with the one living in main (and tests)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method replaces the snitch instance on the existing
sharded<snitch_ptr>, and the "existing" one is nowadays the global
instance. This patch changes it to use a local reference passed in from
the API code
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
There's an ongoing effort to move the endpoint -> {dc/rack} mappings
from the snitch onto the topology object, and this set finalizes it.
After it, the snitch service stops depending on the gossiper and the
system keyspace and is ready for de-globalization. As a nice side
effect, the system keyspace no longer needs to maintain the dc/rack info
cache, and its startup code can be relaxed.
refs: #2737
refs: #2795
"
* 'br-snitch-dont-mess-with-topology-data-2' of https://github.com/xemul/scylla: (23 commits)
system_keyspace: Don't maintain dc/rack cache
system_keyspace: Indentation fix after previous patch
system_keyspace: Coroutinize build_dc_rack_info()
topology: Move all post-configuration to topology::config
snitch: Start early
gossiper: Do not export system keyspace
snitch: Remove gossiper reference
snitch: Mark get_datacenter/_rack methods const
snitch: Drop some dead dependency knots
snitch, code: Make get_datacenter() report local dc only
snitch, code: Make get_rack() report local rack only
storage_service: Populate pending endpoint in on_alive()
code: Populate pending locations
topology: Put local dc/rack on topology early
topology: Add pending locations collection
topology: Make get_location() errors more verbose
token_metadata: Add config, spread everywhere
token_metadata: Hide token_metadata_impl copy constructor
gossiper: Remove messaging service getter
snitch: Get local address to gossip via config
...
Following up on 69aea59d97, which added fencing support
for simple reads and writes, this series does the same for the
complex ops:
- partition scan
- counter mutation
- paxos
With this done, the coordinator knows about all in-flight requests and
can delay topology changes until they are retired.
Closes #11296
* github.com:scylladb/scylladb:
storage_proxy: hold effective_replication_map for the duration of a paxos transaction
storage_proxy: move paxos_response_handler class to .cc file
storage_proxy: deinline paxos_response_handler constructor/destructor
storage_proxy: use consistent effective_replication_map for counter coordinator
storage_proxy: improve consistency in query_partition_key_range{,_concurrent}
storage_proxy: query_partition_key_range_concurrent: reduce smart pointer use
storage_proxy: query_partition_key_range_concurrent: improve token_metadata consistency
storage_proxy: query_singular: use fewer smart pointers
storage_proxy: query_singular: simplify lambda captures
locator: effective_replication_map: provide non-smart-pointer accessor to token_metadata
storage_proxy: use consistent token_metadata with rest of singular read
token_metadata is protected by holders of an effective_replication_map_ptr,
so it's just as safe, and less expensive, for them to obtain a reference
to token_metadata rather than a smart pointer. Give them that option with
a new accessor.
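As an illustration of the accessor in question, here is a much-simplified
sketch (names modeled on the real classes, with std::shared_ptr standing
in for the actual smart-pointer type; details are assumed):

```cpp
#include <cassert>
#include <memory>

// Stand-in for the real token_metadata class.
struct token_metadata {};

class effective_replication_map {
    std::shared_ptr<const token_metadata> _tmptr;
public:
    explicit effective_replication_map(std::shared_ptr<const token_metadata> tm)
        : _tmptr(std::move(tm)) {}

    // Existing accessor: hands out a smart pointer, bumping the refcount.
    std::shared_ptr<const token_metadata> get_token_metadata_ptr() const {
        return _tmptr;
    }

    // New accessor: a plain reference, valid for as long as the holder
    // keeps this effective_replication_map alive.
    const token_metadata& get_token_metadata() const {
        return *_tmptr;
    }
};
```

Since the e.r.m. itself pins the token_metadata, the reference is exactly
as safe as the pointer copy, minus the refcount traffic.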
The EC2 instance metadata service can be busy; let's retry connecting at
an interval, just like we do in scylla-machine-image.
Fixes #10250
Signed-off-by: Takuya ASADA <syuu@scylladb.com>
Closes #11688
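A minimal sketch of the retry-at-intervals idea, assuming nothing about
the real implementation (which is Seastar-based); the names, attempt
count and interval here are illustrative:

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>
#include <thread>

// Retry a connection attempt a few times, sleeping between attempts,
// instead of failing on the first "service busy" error.
template <typename Fn>
auto retry_with_interval(Fn attempt, int max_attempts,
                         std::chrono::milliseconds interval) {
    for (int i = 1;; ++i) {
        try {
            return attempt();       // success: return the result
        } catch (const std::exception&) {
            if (i == max_attempts) {
                throw;              // out of attempts: propagate
            }
            std::this_thread::sleep_for(interval);
        }
    }
}
```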
Because of the snitch's former dependencies, some bits of topology were
initialized with nasty post-start calls. Now those can all be removed,
and the initial topology information can be provided by topology::config
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
It doesn't need the gossiper any longer. This change will allow starting
the snitch early in the next patch, and eventually improving the
token-metadata start-up sequence
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
They are in fact const, but weren't marked as such before because they
talked to non-const gossiper and system_keyspace methods and updated the
snitch's internal caches
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
After previous patches and merged branches, the snitch no longer needs
the method that gets dc/rack for endpoints from the gossiper, the system
keyspace and its internal caches.
This cuts the last, and biggest, snitch->gossiper dependency.
It also removes the implicit snitch->system_keyspace dependency loop
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
A continuation of the previous patch: all the code now uses
topology::get_datacenter(endpoint) to get peers' dc string. The topology
still uses the snitch for that, but it already contains the needed data.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
All the code out there now calls snitch::get_rack() to get the rack of
the local node; for other nodes, topology::get_rack(endpoint) is used.
Now that the topology is properly populated with endpoints, it can
finally be patched to stop using the snitch and to get the rack from its
internal collections
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Previous patches added the concept of pending endpoints to the topology;
this patch populates endpoints in this state.
Also, set_pending_ranges() is patched to make sure that the tokens
added for the endpoint(s) are added for something that's known by the
topology. The same check exists in update_normal_tokens()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Startup code needs to know the dc/rack of the local node early, way
before the node starts any communication with the ring. This information
is available when the snitch activates, but it starts _after_
token-metadata, so the only way to put the local dc/rack in the topology
is via a startup-time special API call. This new init_local_endpoint()
is temporary and will be removed later in this set
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Nowadays the topology object only keeps info about nodes that are normal
members of the ring. Nodes that are joining, bootstrapping or leaving
are out of it. However, one of the goals of this patchset is to make the
topology object provide dc/rack info for _all_ nodes, even those in a
transient state.
The introduced _pending_locations is about to hold the dc/rack info for
transient endpoints. When a node becomes a member of the ring, it is
moved from pending (if it's there) to current locations; when it leaves
the ring, it's moved back to pending.
For now the new collection is just added, and the add/remove/get API is
extended to maintain it, but it's not really populated yet. That will
come in the next patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
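A much-simplified sketch of the two-collection scheme (std::string
stands in for the endpoint address type, the location struct for the
dc/rack pair, and the method names are illustrative):

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

struct location { std::string dc, rack; };

class topology {
    std::map<std::string, location> _current_locations;  // normal members
    std::map<std::string, location> _pending_locations;  // transient nodes
public:
    void add_pending(const std::string& ep, location loc) {
        _pending_locations[ep] = std::move(loc);
    }
    // Node becomes a normal ring member: promote pending -> current.
    void add_endpoint(const std::string& ep) {
        if (auto it = _pending_locations.find(ep); it != _pending_locations.end()) {
            _current_locations[ep] = std::move(it->second);
            _pending_locations.erase(it);
        }
    }
    // Node leaves the ring: demote current -> pending.
    void remove_endpoint(const std::string& ep) {
        if (auto it = _current_locations.find(ep); it != _current_locations.end()) {
            _pending_locations[ep] = std::move(it->second);
            _current_locations.erase(it);
        }
    }
    // Lookup covers both collections, so transient nodes resolve too.
    std::optional<location> get_location(const std::string& ep) const {
        if (auto it = _current_locations.find(ep); it != _current_locations.end()) {
            return it->second;
        }
        if (auto it = _pending_locations.find(ep); it != _pending_locations.end()) {
            return it->second;
        }
        return std::nullopt;
    }
};
```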
Currently, if topology.get_location() doesn't find an entry in its
collection(s), it throws a standard out-of-range exception, which is
very hard to debug.
Also, the next patches will extend this method; the
if (_current_locations.contains()) check introduced here makes that
future change look nicer.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The next patches will need to provide some early-start data for the
topology. The standard way of doing that is via a service config, so
this patch adds one. The new config is empty in this patch, to be filled
later
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Copying a token_metadata_impl is a heavy operation, and it's performed
internally with the help of a dedicated clone_async() method. This
method, in turn, doesn't copy the whole object in its copy-ctor, but
rather default-initializes it and carries over the remaining fields
later.
That being so, the standard copy-ctor is better made private and, for
the sake of being more explicit, marked as a shallow-copy-ctor
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
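A hedged, synchronous sketch of the pattern (one vector mimics the heavy
state; the real clone_async() carries the fields over incrementally,
with yields):

```cpp
#include <cassert>
#include <memory>
#include <vector>

class token_metadata_impl {
    std::vector<long> _sorted_tokens;  // stands in for the heavy state

    // Shallow copy-ctor: deliberately default-initializes everything
    // instead of copying. Private, so callers can't trigger an implicit
    // (and misleadingly cheap-looking) copy.
    token_metadata_impl(const token_metadata_impl&) : token_metadata_impl() {}

public:
    token_metadata_impl() = default;
    void add_token(long t) { _sorted_tokens.push_back(t); }
    size_t token_count() const { return _sorted_tokens.size(); }

    // The explicit clone path: start from a shallow copy, then carry the
    // remaining fields over (asynchronously, in the real code).
    std::unique_ptr<token_metadata_impl> clone() const {
        auto copy = std::unique_ptr<token_metadata_impl>(
            new token_metadata_impl(*this));  // shallow part
        copy->_sorted_tokens = _sorted_tokens;  // "remaining fields"
        return copy;
    }
};
```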
The property-file snitch gossips listen_address as the internal-IP
state. It obtains this value via the snitch->gossiper->messaging_service
chain. This change provides the needed value via config, thus cutting
yet another snitch->gossiper dependency and allowing the gossiper not to
export the messaging service in the future
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The goal is to avoid default-initializing an object when its fields are
about to be immediately overwritten by the subsequent code.
Closes #11619
* github.com:scylladb/scylladb:
replication_strategy: Construct temp tokens in place
topology: Define copy-constructor with init-lists
Nowadays it's done inside the snitch, and the snitch needs to carry a
gossiper reference for that. There's an ongoing effort to de-globalize
the snitch and fix its dependencies. This patch cuts this
snitch->gossiper link to facilitate the mentioned effort.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Otherwise, the token_metadata object is default-initialized and then
move-assigned from another object.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Otherwise the topology is default-constructed, and then its fields
are copy-assigned with the data from the copy-from reference.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
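A micro-sketch of the pattern being applied (field names are
placeholders): initialize members in the constructor's init-list instead
of default-constructing them and copy-assigning in the body:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct topology_sketch {
    std::unordered_map<std::string, std::string> _dc_endpoints;
    std::string _local_dc;

    topology_sketch() = default;

    // Before: the body did `_dc_endpoints = o._dc_endpoints; ...` after
    // the members had already been default-constructed. After:
    topology_sketch(const topology_sketch& o)
        : _dc_endpoints(o._dc_endpoints)
        , _local_dc(o._local_dc) {
    }
};
```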
token_metadata::calculate_pending_ranges_for_bootstrap() makes a
clone of itself and adds bootstrapping nodes to the clone to calculate
ranges. Currently the added nodes lack dc/rack info, which affects the
calculations badly.
Unfortunately, the dc/rack for those nodes is not available in the
topology (yet), and making it so needs pretty heavy patching.
Fortunately, the only caller of this method has the gossiper at hand to
provide the dc/rack from.
Fixes #11531
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #11596
This std::function causes allocations, both on construction
and in other operations; this costs ~2200 instructions
for a DC-local query. Fix that.
Closes #11494
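For illustration only (the actual call sites in the patch differ), this
sketch contrasts a std::function-taking helper, which may heap-allocate
and forces indirect calls, with a template that preserves the callable's
concrete type:

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Before: constructing the std::function may allocate for large
// captures, and every invocation goes through a type-erased call.
std::vector<std::string>
filter_endpoints_fn(const std::vector<std::string>& eps,
                    const std::function<bool(const std::string&)>& pred) {
    std::vector<std::string> out;
    std::copy_if(eps.begin(), eps.end(), std::back_inserter(out), pred);
    return out;
}

// After: the predicate's concrete type is a template parameter, so the
// compiler can inline it and no type erasure (or allocation) happens.
template <typename Pred>
std::vector<std::string>
filter_endpoints(const std::vector<std::string>& eps, Pred pred) {
    std::vector<std::string> out;
    std::copy_if(eps.begin(), eps.end(), std::back_inserter(out), pred);
    return out;
}
```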
A continuation of debfcc0e (snitch: Move sort_by_proximity() to
topology). The passed addresses are not modified by the helper. They are
not yet const because the method was copy-n-pasted from the snitch,
where it wasn't const.
tests: unit(dev)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220906074708.29574-1-xemul@scylladb.com>
The mutable get_datacenter_endpoints() and get_datacenter_racks() are
dangerous since they expose internal members without enforcing class
invariants. Fortunately they are unused, so delete them.
Closes #11454
There's one corner case in sorting nodes by snitch: the simple snitch
code overloads the call and doesn't sort anything. The same behavior
should be preserved by the (future) topology implementation, but it
doesn't know the snitch name. To address that, this patch adds a boolean
switch on the topology that's turned off by the main code when it sees
the snitch is the "simple" one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method is about to be moved from snitch to topology; this patch
prepares the rest of the code to use the latter to call it. The
topology's method just calls the snitch, but that's going to change in
the next patch.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are two sorting methods in the snitch: one sorts the list of
addresses in place, the other creates a sorted copy of the passed
const list (in fact, the passed reference is not const, but it's not
modified by the method). However, both callers of the latter create
their own temporary list of addresses anyway, so they don't really
benefit from the snitch generating another copy.
So this patch leaves just one sorting method, the in-place one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
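An illustrative stand-in for the surviving in-place method, with integer
"addresses" and absolute distance replacing the real dc/rack proximity
logic:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <vector>

// Sort addresses in place by their "distance" from the local node.
// stable_sort keeps the relative order of equally-distant peers.
void sort_by_proximity(int local, std::vector<int>& addresses) {
    std::stable_sort(addresses.begin(), addresses.end(),
                     [local](int a, int b) {
                         return std::abs(a - local) < std::abs(b - local);
                     });
}
```

Callers that previously asked for a sorted copy now build their own
temporary vector and sort it in place.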
Proxy is the only place that calls this method. Also, the method name
suggests it's not something "generic", but rather internal logic of the
proxy's query processing.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
Messaging needs to know the DC/RACK of nodes to decide whether it needs
to do encryption or compression, depending on the options. Like all the
other services did, it still uses the snitch to get it, but a simple
switch to using the topology needs extra care.
The thing is that messaging can use internal IPs instead of endpoints.
Currently it's the snitch that somehow tries to resolve this; in
particular, if the DC/RACK is not found for the given argument, it
assumes that it might be an internal IP and calls back into messaging to
convert it to the endpoint. However, messaging does know when it uses
which address and can do this conversion itself.
So this set eliminates a few more global snitch usages and drops the
knot tying snitch, gossiper and messaging to each other.
"
* 'br-messaging-use-topology-1.2' of https://github.com/xemul/scylla:
messaging: Get DC/RACK from topology
messaging, topology: Keep shared_token_metadata* on messaging
messaging: Add is_same_{dc|rack} helpers
snitch, messaging: Don't relookup dc/rack on internal IP
A recent change in topology (commit 4cbe6ee9, titled
"topology: Require entry in the map for update_normal_tokens()")
made token_metadata::update_normal_tokens() require the entry's presence
in the embedded topology object. Respectively, the commit in question
equipped most callers of update_normal_tokens() with a preceding
topology update call to satisfy the requirement.
However, tokens are put into token_metadata not only for the normal
state, but also for bootstrapping, and one place that added
bootstrapping tokens erroneously got a topology update. This is wrong:
a node must not be present in the topology until it switches into the
normal state. As a result, several tests with bootstrapping nodes
started to fail.
The fix removes the topology update for bootstrapping nodes, but this
change reveals a few other places that piggy-backed on the mistaken
update, so now _they_ need to update the topology themselves.
tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2040/
update_cluster_layout_tests.py::test_simple_add_new_node_while_schema_changes_with_repair
update_cluster_layout_tests.py::test_simple_kill_new_node_while_bootstrapping_with_parallel_writes_in_multidc
repair_based_node_operations_test.py::test_lcs_reshape_efficiency
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20220902082753.17827-1-xemul@scylladb.com>
When getting dc/rack, the snitch may perform two lookups: first it does
one using the provided IP; if nothing is found, the snitch assumes that
the IP is an internal one, gets the corresponding public one and
searches again.
The thing is that the only code that may come to the snitch with an
internal IP is the messaging service. It does so in two places: when it
tries to connect to the given endpoint, and when it accepts a
connection.
In the former case, messaging performs the public->internal IP
conversion itself and goes to the snitch with the internal IP value.
This place can get simpler by just feeding the public IP to the snitch
and converting it to the internal one only to initiate the connection.
In the latter case the accepted IP can be either, but the messaging
service has the public<->private map onboard and can do the conversion
itself.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
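A simplified sketch of keeping the conversion inside messaging (all
names here are illustrative):

```cpp
#include <cassert>
#include <map>
#include <string>

// Messaging owns the public<->internal address map, so dc/rack lookups
// above it always see the public IP; translation happens only at the
// point where messaging actually dials out.
class messaging_sketch {
    std::map<std::string, std::string> _public_to_internal;
public:
    void learn(const std::string& pub, const std::string& internal_ip) {
        _public_to_internal[pub] = internal_ip;
    }
    // Used only to initiate the connection. If no internal address is
    // known, the public one is dialed directly.
    std::string dial_address(const std::string& public_ip) const {
        auto it = _public_to_internal.find(public_ip);
        return it == _public_to_internal.end() ? public_ip : it->second;
    }
};
```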
Previous patches made all the callers of topology.update_endpoint()
(via token_metadata.update_topology()) provide correct dc/rack info
for the endpoint. It's now possible for the topology to stop using the
global snitch and just rely on the dc/rack argument.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
topology::update_endpoint() is now a plain wrapper over the private
::add_endpoint() method of the same class. It's simpler to merge them.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method in question tries to be on the safe side and adds the
endpoint whose tokens it updates into the topology. From now on,
it's up to the caller to put the endpoint into the topology in advance.
So most of what this patch does is place topology.update_endpoint()
calls in the relevant places of the code.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method in question populates the topology's internal maps with
endpoint vs dc/rack relations. As of today, the dc/rack values are taken
from the global snitch object (which, in turn, goes to the gossiper, the
system keyspace and its internal non-updateable cache for that).
This patch prepares the ground for providing the dc/rack externally via
an argument. For now it's just an argument with empty strings, but the
next patches will populate it with real values (spoiler: in 99% of cases
it's the storage service that calls this method, and each call will know
for sure where to get it from)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The method creates a copy of the token metadata and pushes an endpoint
(with some tokens) into it. The next patches will require providing
dc/rack info together with the endpoint; this patch prepares for that.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
On token_metadata there are two update_normal_tokens() overloads:
one updates tokens for a single endpoint, the other for a set
(well, a std::map) of them. Besides updating the tokens, both
methods may also add an endpoint to the t.m.'s topology object.
There's an ongoing effort to move the dc/rack information from the
snitch to the topology, and one of the changes made in it is that,
when adding an entry to the topology, the dc/rack info should be
provided by the caller (which in 99% of cases is the storage service).
The batched tokens update is extremely unfriendly to the latter
change. Fortunately, this helper is only used by tests; the core
code always uses fine-grained tokens updating.
"
* 'br-tokens-update-relax' of https://github.com/xemul/scylla:
token_metadata: Indentation fix after previous patch
token_metadata: Remove excessive empty tokens check
token_metadata: Remove batch tokens updating method
tests: Use one-by-one tokens updating method
After the previous patch, empty passed tokens make the helper co_return
early, so this if is dead code
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
No users left.
The endpoint_tokens.empty() check is removed; only tests could trigger
it, but they didn't, and they are patched out.
Indentation is left broken
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The get_pending_address_ranges() overload accepting a single token is
not in use; its peer that accepts a set of tokens is
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #11358
For replication strategies like "everywhere"
and "local" that return the same set of endpoints
for all tokens, we can call rs->calculate_natural_endpoints
once and reuse the result for all tokens.
Note that ideally the replication_map could contain only
a single token range for this case, but that doesn't seem to work yet.
Add maybe_yield() calls to the tight loop
to prevent reactor stalls on large clusters when copying
a long vector returned by everywhere_replication_strategy
to potentially thousands of tokens in the map.
Nicholas Peshek wrote in
https://github.com/scylladb/scylladb/issues/10337#issuecomment-1211152370
about a similar patch by Geoffrey Beausire:
994c6ecf3c
> Yep. That dropped our startup from 3000+ seconds to about 40.
Fixes#10337
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
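A hedged sketch of the reuse (the per_token flag and the calculate
callback are stand-ins for the real strategy interface; the real loop
also calls maybe_yield() on each iteration):

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

using endpoint_set = std::vector<std::string>;

// When every token maps to the same replicas (per_token == false),
// calculate the endpoint set once and share it across all map entries,
// instead of copying a long vector for each token.
std::map<long, std::shared_ptr<const endpoint_set>>
build_replication_map(const std::vector<long>& tokens, bool per_token,
                      const std::function<endpoint_set(long)>& calculate) {
    std::map<long, std::shared_ptr<const endpoint_set>> rmap;
    std::shared_ptr<const endpoint_set> shared;
    for (long t : tokens) {
        if (!per_token) {
            if (!shared) {
                shared = std::make_shared<const endpoint_set>(calculate(t));
            }
            rmap.emplace(t, shared);  // one allocation serves every token
        } else {
            rmap.emplace(t, std::make_shared<const endpoint_set>(calculate(t)));
        }
        // the real loop calls seastar::maybe_yield() here to avoid stalls
    }
    return rmap;
}
```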
So that using calculate_natural_endpoints can be optimized
for strategies that return the same endpoints for all tokens,
namely everywhere_replication_strategy and local_strategy.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>