The next patch will disable stopping the keyspaces
in database shutdown due to #9684.
This will leave outstanding e_r_m:s when the factory
is destroyed. They must be unregistered from the factory
so they won't try to submit_background_work()
to gently clear their contents.
Support that temporarily until shutdown is fixed
to ensure there are no outstanding e_r_m:s when
the factory is destroyed, at which point this
can turn into an internal error.
Refs #8995
Refs #9684
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211127083348.146649-1-bhalevy@scylladb.com>
Prevent reactor stalls by gently clearing the replication_map
and token_metadata_ptr when the effective_replication_map is
destroyed.
This is done in the background, protected by the
effective_replication_map_factory::stop() method.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Calculating a new effective_replication_map on each shard
is expensive. To try to save that, use the factory key to
look up an e_r_m on shard 0 and, if found, use it to clone
its replication map and use that to make the shard-local
e_r_m copy.
In the future, we may want to improve that in 2 ways:
- instead of always going to shard 0, use hash(key) % smp::count
to create the first copy.
- make full copies only on NUMA nodes and keep a shared pointer
on all other shards.
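
A minimal sketch of the first idea above, assuming the factory key can be
reduced to a string. The helper name first_copy_shard is hypothetical, and
shard_count merely stands in for smp::count:

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: pick the shard that would hold the first copy of an
// e_r_m, instead of always using shard 0.
// shard_count stands in for smp::count and must be non-zero.
inline unsigned first_copy_shard(const std::string& factory_key, unsigned shard_count) {
    assert(shard_count > 0);
    return static_cast<unsigned>(std::hash<std::string>{}(factory_key) % shard_count);
}
```

The same key always maps to the same shard within one process, so every
shard would agree on where to look for the first copy.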
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
The effective_replication_map_factory keeps naked pointers
to outstanding effective_replication_map:s.
These are kept valid using a shared effective_replication_map_ptr.
When the last shared ptr reference is dropped the effective_replication_map
object is destroyed, therefore the raw pointer to it in the factory
must be erased.
This now happens in ~effective_replication_map when the object
is marked as registered.
Registration happens when effective_replication_map_factory inserts
the newly created effective_replication_map to its _replication_maps
map, and the factory calls effective_replication_map::set_factory.
Note that effective_replication_map may be created temporarily
and not be inserted to the factory's map, therefore erase
is called only when required.
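
The registration scheme described above can be sketched roughly as follows.
This is a simplified illustration, not the actual code: erm and erm_factory
are stand-in class names, the key is assumed to be a plain string, and
std::shared_ptr replaces the real pointer type.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

class erm_factory;

// Stand-in for effective_replication_map.
class erm {
    erm_factory* _factory = nullptr;  // non-null only while registered
    std::string _key;
public:
    ~erm();
    void set_factory(erm_factory& f, std::string key) {
        _factory = &f;
        _key = std::move(key);
    }
    bool is_registered() const { return _factory != nullptr; }
};

// Stand-in for effective_replication_map_factory.
class erm_factory {
    std::map<std::string, erm*> _replication_maps;  // naked, non-owning pointers
public:
    std::shared_ptr<erm> insert(std::string key, std::shared_ptr<erm> p) {
        _replication_maps[key] = p.get();
        p->set_factory(*this, std::move(key));  // marks the object as registered
        return p;
    }
    void erase(const std::string& key) { _replication_maps.erase(key); }
    std::size_t size() const { return _replication_maps.size(); }
};

// Only registered objects erase themselves; temporaries never touch the factory.
erm::~erm() {
    if (is_registered()) {
        _factory->erase(_key);
    }
}
```

Dropping the last shared pointer to a registered object removes its entry
from the factory, while a temporary that was never inserted is destroyed
without calling erase.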
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Make a factory key using the replication_strategy type
and config options, plus the token_metadata ring version
and use it to search for an already-registered effective_replication_map.
If not found, calculate a new effective_replication_map
and register it using the above key.
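
A rough sketch of the lookup-or-create flow, under several assumptions:
the key is modelled as a std::tuple, get_or_create and the calculations
counter are hypothetical names, and std::weak_ptr is used here in place of
the factory's raw pointers to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <tuple>

using config_options = std::map<std::string, std::string>;
// (replication strategy type, config options, token_metadata ring version)
using factory_key = std::tuple<std::string, config_options, std::int64_t>;

// Stand-in for effective_replication_map.
struct erm {
    factory_key key;
};

class erm_factory {
    std::map<factory_key, std::weak_ptr<erm>> _maps;
public:
    int calculations = 0;  // counts how often a new map had to be calculated

    std::shared_ptr<erm> get_or_create(const std::string& type,
                                       const config_options& opts,
                                       std::int64_t ring_version) {
        factory_key key{type, opts, ring_version};
        auto it = _maps.find(key);
        if (it != _maps.end()) {
            if (auto existing = it->second.lock()) {
                return existing;  // reuse the already-registered map
            }
        }
        ++calculations;  // the expensive calculation would happen here
        auto created = std::make_shared<erm>(erm{key});
        _maps[key] = created;
        return created;
    }
};
```

Two requests with the same strategy type, options and ring version share
one object; a different ring version yields a new one.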
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
So an effective_replication_map_ptr can be generated
using a raw pointer by effective_replication_map_factory.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
To be used to locate the effective_replication_map
in the to-be-introduced effective_replication_map_factory.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It will be used further to create shared copies
of effective_replication_map based on replication_strategy
type and config options.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Accessing tm.sorted_tokens().back() causes undefined behavior
if tm.sorted_tokens() is empty.
Check that first and throw/abort using on_internal_error
in this case.
This will prevent the segfault but it doesn't fix the root cause
which is getting here with empty token_metadata. That will be fixed
by the following patch.
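
The guard amounts to something like the sketch below. It is standalone and
simplified: token is assumed to be a plain integer, last_token is a
hypothetical helper, and a thrown std::logic_error stands in for
on_internal_error.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

using token = long;

// Returns the last token of the ring. Calling back() on an empty vector is
// undefined behavior, so check for emptiness first.
inline token last_token(const std::vector<token>& sorted_tokens) {
    if (sorted_tokens.empty()) {
        // Stand-in for on_internal_error(): this sketch simply throws.
        throw std::logic_error("sorted_tokens is empty");
    }
    return sorted_tokens.back();
}
```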
Refs #9494
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211019075710.1626808-1-bhalevy@scylladb.com>
And use it from cql3 check_restricted_replication_strategy and
keyspace_metadata ctor that defined their own `replication_class_strategy`.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Equivalent to abstract_replication_strategy get_range_addresses,
yet synchronous, as it uses the precalculated map.
Call it from storage_service::get_new_source_ranges
and range_streamer::get_all_ranges_with_sources_for.
Consequently, get_new_source_ranges and removenode_add_ranges
can become synchronous too.
Unfortunately we can't entirely get rid of
abstract_replication_strategy::get_range_addresses
as it's still needed by
range_streamer::get_all_ranges_with_strict_sources_for.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is not used any more.
Methods either use the token_metadata_ptr in the
effective_replication_map, or receive an ad-hoc
token_metadata.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Prepare for deleting the _shared_token_metadata member.
All we need for recognized_options is the topology
(for network_topology_strategy).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that abstract_replication_strategy methods are all async,
clone_only_token_map_sync and update_normal_tokens_sync
are unused.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is no longer in use.
And with it, the virtual calculate_natural_endpoint_sync method
of which it was the only caller.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Remaining callers of get_address_ranges and get_pending_address_ranges
are all either from a seastar thread or from a coroutine
so we can make the methods always async and drop the
can_yield param.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
All remaining use sites are called in a seastar thread
so we drop the can_yield param and make get_range_addresses
always async.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
It is called only from repair, in a thread,
so it can be made always async and the need_preempt param
can be dropped.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Provide a sync get_ranges method in effective_replication_map
that uses the precalculated map to get all token ranges owned by or
replicated on a given endpoint.
Reuse do_get_ranges as common infrastructure for all
3 cases: get_ranges, get_primary_ranges, and get_primary_ranges_within_dc.
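
A simplified sketch of how one helper can serve several of these cases from
a precalculated map. The types here are assumptions: tokens are plain
integers, endpoints are strings, a range is a (start, end] pair that may
wrap around the ring, and the first endpoint of each entry is taken to be
the primary replica. The datacenter-aware variant is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using token = long;
using endpoint = std::string;
using token_range = std::pair<token, token>;  // (start, end], may wrap around
using replication_map = std::map<token, std::vector<endpoint>>;

// Common infrastructure: walk the precalculated map once and keep every range
// whose endpoint list satisfies the given predicate.
template <typename Pred>
std::vector<token_range> do_get_ranges(const replication_map& rm, Pred consider) {
    std::vector<token_range> ranges;
    if (rm.empty()) {
        return ranges;
    }
    token prev = rm.rbegin()->first;  // the first range wraps from the last token
    for (const auto& entry : rm) {
        if (consider(entry.second)) {
            ranges.emplace_back(prev, entry.first);
        }
        prev = entry.first;
    }
    return ranges;
}

// All ranges owned by or replicated on ep.
inline std::vector<token_range> get_ranges(const replication_map& rm, const endpoint& ep) {
    return do_get_ranges(rm, [&](const std::vector<endpoint>& eps) {
        return std::find(eps.begin(), eps.end(), ep) != eps.end();
    });
}

// Only the ranges for which ep is the first (primary) replica.
inline std::vector<token_range> get_primary_ranges(const replication_map& rm, const endpoint& ep) {
    return do_get_ranges(rm, [&](const std::vector<endpoint>& eps) {
        return !eps.empty() && eps.front() == ep;
    });
}
```

Because the map is already calculated, both lookups are plain synchronous
loops with no need to yield.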
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that all flavors of get_natural_endpoints methods
were moved to effective_replication_map,
do_get_natural_endpoints and its overrides are unused.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Setting the ring version backwards means it got out of sync.
Possibly concurrent updates weren't serialized properly
using token_metadata_lock / mutate_token_metadata.
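
A minimal sketch of such a check, with assumptions: versioned_ring is a
hypothetical class, a thrown std::logic_error stands in for the internal
error, and whether the real check is strict or non-strict is not stated
here, so a plain "less than" comparison is used.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical holder of a ring version that refuses to go backwards.
class versioned_ring {
    long _ring_version = 0;
public:
    long get_ring_version() const { return _ring_version; }

    void set_ring_version(long version) {
        if (version < _ring_version) {
            // Going backwards suggests updates were not serialized properly.
            throw std::logic_error("ring version moved backwards");
        }
        _ring_version = version;
    }
};
```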
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
For generating unique _ring_version.
Currently when we clone a mutable token_metadata_ptr
it remains with the same _ring_version
and the ring version is updated only when the topology changes.
To be able to distinguish these transient copies
from the ones that got applied, be stricter about
the ring version and change it to a unique number
using a static counter.
Next patch will update the ring version
(and consequently invalidate the cached_endpoints
on the replication strategy) every time the token_metadata
changes, not only when the topology changes.
Note that the _cached_endpoints will go away
once the transition to effective_replication_map
is finished, so this will not degrade performance.
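
One way to picture the static counter, as a single-threaded sketch. The
class below is a stand-in, next_ring_version is a hypothetical helper, and
giving each clone a fresh version is an assumption about how transient
copies are told apart:

```cpp
#include <cassert>

// Stand-in for token_metadata, reduced to its ring version.
class token_metadata {
    long _ring_version;

    // Every call hands out a number never used before in this process.
    static long next_ring_version() {
        static long counter = 0;  // single-threaded sketch, no synchronization
        return ++counter;
    }
public:
    token_metadata() : _ring_version(next_ring_version()) {}

    // Assumption: a clone gets its own unique version so it can be
    // distinguished from the instance it was copied from.
    token_metadata clone() const {
        token_metadata copy(*this);
        copy._ring_version = next_ring_version();
        return copy;
    }

    // Called on any change in this sketch, not only on topology changes.
    void invalidate_cached_rings() { _ring_version = next_ring_version(); }

    long get_ring_version() const { return _ring_version; }
};
```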
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
implementation
Now that all users of it were converted to use the
effective_replication_map, the legacy
abstract_replication_strategy::get_natural_endpoints method
can be deleted.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Every time the token_metadata changes we need to update the
effective_replication_map on all non-system keyspaces.
Do that in replicate_to_all_cores after the updated token_metadata
has been replicated to all cores.
We first prepare and clone the token_metadata, then prepare
and clone the new effective_replication_maps. Any failure
at this stage is recoverable: it is handled via rollback and the
exception is returned.
Note that any failure to _apply_ the pending token_metadata or the
effective_replication_map will cause scylla to abort.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Serialize the metadata changes with
keyspace create, update, or drop.
This will become necessary in the following patch
when we update the effective_replication_map
on all keyspaces and we want instances on all shards
to end up with the same replication map.
Note that storage_service::keyspace_changed is called
from the schema merge path so it already holds
the merge_lock.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
effective_replication_map holds the full replication_map
resulting from applying the effective replication strategy
over the given token_metadata and replication_strategy_config_options.
It is calculated once, in make_effective_replication_map(), and then it
can be used for retrieving the endpoints/token_ranges synchronously
from the precalculated map.
A new virtual get_natural_endpoints(const token&, const effective_replication_map&)
method has been added to abstract_replication_strategy so that
local_strategy and everywhere_replication_strategy can override it as they may be
needed before the token_metadata is established.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Enable creating shared_ptr<BaseClass> in nonstatic_class_registry
using BaseClass::ptr_type and use that for
abstract_replication_strategy.
While at it, also clean up compressor in that respect
to define compressor::ptr_type as shared_ptr<compressor>
thus simplifying compressor_registry.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
And with that rename calculate_natural_endpoints(const token& search_token, const token_metadata&, can_yield)
to do_calculate_natural_endpoints and make it protected.
With this patch, all its external users call the async version, so
rename it back to calculate_natural_endpoints, and make
calculate_natural_endpoints_sync private since it's being called
only within abstract_replication_strategy.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
calculate_natural_endpoints_sync and _async are both provided
temporarily until all users of them are converted to use
the async version which will remain.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Extract natural_endpoints_tracker out of calculate_natural_endpoints
so we can easily split the function into sync and async variants.
Test: network_topology_strategy_test(dev, debug)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Some globals in azure_snitch use std::string in the declaration
and auto in the definition. gcc 11 complains. I don't know if it's
correct, but it's easy to use the type in both declaration and
definition.
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.
Enable the warning and fix numerous violations.
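
To illustrate the kind of mistake described above, here is a small
standalone sketch. All class and function names are hypothetical and not
taken from the tree. Marking the intended override with the override
keyword turns a signature mismatch into a compile error.

```cpp
#include <cassert>
#include <string>

struct snitch_base {
    virtual ~snitch_base() = default;
    virtual std::string get_rack(const std::string&) const { return "base"; }
};

// Missing const: this declares a new virtual function instead of overriding
// snitch_base::get_rack, so calls through the base never reach it.
struct broken_snitch : snitch_base {
    virtual std::string get_rack(const std::string&) { return "broken"; }
};

// With the override keyword the compiler verifies the signature; dropping
// the const here would fail to compile.
struct fixed_snitch : snitch_base {
    std::string get_rack(const std::string&) const override { return "fixed"; }
};

inline std::string rack_via_base(const snitch_base& s) {
    return s.get_rack("10.0.0.1");
}
```

Calling rack_via_base on a broken_snitch silently returns the base class
result, which is exactly the situation a warning of this kind is meant to
surface.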
Closes #9347
Not all calculate_natural_endpoints implementations respect can_yield
flag, for example, everywhere_replication_strategy.
This patch adds yield at the caller site to fix stalls we saw in
do_get_ranges.
Fixes #8943
Closes #9139
This adds support for the Azure snitch. The work is an adaptation of
AzureSnitch for Apache Cassandra by Yoshua Wakeham:
https://raw.githubusercontent.com/yoshw/cassandra/9387-trunk/src/java/org/apache/cassandra/locator/AzureSnitch.java
Also change `production_snitch_base` to protect against
a snitch implementation setting DC and rack to an empty string,
which Lubos says can happen on Azure.
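
The empty-string protection could look roughly like the sketch below. The
helper name value_or_default and the two default strings are assumptions
made for illustration, not necessarily what production_snitch_base uses.

```cpp
#include <cassert>
#include <string>

// Assumed fallback names, for illustration only.
const std::string default_dc = "UNKNOWN_DC";
const std::string default_rack = "UNKNOWN_RACK";

// Hypothetical helper: use the fallback whenever the snitch implementation
// reported an empty DC or rack string.
inline std::string value_or_default(const std::string& value, const std::string& fallback) {
    return value.empty() ? fallback : value;
}
```

A non-empty value reported by the snitch passes through unchanged.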
Fixes #8593
Closes #9084
* github.com:scylladb/scylla:
scylla_util: Use AzureSnitch on Azure
production_snitch_base: Fallback for empty DC or rack strings
azure_snitch: Azure snitch support