Commit Graph

381 Commits

Author SHA1 Message Date
Benny Halevy
93367ba55f effective_replication_map_factory: temporarily unregister outstanding maps when destroyed
The next patch will disable stopping the keyspaces
in database shutdown due to #9684.

This will leave outstanding e_r_m:s when the factory
is destroyed. They must be unregistered from the factory
so they won't try to submit_background_work()
to gently clear their contents.

Support that temporarily until shutdown is fixed
to ensure they are no outstanding e_r_m:s when
the factory is destroyed, at which point this
can turn into an internal error.

Refs #8995
Refs #9684

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211127083348.146649-1-bhalevy@scylladb.com>
2021-11-29 11:59:44 +02:00
Benny Halevy
9d2631daaf token_metadata: calculate_pending_ranges_for_leaving: maybe yield
We see long stalls as reported in
https://github.com/scylladb/scylla/issues/8030#issuecomment-974783526

everywhere_replication_strategy::calculate_natural_endpoints
is synchronous and doesn't yield, so add maybe_yield() calls
when looping over many token ranges.

Refs #8030

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211121090339.3955278-1-bhalevy@scylladb.com>
Message-Id: <20211121102606.76700-1-bhalevy@scylladb.com>
2021-11-22 10:48:25 +02:00
Benny Halevy
eed3e95704 effective_replication_map: clear_gently when destroyed
Prevent reactor stalls by gently clearing the replication_map
and token_metadata_ptr when the effective_replication_map is
destroyed.

This is done in the background, protected by the
effective_replication_map_factory::stop() method.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:41 +02:00
Benny Halevy
866e1b8479 effective_replication_map_factory: try cloning replication map from shard 0
Calculating a new effective_replication_map on each shard
is expensive.  To try to save that, use the factory key to
look up an e_r_m on shard 0 and if found, use to to clone
its replication map and use that to make the shard-local
e_r_m copy.

In the future, we may want to improve that in 2 ways:
- instead of always going to shard 0, use hash(key) % smp::count
to create the first copy.
- make full copies only on NUMA nodes and keep a shared pointer
on all other shards.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:41 +02:00
Benny Halevy
6754e6ca2b effective_replication_map: erase from factory when destroyed
The effective_replication_map_factory keeps nakes pointers
to outstanding effective_replication_map:s.
These are kept valid using a shared effective_replication_map_ptr.

When the last shared ptr reference is dropped the effective_replication_map
object is destroyed, therefore the raw pointer to it in the factory
must be erased.

This now happens in ~effective_replication_map when the object
is marked as registered.

Registration happens when effective_replication_map_factory inserts
the newly created effective_replication_map to its _replication_maps
map, and the factory calles effective_replication_map::set_factory..

Note that effective_replication_map may be created temporarily
and not be inserted to the factory's map, therefore erase
is called only when required.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:20 +02:00
Benny Halevy
8a6fbe800f effective_replication_map_factory: add create_effective_replication_map
Make a factory key using the replication_strategy type
and config options, plus the token_metadata ring version
and use it to search an already-registred effective_replication_map.

If not found, calculate a new create_effective_replication_map
and register it using the above key.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
ecba37dbfd effective_replication_map: enable_lw_shared_from_this
So a effective_replication_map_ptr can be generated
using a raw pointer by effective_replication_map_factory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
f4f41e2908 effective_replication_map: define factory_key
To be used to locate the effective_replication_map
in the to-be-introduced effective_replication_map_factory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
3fed73e7c2 locator: add effective_replication_map_factory
It will be used further to create shared copies
of effective_replication_map based on replication_strategy
type and config options.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Avi Kivity
36919a4ed7 locator: replace seastar::sprint() with fmt::format()
sprint() is obsolete.
2021-10-27 17:02:00 +03:00
Benny Halevy
dc091fc952 effective_replication_map, abstract_replication_strategy: get_ranges: call on_internal_error in empty sorted_tokens case
Accessing tm.sorted_tokens().back() causes undefined behavior
if tm.sorted_tokens is empty.

Check that first and throw/abort using on_internal_error
in this case.

This will prevent the segfault but it doesn't fix the root cause
which is getting here with empty token_metadata.  That will be fixed
by the following patch.

Refs #9494

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211019075710.1626808-1-bhalevy@scylladb.com>
2021-10-19 18:52:59 +03:00
Benny Halevy
e4dc81ec04 abstract_replication_strategy: add to_qualified_class_name
And use it from cql3 check_restricted_replication_strategy and
keyspace_metadata ctor that defined their own `replication_class_strategy`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-18 12:13:25 +03:00
Benny Halevy
17296cba4b effective_replication_map: add get_range_addresses
Equivalent to abstract_replication_strategy get_range_addresses,
yet synchronous, as it uses the precalculated map.

Call it from storage_service::get_new_source_ranges
and range_streamer::get_all_ranges_with_sources_for.

Consequently, get_new_source_ranges and removenode_add_ranges
can become synchronous too.

Unfortunately we can't entirely get rid of
abstract_replication_strategy::get_range_addresses
as it's still needed by
range_streamer::get_all_ranges_with_strict_sources_for.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
8c85197c6c abstract_replication_strategy: get rid of shared_token_metadata member and ctor param
It is not used any more.

Methods either use the token_metadata_ptr in the
effective_replication_map, or receive an ad-hoc
token_metadata.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
91f2fd5f2c abstract_replication_strategy: recognized_options: pass const topology&
Prepare for deleting the _shared_token_metadata member.
All we need for recognized_options is the topology
(for network_topology_strategy).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
4d2561ff75 abstract_replication_strategy: precacluate get_replication_factor for effective_replication_map
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
d953e7b01a token_metadata: get rid of now-unused sync methods
Now that abstract_replication_strategy methods are all async
clone_only_token_map_sync, and update_normal_tokens_sync
are unused.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
bdce6f93ca abstract_replication_strategy: get rid of do_calculate_natural_endpoints
It is no longer in use.

And with it, the virtual calculate_natural_endpoint_sync method
of which it was the only caller.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
cbe58345b9 abstract_replication_strategy: futurize get_*address_ranges
Remaining callers of get_address_ranges and get_pending_address_ranges
are all either from a seastar thread or from a coroutine
so we can make the methods always async and drop the
can_yield param.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
91581ba23a abstract_replication_strategy: futurize get_range_addresses
All remaining use sites are called in a seastar thread
so we drop the can_yield param and make get_range_addresses
always async.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
3040e0a038 abstract_replication_strategy: futurize get_ranges(inet_address ep, token_metadata_ptr)
It is called only from repair, in a thread,
so it can be made always async and the need_preempt param
can be dropped.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
dfdc8d4ddb abstract_replication_strategy: move get_ranges and get_primary_ranges* to effective_replication_map
Provide a sync get_ranges method by effective_replication_map
that uses the precalculated map to get all token ranges owned by or
replicated on a given endpoint.

Reuse do_get_ranges as common infrastructure for all
3 cases: get_ranges, get_primary_ranges, and get_primary_ranges_within_dc.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:09:51 +03:00
Benny Halevy
0e5bb94e84 abstract_replication_strategy: get rid of cached_endpoints
Now that do_get_natural_endpoints is gone, the cached
endpoints are no longer in use.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 14:15:34 +03:00
Benny Halevy
25227ab5ea all replication strategies: get rid of do_get_natural_endpoints
Now that all falvors of get_natural_endpoints methods
were moved to effective_replication_map,
do_get_natural_endpoints and its overrides are unused.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 14:13:51 +03:00
Benny Halevy
aab363753f abstract_replication_strategy: move get_natural_endpoints_without_node_being_replaced to effective_replication_map
Use the precalculated endpoints map there
as well as the token_metadata_ptr.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 14:10:01 +03:00
Benny Halevy
bb0ea0b1c0 shared_token_metadata: set: check version monotonicity
Setting the ring version backwards means it got out of sync.
Possibly concurrent updates weren't serialized properly
using token_metadata_lock / mutate_token_metadata.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 14:03:51 +03:00
Benny Halevy
43160abaec token_metadata: use static ring version
For generating unique _ring_version.

Currently when we clone a mutable token_metadata_ptr
it remains with the same _ring_version
and the ring version is updated only when the topology changes.

To be able to distinguish these traqnsient copies
from the ones that got applied, be stricter about
the ring version and change it to a unique number
using a static counter.

Next patch will update the ring version
(and consequently invalidate the cached_endpoints
on the replication strategy) every time the token_metadata
changes, not only when the topology changes.

Note that the _cached_endpoints will go away
once the transition to effective_replication_map
is finished, so this will not degrade performance.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 14:03:17 +03:00
Benny Halevy
685f5e7704 token_metadata: get rid of copy constructor and assignment operator
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 14:00:55 +03:00
Benny Halevy
d74ecfbc29 abstract_replication_strategy: get rid of legacy get_natural_endpoints
implementation

Now that all users of it were converted to use the
effective_replication_map, the legacy
abstract_replication_strategy::get_natural_endpoints method
can be deleted.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 13:58:18 +03:00
Benny Halevy
4b838197e2 storage_service: update keyspaces effective_replication_map on token_metadata change
Every time the token_metadata changes we need to update the
effective_replication_map on all non-system keyspaces.

Do that in replicate_to_all_cores after the updated token_metadata
has been replicated to all cores.

We first prepare and clone the token_metadata, then prepare
and clone the new effective_replication_maps.  Any failure
at this stage is recoverable, handle via rollback and the exception
is returned.

Note that any failure to _apply_ the pending token_metadata or the
effective_replication_map will cause scylla to abort.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 13:05:28 +03:00
Benny Halevy
3393df45eb token_metadata, storage_service: unify token_metadata_lock and merge_lock.
Serialize the metadata changes with
keyspace create, update, or drop.

This will become necessary in the following patch
when we update the effective_replication_map
on all keyspaces and we want instances on all shards
end up with the same replication map.

Note that storage_service::keyspace_changed is called
from the scheme_merge path so it already holds
the merge_lock.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 13:01:25 +03:00
Benny Halevy
1e1d7d7df5 abstract_replication_strategy: introduce effective_replication_map
effective_replication_map holds the full replication_map
resulting from applying the effective replication strategy
over the given token_metadata and replication_strategy_config_options.

It is calculated once, in make_effective_replication_map(), and then it
can be used for retrieving the endpoints/token_ranges synchronously
from the precalculated map.

A new virtual get_natural_endpoints(const token&, const effective_replication_map&)
method has been added to abstract_replication_strategy so that
local_strategy and everywhere_replication_strategy can override it as they may be
needed before the token_metadata is established.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:53:03 +03:00
Benny Halevy
d96a67eb57 abstract_replication_strategy: use shared_ptr in registry
Enable creating shared_ptr<BaseClass> in nonstatic_class_registry
using BaseClass::ptr_type and use that for
abstract_replication_strategy.

While at it, also clean up compressor with that respect
to define compressor::ptr_type as shared_ptr<compressor>
thus simplifying compressor_registry.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
a1c573e6d3 abstract_replication_strategy: make calculate_natural_endpoints_sync private
And with that rename calculate_natural_endpoints(const token& search_token, const token_metadata&, can_yield)
to do_calculate_natural_endpoints and make it protected,

With this patch, all its external users call the async version, so
rename it back to calculate_natural_endpoints, and make
calculate_natural_endpoints_sync private since it's being called
only within abstract_replication_strategy.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
a1098c0094 replication strategies: calculate_natural_endpoints: split into sync and async variants
calculate_natural_endpoints_sync and _async are both provided
temporarily until all users of them are converted to use
the async version which will remain.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
32c7314b80 network_topology_strategy: refactor calculate_natural_endpoints
Extract natural_endpoints_tracker out of calculate_natural_endpoints
so we easily split the function to sync and async variants.

Test: network_topology_strategy_test(dev, debug)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
416531cce7 network_topology_strategy: use rslogger to debug-log configuration
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
330d9772d4 abstract_replication_strategy: move logger to locator namespace
To be used by network_topology_strategy and later, by
effective_replication_map_registry.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
7401d03e8c abstract_replication_strategy: define replication_map
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Benny Halevy
5001d261d4 abstract_replication_strategy: define replication_strategy_config_options
To be used for searching effective replication strategy instances.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 12:39:36 +03:00
Avi Kivity
3f9ec5302a locator: azure_snitch: use full type name in definition of globals
Some globals in azure_snitch use std::string in the declaration
and auto in the definition. gcc 11 complains. I don't know if it's
correct, but it's easy to use the type in both declaration and
definition.
2021-10-10 18:16:50 +03:00
Avi Kivity
369afe3124 treewide: use coroutine::maybe_yield() instead of co_await make_ready_future()
The dedicated API shows the intent, and may be a tiny bit faster.

Closes #9382
2021-09-23 12:28:56 +02:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
Benny Halevy
e613c0d287 abstract_replication_strategy: remove commented out keyspace* member
It is not needed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210906133840.3307279-2-bhalevy@scylladb.com>
2021-09-06 16:51:22 +03:00
Benny Halevy
b7eaa22ce6 abstract_replication_strategy: create_replication_strategy: drop keyspace name parameter
It is not used.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210906133840.3307279-1-bhalevy@scylladb.com>
2021-09-06 16:51:21 +03:00
Benny Halevy
4ffdafe6dc token_metadata: delete old java code
We no longer need to keep it for reference.
It's just causing confusion at this point.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210826095457.994834-1-bhalevy@scylladb.com>
2021-08-26 13:03:59 +03:00
Asias He
6230bd4b5a locator: Add yield in do_get_ranges and friends
Not all calculate_natural_endpoints implementations respect can_yield
flag, for example, everywhere_replication_strategy.

This patch adds yield at the caller site to fix stalls we saw in
do_get_ranges.

Fixes #8943

Closes #9139
2021-08-04 15:52:37 +03:00
Nadav Har'El
9662de85f5 Merge 'Azure snitch support' from Pekka Enberg
This add support for Azure snitch. The work is an adaptation of
AzureSnitch for Apache Cassandra by Yoshua Wakeham:

https://raw.githubusercontent.com/yoshw/cassandra/9387-trunk/src/java/org/apache/cassandra/locator/AzureSnitch.java

Also change `production_snitch_base` to protect against
a snitch implementation setting DC and rack to an empty string,
which Lubos' says can happen on Azure.

Fixes #8593

Closes #9084

* github.com:scylladb/scylla:
  scylla_util: Use AzureSnitch on Azure
  production_snitch_base: Fallback for empty DC or rack strings
  azure_snitch: Azure snitch support
2021-08-03 22:52:05 +03:00
Pekka Enberg
42e32566f6 production_snitch_base: Fallback for empty DC or rack strings
Lubos Kosco points out that on Microsoft Azure, for example, it is
possible for the "zone metadata" (which we use as rack information) can
be empty as shown in:

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/instance-metadata-service?tabs=windows#instance-metadata

Therefore, protect against empty DC or rack strings in
`production_snitch_base` to keep the behavior consistent across
different snitches.
2021-07-28 14:07:42 +03:00
Pekka Enberg
e44fa8d806 azure_snitch: Azure snitch support
This add support for Azure snitch. The work is an adaptation of
AzureSnitch for Apache Cassandra by Yoshua Wakeham:

https://raw.githubusercontent.com/yoshw/cassandra/9387-trunk/src/java/org/apache/cassandra/locator/AzureSnitch.java

As per Lubos' suggestion, we switched to a later API version.
2021-07-28 14:07:42 +03:00