Commit Graph

414 Commits

Author SHA1 Message Date
Pavel Emelyanov
e502047c74 snitch: Use local gossiper in drivers
Each driver has a pointer to this shard snitch_ptr which, in turn, has
the reference on gossiper. This lets drivers stop using the global
gossiper instance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:40 +03:00
Pavel Emelyanov
38c77d0d85 snitch: Keep gossiper reference
The reference is put on the snitch_ptr because this is the sharded<>
thing and because gossiper reference is the same for different snitch
drivers. Also, getting gossiper from snitch_ptr by driver will look
simpler than getting it from any base class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:57:40 +03:00
Pavel Emelyanov
f85e12ffa5 snitch: Move snitch_base::get_endpoint_info()
This method is only needed by classes that inherit from production_snitch_base

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-05-03 10:34:52 +03:00
Avi Kivity
1e1c0226a6 treewide: abort() after switch in formatters
It is typical in switch statements to select on an enum type and
rely on the compiler to complain if an enum value was missed. But
gcc isn't satisfied since the enum could have a value outside the
declared list. Call abort() in this impossible situation to pacify
it.
2022-04-18 12:27:18 +03:00
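The abort()-after-switch pattern described in 1e1c0226a6 can be sketched as follows. This is a minimal illustration, not code from the patch: the enum `snitch_state` and the function `state_name` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

// Hypothetical enum, used only for illustration.
enum class snitch_state { starting, running, stopped };

// There is deliberately no default: label, so the compiler can warn
// (e.g. -Wswitch) when a new enumerator is added but not handled here.
inline std::string_view state_name(snitch_state s) {
    switch (s) {
    case snitch_state::starting: return "starting";
    case snitch_state::running:  return "running";
    case snitch_state::stopped:  return "stopped";
    }
    // Reached only if s holds a value outside the declared enumerators.
    // Without this call gcc may warn that control reaches the end of a
    // non-void function.
    std::abort();
}
```

Keeping the switch free of a default: label preserves the missing-enumerator warning, while the trailing abort() covers the out-of-range case.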
Pavel Emelyanov
828a951886 snitch: Remove create_snitch/stop_snitch
After the previous patches, both create_snitch() and stop_snitch() now look
like the classical sharded service start/stop sequence. Finally, both
helpers can be removed and the remaining users can just call start/stop
on locally obtained sharded references.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:43:25 +03:00
Pavel Emelyanov
20e623f16d snitch: Simplify stop (and pause_io)
Both first stop/pause the snitch driver on the io-ing shard, then proceed with
the rest. This sequence is pretty pointless and here's why.

The only non-trivial stop()/pause_io() method out there is in the
property-file snitch driver. In it, both methods check whether the current
shard is the io-ing one: if not, they return a resolved future; if
so, they go ahead and stop/pause some IO. Thus, for all shards but the
io-ing one there's no point in starting after the io-ing one is stopped;
they can all start (and finish) in parallel.

So this patch just removes the pre-stop/pause kicking of
the io-ing shard.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:43:23 +03:00
Pavel Emelyanov
2e42578dc8 snitch: Move io_is_stopped to property-file driver
This whole engine is only used by that driver, there's no point in it
sitting on the base class

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:43:20 +03:00
Pavel Emelyanov
28ecdc66ad snitch: Remove init_snitch_obj()
Now it's just a wrapper around sharded<snitch_ptr>::start()

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:43:16 +03:00
Pavel Emelyanov
b3eaae629e snitch: Move instance creation into snitch_ptr constructor
Current API to create snitch is not like other services -- there's a
dedicated helper that does sharded<>.start() + invoke_on_all(&start)
calls. These helpers complicate de-globalization of snitch and rework
of the services' start-stop sequence; things get simpler if snitch uses
the same start-stop API as all the others. The first step towards this
change is moving the non-waiting parts of snitch initialization code
from init_snitch_obj() into snitch_ptr constructor.

A note on this change: after patch #2 the snitch_ptr<->driver linkage
connects local objects with each other, not container() of any. This
is important, because connecting container() would be impossible inside
constructor, as the container pointer is initialized by seastar _after_
the service constructor itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:38:35 +03:00
Pavel Emelyanov
633746b87d snitch: Make config-based construction of all drivers
Currently snitch drivers register themselves in class-registry with all
sorts of construction options possible. All those different constructors
are in fact "config options".

When snitch later declares its dependencies (gossiper and system
keyspace), it will require patching all these registrations, which is very
inconvenient.

This patch introduces the snitch_config struct and replaces all the
snitch constructors with the snitch_driver(snitch_config cfg) one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:38:34 +03:00
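A rough sketch of the config-based construction idea from 633746b87d: every driver takes the same single config argument, so a registry needs only one constructor signature. All field names, driver classes, and the `create_snitch` and `registry` helpers below are assumptions made for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical config: the fields are illustrative guesses.
struct snitch_config {
    std::string name = "SimpleSnitch";
    std::string properties_file_name;
    unsigned io_cpu_id = 0;
};

// Every driver is constructed from the same snitch_config.
struct snitch_driver {
    explicit snitch_driver(const snitch_config& cfg) : _cfg(cfg) {}
    virtual ~snitch_driver() = default;
    virtual std::string dc() const = 0;
protected:
    snitch_config _cfg;
};

struct simple_snitch : snitch_driver {
    using snitch_driver::snitch_driver;
    std::string dc() const override { return "datacenter1"; }
};

struct property_file_snitch : snitch_driver {
    using snitch_driver::snitch_driver;
    // Toy behaviour: echoes the configured file name instead of parsing it.
    std::string dc() const override { return "dc-from:" + _cfg.properties_file_name; }
};

using snitch_factory =
    std::function<std::unique_ptr<snitch_driver>(const snitch_config&)>;

// Toy stand-in for a class registry: one uniform factory signature.
inline const std::map<std::string, snitch_factory>& registry() {
    static const std::map<std::string, snitch_factory> r = {
        {"SimpleSnitch",
         [](const snitch_config& c) { return std::make_unique<simple_snitch>(c); }},
        {"PropertyFileSnitch",
         [](const snitch_config& c) { return std::make_unique<property_file_snitch>(c); }},
    };
    return r;
}

// Looks the driver up by name; throws std::out_of_range for unknown names.
inline std::unique_ptr<snitch_driver> create_snitch(const snitch_config& cfg) {
    return registry().at(cfg.name)(cfg);
}
```

With this shape, adding a new dependency means adding a field to the config struct rather than touching every registration.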
Pavel Emelyanov
fa59ccb89d snitch: Declare snitch_ptr peering and rework container() method
This patch makes the snitch base class reference local snitch_ptr, not
its sharded<> container and, respectively, makes the base container()
method return _backreference->container() instead.

The motivation of this change is, again, in the next patch, which will
move snitch_ptr<->driver_object linkage into snitch_ptr constructor.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:38:32 +03:00
Pavel Emelyanov
552a08ecd0 snitch: Introduce container() method
Some snitch drivers want the peering_sharded_service::container()
functionality, but they can't directly use it, because the driver
class is in fact the pimplification behind the sharded<snitch_ptr>
service. To overcome this there's a _my_distributed pointer on the
driver base class that points back to sharded<snitch_ptr> object.

This patch replaces the direct _my_distributed usage with the
container() method that does it and also asserts that the pointer
in question is initialized (some drivers already do it, some don't).

Other than making the code more peering_sharded_service-like, this
patch allows changing _my_distributed into _backreference that
points to this shard's snitch_ptr, see next patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-11 14:38:27 +03:00
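The accessor introduced in 552a08ecd0 can be pictured roughly as below. This is a simplified sketch: `sharded_snitch_ptr` is a hypothetical stand-in for sharded<snitch_ptr>, and `set_my_distributed` is an assumed setter added only to make the example self-contained.

```cpp
#include <cassert>

// Hypothetical stand-in for sharded<snitch_ptr>.
struct sharded_snitch_ptr {
    int shard_count = 1;
};

class driver_base {
    // Back pointer to the sharded service that owns this driver.
    sharded_snitch_ptr* _my_distributed = nullptr;
public:
    // Assumed setter, present only for this sketch.
    void set_my_distributed(sharded_snitch_ptr* d) { _my_distributed = d; }

    // Single access point: callers no longer touch the raw pointer, and
    // the initialization check lives in exactly one place.
    sharded_snitch_ptr& container() {
        assert(_my_distributed != nullptr);
        return *_my_distributed;
    }
};
```

Because all users go through container(), the field behind it can later change type without touching the call sites.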
Pavel Emelyanov
05a32328fc snitch: Remove gossiper_starting()
No longer used

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:09 +03:00
Pavel Emelyanov
41332e183a snitch: Remove gossip_snitch_info()
No longer in use

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:09 +03:00
Pavel Emelyanov
38b0ee9822 property-file snitch: Re-gossip states with the help of .get_app_states()
This is the last place that still uses gossip_snitch_info(). It
can be reworked to use the get_app_states(), then the former
helper can be removed.

Another motivation for this is to stop using the _gossiper_started
boolean from the base class. This, in turn, will allow removing
the whole gossiper_starting() notification altogether.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:09 +03:00
Pavel Emelyanov
6f71baa472 property-file snitch: Reload state in .start()
In its .start() helper the property-file driver does everything but
register the reconnectable helper (like the ec2 m.r. one from the
previous patch did). Similarly to the ec2 m.r. snitch, this one can also
register its helper in .start(), before gossiper_starting() is called.

One thing to care about in this driver is that some tests start this
snitch without starting gossiper, thus an extra protection against
an uninitialized gossiper is needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:09 +03:00
Pavel Emelyanov
2400c87e74 ec2 multi-region snitch: Register helper in .start()
This driver registers the reconnectable helper in its gossiper_starting()
callback. It can be done earlier -- in the snitch .start() one, as
gossiper doesn't notify listeners until it's started for real (even
its shadow round doesn't kick them).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:05 +03:00
Pavel Emelyanov
f9af6fb430 snitch, storage service: Gossip snitch info once
Nowadays snitch states are put into gossiper via .gossiper_starting()
call by gossiper. This, in turn, happens in two places -- on node
ring join code and on re-enabling gossiper via the API call.

The former can be performed by the ring joining code with the help of
recently introduced snitch.get_app_states() helper.

The latter call is in fact not needed. Re-gossiped are DC, RACK and
for some drivers the INTERNAL_IP states that don't change throughout
snitch lifetime and are preserved in the gossiper pre-loaded states.

Thus, once the snitch states are applied by storage service ring join
code, the respective states update can be removed from the snitch
gossiper_starting() implementations.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:05 +03:00
Pavel Emelyanov
4853959903 snitch: Introduce get_app_states() method
This virtual method returns back the list of app states that snitch
drivers need to gossip around. The exact implementation copies the
gossip_snitch_info() logic of the respective drivers and is unused.
Next patches will make use of it (spoiler: the latter method will be
removed after that).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:05 +03:00
Pavel Emelyanov
028bb84b0f property-file snitch: Use _my_distributed to re-shard
The driver in question wants to execute some of its actions on shard 0
and it calls smp::invoke(0, ...) for this. The invoked lambda thus needs
to refer to the global snitch instance.

There's a nicer and shorter way of re-sharding for snitch drivers -- the
sharded<snitch_ptr>* _my_distributed field on the base class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-04-01 13:16:05 +03:00
Pavel Emelyanov
021c026482 system_keyspace,snitch: Make load_dc_rack_info non-static
It's snitch code that needs it. It now takes messaging service
from gossiper, so it can do the same with system keyspace. This
change removes one user of the global sys.ks. cache instance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-03-25 15:08:13 +03:00
Benny Halevy
3cee0f8bd9 shared_token_metadata: mutate_token_metadata: bump cloned copy ring_version
Currently this is done only in
storage_service::get_mutable_token_metadata_ptr
but it needs to be done here as well for code paths
calling mutate_token_metadata directly.

Currently, it is only called from network_topology_strategy_test.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220130152157.2596086-1-bhalevy@scylladb.com>
2022-01-30 18:15:08 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Benny Halevy
17e006106b token_metadata: update_normal_tokens: avoid unneeded sort when token ownership doesn't change
Currently, we first delete all existing token mappings
for the endpoint from _token_to_endpoint_map and then
we add all updated token mappings for it and set should_sort_tokens
if the token is newly inserted, but since we removed all
existing mappings for the endpoint unconditionally, we
will sort the tokens even if the token existed and
its ownership did not change.

This is worthwhile since there are scenarios where
none of the token ownerships change. Searching and
erasing tokens from the tokens unordered_set runs
in constant time on average, so doing it for n tokens
is O(n), while sorting the tokens is O(n*log(n)).

Test: unit(dev)
DTest: replace_address_test.py::TestReplaceAddress::test_serve_writes_during_bootstrap(dev,debug)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220117101242.122512-2-bhalevy@scylladb.com>
2022-01-17 12:18:42 +02:00
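The idea behind 17e006106b can be sketched with standard containers. This is a simplified model under stated assumptions, not the patch itself: `token` is reduced to an integer, `endpoint` to a string, and `token_map` is a hypothetical reduction of token_metadata.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

using token = long;            // simplified stand-in
using endpoint = std::string;  // simplified stand-in

struct token_map {
    std::unordered_map<token, endpoint> token_to_endpoint;
    std::vector<token> sorted_tokens;

    // `tokens` is the complete new set owned by `ep`.
    // Returns true if sorted_tokens had to be rebuilt and sorted.
    bool update_normal_tokens(std::unordered_set<token> tokens, const endpoint& ep) {
        bool rebuild = false;
        for (auto it = token_to_endpoint.begin(); it != token_to_endpoint.end();) {
            if (it->second == ep && tokens.erase(it->first) == 0) {
                // Previously owned by ep but absent from the new set: drop it.
                it = token_to_endpoint.erase(it);
                rebuild = true;
                continue;
            }
            // Either owned by another endpoint, or ownership is unchanged
            // (in which case the token was ticked off from `tokens` above).
            ++it;
        }
        // Whatever is left in `tokens` was not owned by ep before.
        for (token t : tokens) {
            auto [pos, inserted] = token_to_endpoint.insert_or_assign(t, ep);
            rebuild |= inserted;  // only a brand-new token changes the ring
        }
        if (rebuild) {
            sorted_tokens.clear();
            for (const auto& [t, owner] : token_to_endpoint) {
                sorted_tokens.push_back(t);
            }
            std::sort(sorted_tokens.begin(), sorted_tokens.end());
        }
        return rebuild;
    }
};
```

Calling the method twice with the same token set for the same endpoint does the O(n) bookkeeping but skips the O(n*log(n)) sort on the second call.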
Benny Halevy
25977db7b4 token_metadata: remove update_normal_token entry point
It's currently used only by unit tests
and it is dangerous to use on a populated token_metadata
as update_normal_tokens assumes that the set of tokens
owned by the given endpoint is complete, i.e. previous
tokens owned by the endpoint are no longer owned by it,
but the single-token update_normal_token interface
seems cumulative (and has no documentation whatsoever).

It is better to remove this interface and calculate a
complete map of endpoint->tokens from the tests.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20220117101242.122512-1-bhalevy@scylladb.com>
2022-01-17 12:18:42 +02:00
Pavel Solodovnikov
badbfd521c locator: reconnectable_snitch_helper: coroutinize reconnect
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-01-11 09:29:12 +03:00
Pavel Solodovnikov
5dcfb94d5a gms: i_endpoint_state_change_subscriber: make callbacks to return futures
Coroutinize a few simple callbacks in the process.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2022-01-11 09:29:12 +03:00
Avi Kivity
57188de09e Merge 'Make dc/rack encryption work for some cases where NAT hides endpoint ips' from Eliran Sinvani
This is a consolidation of #9714 and #9709 PRs by @elcallio that were reviewed by @asias
The last comment on those was that they should be consolidated in order not to create a security degradation for
ec2 setups.
For some cases it is impossible to determine dc or rack association for nodes on outgoing connections.
One example is when some IPs are hidden behind a NAT layer.
In some cases this creates problems where one side of the connection is aware of the rack/dc association while the
other isn't.
The solution here is a two stage one:
1. First add a gossip reverse lookup that will help us determine the rack/dc association for a broader (hopefully all) range
 of setups and NAT situations.
2. When this fails - be more strict about downgrading a node which tries to ensure that both sides of the connection will at least
 downgrade the connection instead of just fail to start when it is not possible for one side to determine rack/dc association.

Fixes #9653
/cc @elcallio @asias

Closes #9822

* github.com:scylladb/scylla:
  messaging_service: Add reverse mapping of private ip -> public endpoint
  production_snitch_base: Do reverse lookup of endpoint for info
  messaging_service: Make dc/rack encryption check for connection more strict
2022-01-09 16:40:49 +02:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Nadav Har'El
6012f6f2b6 build performance: do not include <seastar/net/ip.hh>
In a previous patch, we noticed that the header file <gms/inet_address.hh>,
which is included, directly or indirectly, by most source files,
includes <seastar/net/ip.hh> which is very slow to compile, and
replaced it by the much faster-to-include <seastar/net/ipv[46]_address.hh>.

However, we also included <seastar/net/ip.hh> in types.hh - and that
too is included by almost every file, so the actual saving from the
above patch was minimal. So in this patch we replace this include too.
After this patch Scylla does not include <seastar/net/ip.hh> at all.

According to ClangBuildAnalyzer, this reduces the average time to include
types.hh (multiply this by 312 times!) from 4 seconds to 1.8 seconds,
and reduces total build time (dev mode) by about 3%.

Some of the source files were now missing some include directives, that
were previously included in ip.hh - so we need to add those explicitly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-01-05 17:29:21 +02:00
Calle Wilund
4df008adcc production_snitch_base: Do reverse lookup of endpoint for info
Refs #9709
Refs #9653

If we don't find immediate info about an endpoint, check if
we're being asked about a "private" ip for the endpoint.
If so, give info for this.
2021-12-20 06:20:46 +02:00
Benny Halevy
044e4a6b72 token_metadata: delete private constructor
It is not used.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211205174306.450536-1-bhalevy@scylladb.com>
2021-12-05 19:49:29 +02:00
Benny Halevy
93367ba55f effective_replication_map_factory: temporarily unregister outstanding maps when destroyed
The next patch will disable stopping the keyspaces
in database shutdown due to #9684.

This will leave outstanding e_r_m:s when the factory
is destroyed. They must be unregistered from the factory
so they won't try to submit_background_work()
to gently clear their contents.

Support that temporarily until shutdown is fixed
to ensure there are no outstanding e_r_m:s when
the factory is destroyed, at which point this
can turn into an internal error.

Refs #8995
Refs #9684

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211127083348.146649-1-bhalevy@scylladb.com>
2021-11-29 11:59:44 +02:00
Benny Halevy
9d2631daaf token_metadata: calculate_pending_ranges_for_leaving: maybe yield
We see long stalls as reported in
https://github.com/scylladb/scylla/issues/8030#issuecomment-974783526

everywhere_replication_strategy::calculate_natural_endpoints
is synchronous and doesn't yield, so add maybe_yield() calls
when looping over many token ranges.

Refs #8030

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211121090339.3955278-1-bhalevy@scylladb.com>
Message-Id: <20211121102606.76700-1-bhalevy@scylladb.com>
2021-11-22 10:48:25 +02:00
Benny Halevy
eed3e95704 effective_replication_map: clear_gently when destroyed
Prevent reactor stalls by gently clearing the replication_map
and token_metadata_ptr when the effective_replication_map is
destroyed.

This is done in the background, protected by the
effective_replication_map_factory::stop() method.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:41 +02:00
Benny Halevy
866e1b8479 effective_replication_map_factory: try cloning replication map from shard 0
Calculating a new effective_replication_map on each shard
is expensive. To try to save that, use the factory key to
look up an e_r_m on shard 0 and, if found, use it to clone
its replication map and use that to make the shard-local
e_r_m copy.

In the future, we may want to improve that in 2 ways:
- instead of always going to shard 0, use hash(key) % smp::count
to create the first copy.
- make full copies only on NUMA nodes and keep a shared pointer
on all other shards.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:41 +02:00
Benny Halevy
6754e6ca2b effective_replication_map: erase from factory when destroyed
The effective_replication_map_factory keeps naked pointers
to outstanding effective_replication_map:s.
These are kept valid using a shared effective_replication_map_ptr.

When the last shared ptr reference is dropped the effective_replication_map
object is destroyed, therefore the raw pointer to it in the factory
must be erased.

This now happens in ~effective_replication_map when the object
is marked as registered.

Registration happens when effective_replication_map_factory inserts
the newly created effective_replication_map to its _replication_maps
map, and the factory calls effective_replication_map::set_factory.

Note that effective_replication_map may be created temporarily
and not be inserted to the factory's map, therefore erase
is called only when required.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:52:20 +02:00
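The registration scheme described in 6754e6ca2b can be modelled in standard C++ as below. This is a sketch under assumptions: the class names `erm` and `erm_factory` are shortened stand-ins, the key is reduced to a string, and std::shared_ptr with std::enable_shared_from_this is swapped in for the shared pointer type used by the real code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <utility>

class erm_factory;

// Stand-in for effective_replication_map.
class erm : public std::enable_shared_from_this<erm> {
    erm_factory* _factory = nullptr;  // set only when registered
    std::string _key;
public:
    ~erm();
    void set_factory(erm_factory& f, std::string key) {
        _factory = &f;
        _key = std::move(key);
    }
    bool registered() const { return _factory != nullptr; }
};

// Stand-in for effective_replication_map_factory.
class erm_factory {
    std::map<std::string, erm*> _maps;  // raw, non-owning pointers
public:
    std::shared_ptr<erm> get_or_create(const std::string& key) {
        if (auto it = _maps.find(key); it != _maps.end()) {
            // The entry exists only while the object is alive, so it is
            // safe to mint another owning pointer from the raw one.
            return it->second->shared_from_this();
        }
        auto p = std::make_shared<erm>();
        _maps.emplace(key, p.get());
        p->set_factory(*this, key);
        return p;
    }
    void erase(const std::string& key) { _maps.erase(key); }
    std::size_t size() const { return _maps.size(); }
};

// A registered object removes its own raw pointer from the factory;
// temporaries that were never registered skip the erase.
inline erm::~erm() {
    if (_factory) {
        _factory->erase(_key);
    }
}
```

In this model the factory must outlive every registered object, since the destructor dereferences the back pointer.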
Benny Halevy
8a6fbe800f effective_replication_map_factory: add create_effective_replication_map
Make a factory key using the replication_strategy type
and config options, plus the token_metadata ring version
and use it to search for an already-registered effective_replication_map.

If not found, calculate a new effective_replication_map
and register it using the above key.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
ecba37dbfd effective_replication_map: enable_lw_shared_from_this
So an effective_replication_map_ptr can be generated
using a raw pointer by effective_replication_map_factory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
f4f41e2908 effective_replication_map: define factory_key
To be used to locate the effective_replication_map
in the to-be-introduced effective_replication_map_factory.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Benny Halevy
3fed73e7c2 locator: add effective_replication_map_factory
It will be used further to create shared copies
of effective_replication_map based on replication_strategy
type and config options.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-11-19 10:46:51 +02:00
Avi Kivity
36919a4ed7 locator: replace seastar::sprint() with fmt::format()
sprint() is obsolete.
2021-10-27 17:02:00 +03:00
Benny Halevy
dc091fc952 effective_replication_map, abstract_replication_strategy: get_ranges: call on_internal_error in empty sorted_tokens case
Accessing tm.sorted_tokens().back() causes undefined behavior
if tm.sorted_tokens is empty.

Check that first and throw/abort using on_internal_error
in this case.

This will prevent the segfault but it doesn't fix the root cause
which is getting here with empty token_metadata.  That will be fixed
by the following patch.

Refs #9494

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20211019075710.1626808-1-bhalevy@scylladb.com>
2021-10-19 18:52:59 +03:00
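The guard added in dc091fc952 follows a common shape, sketched here with standard containers. The helper `report_internal_error` is a hypothetical stand-in that always throws, and `last_token` is an illustrative name, not a function from the patch.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

using token = long;  // simplified stand-in

// Hypothetical stand-in for an internal-error reporter; it always throws here.
[[noreturn]] inline void report_internal_error(const char* msg) {
    throw std::logic_error(msg);
}

// Calling back() on an empty vector is undefined behavior, so check
// first and turn the condition into a well-defined failure.
inline token last_token(const std::vector<token>& sorted_tokens) {
    if (sorted_tokens.empty()) {
        report_internal_error("sorted_tokens is empty");
    }
    return sorted_tokens.back();
}
```

As the commit message notes, a guard like this only converts the crash into a diagnosable error; reaching the function with empty token metadata still needs a separate fix.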
Benny Halevy
e4dc81ec04 abstract_replication_strategy: add to_qualified_class_name
And use it from cql3 check_restricted_replication_strategy and
keyspace_metadata ctor that defined their own `replication_class_strategy`.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-18 12:13:25 +03:00
Benny Halevy
17296cba4b effective_replication_map: add get_range_addresses
Equivalent to abstract_replication_strategy get_range_addresses,
yet synchronous, as it uses the precalculated map.

Call it from storage_service::get_new_source_ranges
and range_streamer::get_all_ranges_with_sources_for.

Consequently, get_new_source_ranges and removenode_add_ranges
can become synchronous too.

Unfortunately we can't entirely get rid of
abstract_replication_strategy::get_range_addresses
as it's still needed by
range_streamer::get_all_ranges_with_strict_sources_for.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
8c85197c6c abstract_replication_strategy: get rid of shared_token_metadata member and ctor param
It is not used any more.

Methods either use the token_metadata_ptr in the
effective_replication_map, or receive an ad-hoc
token_metadata.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
91f2fd5f2c abstract_replication_strategy: recognized_options: pass const topology&
Prepare for deleting the _shared_token_metadata member.
All we need for recognized_options is the topology
(for network_topology_strategy).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
4d2561ff75 abstract_replication_strategy: precalculate get_replication_factor for effective_replication_map
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00
Benny Halevy
d953e7b01a token_metadata: get rid of now-unused sync methods
Now that abstract_replication_strategy methods are all async,
clone_only_token_map_sync and update_normal_tokens_sync
are unused.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-10-13 16:10:06 +03:00